Stanford SAFE - Designing Interactive Modules to Introduce High-Ranking Decision Makers to Technical AI Fundamentals
Mentors: Felix Krückel, Duncan Eddy
Project area: Technical policy work
Project Language
Minimum Time Commitment
10 hours per week.
Project Abstract
The primary goal of this project is to bridge the significant communication gap between technical AI safety research and high-level policy decision-making. As AI capabilities accelerate, policymakers often lack the intuitive understanding of machine learning (ML) fundamentals necessary to draft effective regulations. This project addresses that gap by developing interactive, web-based applications—specifically using the Streamlit framework—that let stakeholders experiment hands-on with fundamental AI results, helping them grasp the core concepts in a reasonable timeframe.
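To make the idea concrete, here is a minimal sketch of the kind of computation such a module might expose. The function names and the toy dataset are hypothetical, not part of the project; in a Streamlit app the learning rate and step count would typically be bound to interactive widgets (e.g. `st.slider`) so a policymaker can watch the fit change live.

```python
# Hypothetical core of an interactive module: fitting y = w*x + b
# by gradient descent on mean squared error. In the actual app,
# lr and steps would come from Streamlit sliders and the fit would
# be re-plotted on every change.

def gradient_descent_step(w, b, data, lr):
    """One gradient descent step on MSE for the model y = w*x + b."""
    n = len(data)
    dw = sum(2 * (w * x + b - y) * x for x, y in data) / n
    db = sum(2 * (w * x + b - y) for x, y in data) / n
    return w - lr * dw, b - lr * db

def fit(data, steps=500, lr=0.05):
    """Run gradient descent from w = b = 0 and return the fitted parameters."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        w, b = gradient_descent_step(w, b, data, lr)
    return w, b

# Toy data generated from the ground truth y = 2x + 1.
data = [(x, 2 * x + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]
w, b = fit(data)
```

With these settings the fit converges close to the ground-truth values (w near 2, b near 1); letting users drag the learning rate too high and watch the fit diverge is exactly the kind of intuition-building the modules aim for.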
Theory of Change
Bad frameworks produce bad decisions. The question of machine moral status will increasingly affect AI development and governance. Currently, most people reasoning about it lack adequate conceptual tools. This matters for catastrophic risk in several ways.
Under-reaction: if AI systems develop welfare-relevant internal states and we lack frameworks to recognize this, we may create systems with misaligned interests while dismissing their signals as "mere computation." A system that experiences something like suffering under certain conditions, and whose operators dismiss this, is a system with reason to deceive.
Over-reaction: anthropomorphizing systems that lack morally relevant properties wastes attention and resources, and may constrain beneficial AI development without corresponding benefit.
Poor discourse: without shared conceptual foundations, public debate about AI consciousness polarizes between dismissive and credulous positions. Neither serves good governance.
The primer addresses these by training researchers and practitioners to reason carefully across multiple frameworks, recognize what each assumes, and navigate uncertainty without false confidence. The German focus (incorporating European philosophical traditions, piloting with German-speaking users) builds SAIGE's national infrastructure while contributing to the broader field.
Conceptual clarity is infrastructure. This project builds it.
Desired Mentee Background
Computer Science/ML, Maths, Economics, Law, International Relations, Political Science.
Desired Mentee Level of Education
Undergraduate and above. Must have taken a course that covers ML basics or take an ML course during the semester they work with me on the project.
Other Mentee Requirements
Required:
- Good grades or other indicators of excellence, such as projects that achieved meaningful KPIs (please detail what you did and specify/quantify how you succeeded)
- Interest in AI policy
Preferred:
- Understands machine learning basics
- Has regular contact with students in law/policy/IR (or is one)
- Experience creating teaching formats
- Highly skilled at using agentic systems and verifying their outputs
- Solid Python and tooling skills (Python, package management, Git)