Stanford SAFE - Designing Interactive Modules to Introduce High-Ranking Decision Makers to Technical AI Fundamentals
Mentors: Felix Krückel, Duncan Eddy
Project area: Technical policy work
Project Language
English or German.
Minimum Time Commitment
10 hours per week.
Project Abstract
The primary goal of this project is to bridge the significant communication gap between technical AI safety research and high-level policy decision-making. As AI capabilities accelerate, policymakers often lack the intuitive understanding of machine learning (ML) fundamentals necessary to draft effective regulations. This project addresses that gap by developing interactive, web-based applications, built with the Streamlit framework, that let stakeholders experiment hands-on with fundamental AI results and grasp the underlying concepts in a reasonable timeframe.
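To make this concrete, here is a minimal sketch of what such a module could look like: a Streamlit app in which a stakeholder drags a slider and immediately sees how data noise degrades a toy classifier. The dataset, model, and widget choices are illustrative assumptions, not the project's final design.

```python
# Minimal sketch of an interactive Streamlit module (illustrative only;
# the dataset, model, and widgets are assumptions, not the final design).
import streamlit as st
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

st.title("How data noise degrades a simple classifier")

# The stakeholder adjusts the noise themselves instead of reading about it.
noise = st.slider("Noise level in the training data", 0.0, 1.0, 0.1)

# Generate a toy two-class dataset with the chosen noise level.
X, y = make_moons(n_samples=500, noise=noise, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a simple classifier and report held-out accuracy.
model = LogisticRegression().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

st.metric("Test accuracy", f"{accuracy:.0%}")
st.caption("Drag the slider: modest noise already erodes performance visibly.")
```

Run with `streamlit run noise_demo.py`; the point is the interaction loop, not the specific model.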
Theory of Change
The trajectory of transformative AI poses global catastrophic risks that can only be mitigated through precise, technically grounded policy. However, a significant "literacy gap" exists at the highest levels of governance. In interactions that my colleagues and I have had with high-ranking AI policymakers in the EU and US, we found that many decision-makers have never directly interacted with an AI model. Instead, they rely on anecdotal evidence and media narratives, lacking the basic scientific fundamentals required to effectively model the systems they intend to regulate.
This project operates on the premise that effective policy, even at the highest level of goal-setting, requires a working mental model of the technology's failure modes. By providing interactive Streamlit applications, we move stakeholders from passive consumers of information to active participants. When a policymaker personally trains a classifier and watches it collapse under a simple adversarial attack, the abstract risk of "brittleness" or "alignment failure" becomes a tangible reality.
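As an illustration of the demo described above, the following sketch shows how a simple linear classifier can be broken by an FGSM-style perturbation: each test input is nudged along the model's own weight vector, away from its true class. The dataset, model, and perturbation size are assumptions chosen for clarity, not the module's actual content.

```python
# Minimal sketch of the brittleness demo: an FGSM-style attack on a linear
# classifier (dataset, model, and epsilon are assumptions chosen for clarity).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Clean accuracy:       {model.score(X_test, y_test):.0%}")

# For logistic regression the loss gradient w.r.t. the input is proportional
# to the weight vector, so stepping each input away from its true class along
# sign(w) is exactly the fast gradient sign method (FGSM) for linear models.
w = model.coef_[0]
step = np.sign(w) * np.where(y_test == 1, -1.0, 1.0)[:, None]
X_adv = X_test + 0.5 * step  # epsilon = 0.5, picked for a visible effect

print(f"Adversarial accuracy: {model.score(X_adv, y_test):.0%}")
```

The adversarial accuracy typically falls far below the clean accuracy, which is exactly the "brittleness" intuition the module is meant to convey.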
Our contribution to mitigating catastrophic risk is twofold:
- Precision in Governance: Reducing the disconnect between high-level goals, technical implementation, and agency-level enforcement by ensuring that the "Why" (the high-level goals) is technically feasible, proportionate, and addresses actual architectural vulnerabilities.
- Resilience Against Hype: Empowering leaders to distinguish superficial safety claims and "lobby lies" from robust technical benchmarks, thereby accelerating the implementation of safeguards against existential threats.
Desired Mentee Background
Computer Science/ML, Maths, Economics, Law, International Relations, Political Science.
Desired Mentee Level of Education
Undergraduate and above. Must have taken a course that covers ML basics, or be taking one during the semester they work with me on the project.
Other Mentee Requirements
Required:
- Good grades or other indicators of excellence, such as projects that achieved meaningful KPIs (please detail what you did and specify/quantify how you succeeded)
- Interest in AI policy
Preferred:
- Understands machine learning basics
- Has regular contact with students in law/policy/IR (or is one)
- Experience creating teaching formats
- Highly skilled at using agentic systems and verifying their outputs
- Solid Python skills (programming, package management, Git)