Stanford SAFE - Designing Interactive Modules to Introduce High-Ranking Decision Makers to Technical AI Fundamentals

Mentors: Felix Krückel, Duncan Eddy
Project area: Technical policy work

Project Language

English or German.

Minimum Time Commitment

10 hours per week.

Project Abstract

The primary goal of this project is to bridge the significant communication gap between technical AI safety research and high-level policy decision-making. As AI capabilities accelerate, policymakers often lack the intuitive understanding of machine learning (ML) fundamentals necessary to draft effective regulations. This project addresses the gap by developing interactive, web-based applications (built with the Streamlit framework) that let stakeholders interact directly with fundamental ML results, helping them grasp the core concepts in a reasonable timeframe.

Theory of Change

The trajectory of transformative AI poses global catastrophic risks that can only be mitigated through precise, technically grounded policy. However, a significant "literacy gap" exists at the highest levels of governance. In interactions my colleagues and I have had with high-ranking AI policymakers in the EU and US, we found that many decision-makers have never directly interfaced with AI models. Instead, they rely on anecdotal evidence and media narratives, lacking the basic scientific fundamentals required to accurately model the systems they intend to regulate.

This project operates on the premise that effective policy—even at the highest level of goal-setting—requires a conceptual "mental model" of the technology’s failure modes. By providing interactive Streamlit applications, we move stakeholders from passive consumers of information to active participants. When a policymaker personally trains a classifier and observes its total collapse under a simple adversarial attack, the abstract risk of "brittleness" or "alignment failure" becomes a tangible reality.
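The kind of demonstration described above can be sketched in a few lines. The following is a minimal, illustrative example only: the scikit-learn digits dataset, a logistic regression classifier, and an FGSM-style gradient-sign perturbation are assumptions chosen for brevity, not deliverables of this project (a real module would wrap such a demo in an interactive Streamlit interface).

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Train a simple classifier on handwritten digits.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
clean_acc = clf.score(X_te, y_te)

# FGSM-style attack: for a linear model, the gradient of the
# cross-entropy loss w.r.t. the input is W^T (softmax(Wx+b) - one_hot(y)).
probs = clf.predict_proba(X_te)          # shape (n, 10)
one_hot = np.eye(10)[y_te]               # shape (n, 10)
grad = (probs - one_hot) @ clf.coef_     # shape (n, 64)

# Nudge every pixel slightly in the loss-increasing direction.
eps = 0.2
X_adv = np.clip(X_te + eps * np.sign(grad), 0.0, 1.0)
adv_acc = clf.score(X_adv, y_te)

print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```

A stakeholder running this interactively (e.g. with a Streamlit slider controlling `eps`) can watch a seemingly accurate model degrade under barely visible perturbations, which is exactly the "brittleness" intuition the modules aim to build.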

Our contribution to mitigating catastrophic risk is twofold:
- Precision in Governance: Reducing the disconnect between high-level goals, technical implementation, and agency-level enforcement when designing policies, by ensuring the "why" (the high-level goals) is technically feasible, proportionate, and addresses actual architectural vulnerabilities.
- Resilience Against Hype: Empowering leaders to distinguish superficial safety claims and "lobby lies" from robust technical benchmarks, thereby accelerating the implementation of safeguards against existential threats.

Desired Mentee Background

Computer Science/ML, Maths, Economics, Law, International Relations, Political Science.

Desired Mentee Level of Education

Undergraduate and above. Must have taken a course covering ML basics, or be taking an ML course during the semester they work with me on the project.

Other Mentee Requirements

Required:
- Good grades or other indicators of excellence, such as projects that achieved meaningful KPIs (please detail what you did and specify/quantify how you succeeded)
- Interest in AI policy

Preferred:
- Understands machine learning basics
- Has regular contact with students in law/policy/IR (or is one)
- Experience creating teaching formats
- Highly skilled at using agentic systems and verifying their outputs
- Solid Python development skills (package management, Git)