Risk-Weighted Compute Permits Under Imperfect Monitoring: Enforcement Design and an EU-Implementable Blueprint

Mentor: Joel Christoph
Project area: Technical AI governance, compute governance, enforcement design, mechanism design, economics of frontier AI oversight

Project Language

English only.

Minimum Time Commitment

8 hours per week.

Project Abstract

This project develops and stress-tests a concrete governance instrument for frontier AI: risk-weighted tradable compute permits with enforceable compliance under imperfect monitoring. The core idea is to regulate training-relevant compute as a scarce, auditable input while allowing trade to reduce compliance cost and improve feasibility. “Risk-weighted” means the permits required per unit of compute depend on verifiable risk indicators, especially evaluation outcomes, so higher-risk training runs face tighter effective caps and higher marginal cost.
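A minimal sketch of how such a risk-weighting rule might look in code (the thresholds, weights, and the eval_risk_score input are illustrative assumptions, not part of the proposed design):

```python
def permits_required(compute: float, eval_risk_score: float) -> float:
    """Permits needed to run `compute` units of training compute.

    `eval_risk_score` in [0, 1] stands in for a verifiable risk
    indicator from standardized evaluations; the piecewise weights
    below are purely illustrative.
    """
    if eval_risk_score < 0.3:    # low risk: near one-to-one permits
        weight = 1.0
    elif eval_risk_score < 0.7:  # medium risk: tighter effective cap
        weight = 2.0
    else:                        # high risk: steep marginal permit cost
        weight = 5.0
    return compute * weight
```

Under a fixed aggregate cap, a higher weight means a higher-risk run consumes more of a developer's permit budget per unit of compute, which is the "tighter effective cap and higher marginal cost" described above.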


The project has three workstreams.

(1) Formal model: a regulator sets an aggregate cap, a risk-weighting rule, and an enforcement policy; developers choose compute use, reporting, and evasion effort under imperfect monitoring. We derive implementable conditions under which truthful reporting is a best response and under-reporting or hidden training is deterred.
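As a hedged illustration of the kind of condition workstream (1) would derive (the quadratic penalty and all parameter names are our own simplifying assumptions, not the project's model): if under-reporting e units of compute saves the permit price p per unit but triggers, with audit probability q, a convex penalty k*e**2, the developer's profit-maximizing evasion level follows from a first-order condition:

```python
def optimal_evasion(permit_price: float, audit_prob: float,
                    penalty_scale: float) -> float:
    """Evasion e* maximizing p*e - q*k*e**2 (toy quadratic-penalty model).

    First-order condition p - 2*q*k*e = 0 gives e* = p / (2*q*k):
    deterrence tightens as audit probability or penalty convexity
    rises, and full truthfulness requires either very high q*k or an
    added fixed/linear penalty component.
    """
    return permit_price / (2 * audit_prob * penalty_scale)
```

For example, doubling the audit probability halves the profit-maximizing evasion in this toy model.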

(2) Minimal simulation: we implement a lightweight Monte Carlo or agent-based simulation comparing enforcement regimes such as random audits, risk-based targeting, convex penalties, and escalation rules, documenting tradeoffs between compliance, expected harm reduction, and administrative burden.
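The comparison in workstream (2) can be sketched in a few lines. Everything below (the firm behavior rule, the one-strike deterrence, and all parameter values) is an illustrative assumption, not the project's eventual model:

```python
import random

def run_regime(policy: str, n_firms: int = 100, n_rounds: int = 200,
               audits_per_round: int = 10, seed: int = 0) -> float:
    """Total undetected evasion under an audit regime (lower is better).

    Toy behavioral rule: each firm evades in proportion to a latent
    risk score until it is caught once, after which it complies
    (escalation collapsed to one-strike deterrence).
    """
    rng = random.Random(seed)
    risk = [rng.random() for _ in range(n_firms)]  # observable risk scores
    deterred = [False] * n_firms
    undetected = 0.0
    for _ in range(n_rounds):
        evasion = [0.0 if deterred[i] else risk[i] * rng.random()
                   for i in range(n_firms)]
        if policy == "random":
            audited = set(rng.sample(range(n_firms), audits_per_round))
        else:  # "risk_based": audit highest-risk firms not yet deterred
            candidates = [i for i in range(n_firms) if not deterred[i]]
            candidates.sort(key=lambda i: -risk[i])
            audited = set(candidates[:audits_per_round])
        for i in range(n_firms):
            if i in audited and evasion[i] > 0:
                deterred[i] = True        # caught: complies thereafter
            elif i not in audited:
                undetected += evasion[i]  # evasion that escapes audit
    return undetected
```

In this toy setup, risk-based targeting tends to deter the highest-risk evaders sooner and wastes fewer audits on already-compliant firms, so it leaves less undetected evasion than uniform random auditing.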

(3) Policy translation: we convert the design into an EU-relevant blueprint specifying institutional roles, evidentiary standards, audit triggers, and interfaces with evaluations and incident reporting.


Expected outputs by the end of the main phase are:

(a) a public preprint with a clear proposition spine and robustness checks,

(b) an open-source repository with simulation code and reproducible figures, and

(c) a 6 to 10 page policy brief describing an EU-implementable pathway.

If the project extends, we will polish the outputs for submission and incorporate stakeholder feedback.

Theory of Change

Catastrophic risk from frontier AI is amplified by incentives to scale capabilities faster than safety and governance can keep up, and by weak compliance when monitoring is imperfect. Compute is one of the few levers that can be metered, capped, and coordinated across jurisdictions, but compute governance fails if developers can under-report or shift activity without credible expected penalties.

This project contributes by designing a compute governance mechanism that remains enforceable under imperfect observability. Risk-weighted permits connect the regulated substrate (compute) to measurable risk signals (evaluations), allowing tighter constraints where risk is higher while preserving flexibility for lower-risk activity. A clear audit and penalty architecture makes truthful reporting the best response and reduces the probability of unmonitored frontier training runs. The blueprint output is designed to be implementable: it clarifies minimal monitoring assumptions, specifies enforcement choices, and provides a focal design that German and EU stakeholders can debate, pilot, and refine. Over time, this increases the chance that evaluations and oversight meaningfully constrain real-world scaling, lowering catastrophic risk.

Desired Mentee Background

Computer Science/ML, Maths, Economics, Law, International Relations, Political Science

Desired Mentee Level of Education

Master's degree and above.

Other Mentee Requirements

Strong quantitative reasoning and reliability.
Ability to write clear English.
Comfort reading formal models and translating them into precise prose.
At least one of:
(a) game theory or mechanism design exposure,
(b) strong quantitative microeconomics, or
(c) strong Python ability for simulations.

Consistent weekly progress is mandatory.