A Meta-Analysis of the AI Safety Research Landscape

Mentor: Ihor Kendiukhov
Project area: Technical AI Safety

Project Language

English only.

Minimum Time Commitment

10 hours per week.

Project Abstract

This proposal outlines a comprehensive meta-research project designed to systematically analyze the field of AI safety. As investment in AI capabilities accelerates, it is critical to understand whether safety research is keeping pace, addressing the most pressing risks, and allocating resources effectively. This project will create a data-driven map of the AI safety research landscape, empirically evaluate the validity of current safety benchmarks, and analyze the strategic ecosystem of funding and talent that shapes the field. The ultimate goal is to produce actionable recommendations that help researchers, funders, and policymakers identify and fill critical gaps, fostering a more robust and effective safety ecosystem.
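To make the "data-driven map" concrete, a first pass at landscape mapping could tag paper abstracts against a subfield taxonomy and count coverage per subfield. The sketch below is purely illustrative: the subfield names, keyword lists, and toy abstracts are assumptions, not the project's actual taxonomy or data, and a real pipeline would draw on scraped metadata and richer methods (e.g., embedding-based clustering) rather than keyword matching.

```python
from collections import Counter

# Hypothetical subfield keyword lists -- illustrative stand-ins only,
# not the project's actual taxonomy.
SUBFIELDS = {
    "interpretability": ["interpretability", "circuits", "probing"],
    "evaluations": ["benchmark", "evaluation", "red-teaming"],
    "alignment theory": ["corrigibility", "inner alignment", "mesa-optimization"],
}

def tag_abstract(abstract: str) -> list:
    """Return every subfield whose keywords appear in the abstract."""
    text = abstract.lower()
    return [field for field, kws in SUBFIELDS.items()
            if any(kw in text for kw in kws)]

def landscape_counts(abstracts: list) -> Counter:
    """Count how many abstracts touch each subfield."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(tag_abstract(abstract))
    return counts

# Toy corpus standing in for scraped paper metadata.
papers = [
    "We study interpretability via probing classifiers.",
    "A new benchmark for red-teaming language models.",
    "Mesa-optimization and inner alignment failures.",
]
print(landscape_counts(papers))
```

Even this crude tally makes gaps visible: subfields with persistently low counts relative to their assessed importance are candidates for the "neglected areas" the project aims to surface.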

Theory of Change

By creating a clear, empirical, and objective picture of the AI safety field's strengths, weaknesses, and systemic biases, this project will empower key decision-makers to act more strategically.


Informing Funders: Our findings will provide philanthropic and government funders with a map of neglected areas, allowing them to direct resources toward high-impact, underfunded research.

Guiding Researchers: The landscape map will help new and existing researchers identify promising, high-impact research directions that are currently being overlooked.

Improving Governance: Our analysis of benchmarks and evaluation practices will inform the development of more robust standards for AI safety, helping policymakers and third-party auditors avoid the pitfalls of "safetywashing."


Ultimately, this research acts as a feedback mechanism for the AI safety ecosystem itself, helping it to become more self-aware and effective in its mission to reduce catastrophic AI risks.

Desired Mentee Background

Computer Science/ML, Maths, Cognitive Science.

Desired Mentee Level of Education

Any level.

Other Mentee Requirements

Mentees must know the basics of alignment theory and why alignment is hard (i.e., be familiar with Yudkowsky's core arguments).