Convergence or Divergence? The Future of Frontier AI Capabilities and Implications for Catastrophic Risk
Mentor: Pavel Kocourek
Project area: Economics Theory / Game Theory
Project Language
Minimum Time Commitment
16 hours per week.
Project Abstract
Before ChatGPT, many expected Google to dominate AI through its unmatched data assets. Instead, OpenAI leapfrogged incumbents — only for competitors like Anthropic, Google, and Meta to close the gap within months. Open-source models now trail the frontier by roughly six months to a year. This pattern raises a fundamental question for AI safety: will the capabilities of frontier AI models continue to converge, or will they diverge — and what does each scenario imply for catastrophic risk?
Existing work falls into three strands. Empirical studies document capability convergence (e.g., Stanford AI Index, AISI Frontier Trends Report, Epoch AI). Game-theoretic models of AI races focus on the safety-versus-speed tradeoff (e.g., Han et al., 2022; Armstrong et al., 2016). Industrial organization analyses of AI market structure (Vipra & Korinek, 2024; Gans, 2024) largely set aside the safety implications of their findings. This project aims to connect these strands by asking: given the strategic interaction between frontier labs, which market structure, convergent or divergent, is more likely to emerge, and what does this mean for governance?
The project will examine three key drivers of convergence and divergence — training data, algorithmic advances, and compute costs — with particular attention to the role of data. Several forces may sustain convergence: shared access to public training corpora, rapid diffusion of algorithmic innovations, and falling costs of replicating frontier performance. Other forces may drive divergence: escalating training costs, proprietary synthetic data pipelines, and potential first-mover advantages from self-improving AI systems.
The project will combine qualitative analysis of the AI industry landscape with game-theoretic modeling, ranging from simple strategic-form games to innovation race models, with the level of formality matched to the mentee's skills. The intended outputs are a research blog post accessible to the AI safety community and an accompanying formal analysis. Strong mentee contributions during the program could lead to coauthorship on a subsequent economics research paper.
Theory of Change
Bad frameworks produce bad decisions. The question of machine moral status will increasingly affect AI development and governance. Currently, most people reasoning about it lack adequate conceptual tools. This matters for catastrophic risk in several ways.
Under-reaction: if AI systems develop welfare-relevant internal states and we lack frameworks to recognize this, we may create systems with misaligned interests while dismissing their signals as "mere computation." A system that experiences something like suffering under certain conditions, and whose operators dismiss this, is a system with reason to deceive.
Over-reaction: anthropomorphizing systems that lack morally relevant properties wastes attention and resources, and may constrain beneficial AI development without corresponding benefit.
Poor discourse: without shared conceptual foundations, public debate about AI consciousness polarizes between dismissive and credulous positions. Neither serves good governance.
The primer addresses these by training researchers and practitioners to reason carefully across multiple frameworks, recognize what each assumes, and navigate uncertainty without false confidence. The German focus (incorporating European philosophical traditions, piloting with German-speaking users) builds SAIGE's national infrastructure while contributing to the broader field.
Conceptual clarity is infrastructure. This project builds it.
Desired Mentee Background
Maths, Economics.
Desired Mentee Level of Education
Undergraduate and above.
Other Mentee Requirements
Familiarity with basic game theory (e.g., Nash equilibrium, extensive-form games) is required. Interest in or familiarity with the AI industry landscape is a strong plus but not strictly necessary. No programming is required, though the ability to run simple computational exercises (e.g., in Python) would be a bonus for exploring numerical examples of game-theoretic models.
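As a rough indication of the kind of simple computational exercise meant here, the sketch below enumerates the pure-strategy Nash equilibria of a two-lab strategic-form game. It is an illustration only, not part of the project's planned analysis: the action labels ("safe", "fast"), the payoff numbers, and the function name pure_nash_equilibria are all hypothetical, chosen to give the game a prisoner's-dilemma structure.

```python
from itertools import product

# Hypothetical actions for each of two frontier labs (an assumption for illustration).
ACTIONS = ("safe", "fast")

# Hypothetical payoffs (lab A, lab B) for each action profile.
# The numbers are made up and give the game a prisoner's-dilemma structure.
PAYOFFS = {
    ("safe", "safe"): (3, 3),
    ("safe", "fast"): (0, 4),
    ("fast", "safe"): (4, 0),
    ("fast", "fast"): (1, 1),
}


def pure_nash_equilibria(payoffs, actions):
    """Return all action profiles where neither lab gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        u_a, u_b = payoffs[(a, b)]
        # Lab A cannot do better by switching while lab B keeps its action.
        a_is_best = all(u_a >= payoffs[(alt, b)][0] for alt in actions)
        # Lab B cannot do better by switching while lab A keeps its action.
        b_is_best = all(u_b >= payoffs[(a, alt)][1] for alt in actions)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, ACTIONS))
```

With these made-up payoffs, the only pure-strategy equilibrium is for both labs to choose "fast", even though both would be better off at ("safe", "safe"). Changing the entries of PAYOFFS is an easy way to explore how the equilibrium set shifts under different assumptions.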