272 Experts Named the Risks. Nobody Named the Mechanisms.

MIT's AI Risk Repository surveyed 272 international experts (researchers, practitioners, and policymakers from MIT, Harvard, Oxford, Stanford, Tsinghua, national AI safety institutes, and industry) using the Delphi method to answer three direct questions: which AI risks are most severe, who is most vulnerable, and who is responsible for addressing them?

The headline finding: 18 of 24 AI risk domains carry at least a 10% probability of catastrophic outcomes within the next five years under current trajectories. Here "catastrophic" means more than one million deaths, more than $100 billion in damage, or civilization-scale intangible harms such as the collapse of democratic norms.
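To get a feel for the scale of a 10% five-year figure, it can be converted into an equivalent annual probability. The sketch below assumes the risk is spread evenly and independently across the five years; that is a simplifying assumption made here for illustration, not something the study states.

```python
# Assumption (not from the study): the risk is spread evenly across the
# five years, with the same independent probability each year.
five_year_probability = 0.10
years = 5

# Probability of getting through all five years without a catastrophe.
survival_five_years = 1 - five_year_probability

# Constant per-year survival that compounds to the five-year figure.
survival_per_year = survival_five_years ** (1 / years)
annual_probability = 1 - survival_per_year

print(f"Equivalent annual probability: {annual_probability:.2%}")
```

Under that assumption, the five-year threshold corresponds to roughly a 2% chance per year, per risk domain.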

The five most severe risks: dangerous capabilities, competitive dynamics, weapons and cyberattacks, power centralization, and false information. Even under a scenario where pragmatic mitigations are implemented, the probability of catastrophic harm from multiple categories remained above 10%.

This is 272 experts saying: under current practice, the probability of catastrophic outcomes from AI is not small, and current mitigations are not sufficient to bring it below a tolerable threshold. The study is rigorous, the methodology is established, and the finding is clear.

Who is reading it?

The responsibility gap is the translation gap

The study's most structurally important finding isn't about severity. It's about who bears the risk versus who can do something about it.

AI users and the general public are most vulnerable to AI risks. General-purpose AI developers and governance actors are most responsible for addressing them. The study calls this a "responsibility gap" — the people who can act aren't the people who get hurt.

This is the same structural pattern that plays out in every safety-critical industry. The public is most vulnerable to aviation failures, pharmaceutical side effects, nuclear meltdowns. Engineers, manufacturers, and regulators bear primary responsibility for prevention. In those industries, the gap is bridged by mandatory standards, enforcement, liability, and a societal expectation of low risk tolerance. For AI, as the study notes, "comparable mechanisms are nascent or absent."

But there's a gap inside the responsibility gap that the study identifies but doesn't name: the gap between the research that predicts the failures and the engineering teams that are building the systems. The developers who hold responsibility aren't reading the research that tells them their architectures are weak. The researchers who produce the findings aren't translating them into engineering decisions. Nobody in the organizational structure bridges the two. (See "The Gap That Costs More Than Any Bug.")

The MIT study is itself an instance of the problem it describes. It is rigorous, authoritative, and written for researchers and policymakers. The engineering teams building agent systems, the teams whose architectural choices will determine whether these risks materialize, are reading framework blog posts and benchmark results, not Delphi studies. The study says "competitive dynamics carry catastrophic risk." The engineering team sees "Top 30 to Top 5 on Terminal Bench" and ships the improvement. Same world, different information channels, no bridge.

Three findings that connect to specific architectural gaps

The study names risks. It does not name the engineering mechanisms that turn those risks into failures. Here's where the connection lives.

Multi-agent risks — named as catastrophic, unsolved in the dominant architecture

The study defines multi-agent risks as "risks from multi-agent interactions due to incentives or system structure, which can create conflict, collusion, cascading failures, selection pressures, new vulnerabilities, and a lack of shared information and trust." Experts assessed this as carrying catastrophic potential.

The leading agent framework lists multi-agent coordination as an open research problem. The team running the most ambitious agent-generated codebase writes that they don't yet know how architectural coherence evolves over time. The dominant architecture (Model + Harness) has no specification layer, no coordination protocols, and no mechanism for ensuring that independently-generated outputs are globally consistent. (See "The Harness is Half the Architecture.")

Distributed systems engineering solved this problem decades ago with specifications as coordination protocols, contract enforcement, and interface boundaries. The architectural answer exists. It hasn't reached the teams building the systems the MIT study is warning about. That's the translation gap applied to a specific risk the study names as catastrophic.
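As a minimal sketch of what "specifications as coordination protocols" can look like, the Python below checks two independently generated outputs against one shared, declared contract before either is accepted. The names used here (`Contract`, `check_output`, `USER_RECORD`, and the `user_id` and `email` fields) are hypothetical illustrations, not part of any framework discussed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """A shared, declared interface that every producer must satisfy."""
    name: str
    required_fields: dict  # field name -> expected Python type


def check_output(contract: Contract, output: dict) -> list[str]:
    """Return a list of violations; an empty list means the output conforms."""
    violations = []
    for field, expected_type in contract.required_fields.items():
        if field not in output:
            violations.append(f"{contract.name}: missing field '{field}'")
        elif not isinstance(output[field], expected_type):
            violations.append(
                f"{contract.name}: field '{field}' should be {expected_type.__name__}"
            )
    return violations


# Hypothetical contract, agreed before any agent generates anything.
USER_RECORD = Contract("UserRecord", {"user_id": int, "email": str})

# Two agents produce outputs independently; only the contract relates them.
agent_a_output = {"user_id": 42, "email": "a@example.com"}
agent_b_output = {"user_id": "42", "email": "b@example.com"}  # wrong type

violations_a = check_output(USER_RECORD, agent_a_output)
violations_b = check_output(USER_RECORD, agent_b_output)
```

The point of the design is that neither agent needs to know about the other. Global consistency comes from both being measured against the same declared interface, and the mismatch in the second output is caught before it can cascade.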

Competitive dynamics — the second most severe risk, and the one driving architectural shortcuts

Experts ranked competitive dynamics as the second most severe risk category. The study defines it as "competition by AI developers or state-like actors in an AI race to maximize strategic or economic advantage, increasing the risk they release unsafe and error-prone systems."

This is the specific dynamic that produces the architectural shortcuts the current generation of agent systems is built on: self-verification instead of independent verification, because independent verification is slower and harder; accumulation without subtraction, because generating more is easier than curating what belongs; generation loops without verification gates between iterations, because gates slow throughput. Each shortcut is a competitive decision: ship faster by skipping the subsystem that would have caught the failure.
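For contrast, here is a rough sketch of a generation loop that keeps the gate. `generate` stands in for a model call and `verify` for a deterministic check that does not share the generator's judgment; both, along with `fake_generator` and `length_gate`, are hypothetical stand-ins so the example runs without a model.

```python
from typing import Callable, Optional


def generate_with_gate(
    generate: Callable[[int], str],
    verify: Callable[[str], bool],
    max_iterations: int = 5,
) -> Optional[str]:
    """Run a generation loop where every candidate must pass an independent gate."""
    for attempt in range(max_iterations):
        candidate = generate(attempt)
        if verify(candidate):
            return candidate  # only verified output leaves the loop
    return None  # fail closed: no verified candidate, nothing ships


# Hypothetical stand-ins so the sketch runs without a model.
def fake_generator(attempt: int) -> str:
    return "x" * (attempt + 1)


def length_gate(candidate: str) -> bool:
    return len(candidate) >= 3


result = generate_with_gate(fake_generator, length_gate)
blocked = generate_with_gate(fake_generator, lambda candidate: False)
```

The cost the shortcut avoids is visible in the loop: every iteration pays for a check, and a gate that never passes means nothing ships. That is the throughput the race trades away.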

The study says this dynamic carries catastrophic probability. The engineering articles from the teams making these decisions say "these guardrails will almost surely dissolve over time," treating safety mechanisms as temporary scaffolding to be removed once models improve rather than as permanent peer subsystems of the architecture. (See "Agentic Development Manifesto.")

AI security vulnerabilities — a named risk category that conflates four distinct problems

The study lists "AI security vulnerabilities and attacks" as a risk category. But it treats it as one thing. In practice, it's at least four:

Application code vulnerabilities — bugs in the source code (buffer overflows, injections). LLM scanning works here.
Application config — misconfigured settings the code correctly applies. Not a code bug.
Infrastructure code — IaC templates with misconfigurations. Linters and scanners work here.
Infrastructure config — the actual deployed state of cloud resources. Configuration posture, compound cross-resource risk, intent verification. A different class entirely.

The industry's current response: LLM-powered source code scanning covers the first category and is being presented as a comprehensive security solution. The other three categories need different tools with different epistemics: deterministic verification against declared invariants, graph evaluation of cross-resource paths, intent checking against human-declared rules. (See "LLMs Can Find Bugs in Your Code. That's One Class of Security Problem.")
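To make "graph evaluation of cross-resource paths" concrete, the sketch below checks a declared invariant against a snapshot of deployed state. The resource names, the `deployed_edges` snapshot, and the invariant itself are all hypothetical; a real check would load them from the cloud provider's inventory and from rules a human has declared.

```python
from collections import deque

# Hypothetical snapshot of deployed state: resource -> resources it can reach.
deployed_edges = {
    "internet": ["load_balancer"],
    "load_balancer": ["web_server"],
    "web_server": ["app_role"],
    "app_role": ["customer_db"],
    "customer_db": [],
    "build_runner": ["artifact_bucket"],
    "artifact_bucket": [],
}

# Human-declared invariant: nothing sensitive is reachable from the internet.
sensitive = {"customer_db"}


def reachable_from(start: str, edges: dict) -> set:
    """Breadth-first search over the resource graph."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def violated_invariants(edges: dict, sensitive_resources: set) -> set:
    """Return sensitive resources reachable from the internet (empty set = pass)."""
    return reachable_from("internet", edges) & sensitive_resources


violations = violated_invariants(deployed_edges, sensitive)
```

Each hop in this snapshot can look reasonable on its own. The violation only appears when the path is evaluated end to end, which is why a source-code scanner has nothing to look at here.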

The MIT study ranks the risk but doesn't make this four-way distinction, which means the teams reading the study won't realize that the security measures they're implementing cover one quadrant and leave three unaddressed. The risk is named. The mechanism is unnamed. The gap persists.

The map: each named risk to the missing element

There is a law, independently derived by engineering (TRIZ), cybernetics (Beer's Viable System Model), and economics (Co-opetition), that says any viable system must contain five elements: a Tool (the generator, meaning the model), an Engine (declared, machine-checkable intent), a Transmission (CI/CD plus machine-readable contracts), a Control Unit (the independent oracle that measures output against intent and feeds back a deterministic verdict), and a Casing (enforced boundaries the system structurally cannot cross). A system missing any one does not survive. (See "The Harness is Half the Architecture.")

That gives the mechanism vocabulary the study lacks. Each risk MIT names is what failure looks like from the outside; the missing or weakened element is the mechanism on the inside that produces it.

MIT risk (named)	Missing / weak element	The mechanism the study doesn't name
Multi-agent risks (collusion, cascading failures, lack of shared trust)	Transmission + Engine	No contract layer or shared spec — independently-generated outputs fight over a mutable blackboard with nothing enforcing global consistency, so they collide and cascade.
Competitive dynamics (the race to ship)	Control Unit	The race deletes the independent verification gate; self-verification replaces it because gates slow throughput. An LLM grading an LLM is a second Tool wearing a checker's badge, not an oracle.
AI security vulnerabilities and attacks	Control Unit + Casing	Three of the four security quadrants — config, IaC, deployed infra posture — need a deterministic oracle against declared invariants and enforced boundaries. LLM source-scanning only checks the Tool's code output.
Dangerous capabilities	Casing	Capability is bounded only by advisory guardrails ("these will almost surely dissolve over time"), not by boundaries the structure makes uncrossable.
Weapons and cyberattacks	Casing + Control Unit	No enforced egress/permission boundary on what the generator may reach, and no independent check on what it is allowed to produce.
Power centralization	Casing (scope)	The boundary of the game is unenforced; market concentration is a Scope failure — the structure permits unbounded reach.
False / misleading information	Engine + Control Unit	No durable declared intent to measure against and no deterministic verdict — generation runs open-loop with nothing comparing output to ground truth.

Read down the middle column and the pattern is unmistakable: the Control Unit and the Casing are missing or advisory across nearly every catastrophic category. That is not a coincidence. It is the same structural hole, surfacing under seven different risk names. The study counts the symptoms; the law names the cause.
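The difference between an advisory guardrail and a Casing can be sketched in a few lines. In the example below the boundary check lives in the executor rather than in the prompt, so a proposed action outside the boundary fails no matter what the generator says. `ALLOWED_HOSTS`, `BoundaryViolation`, `execute`, and the host names are hypothetical, chosen only for illustration.

```python
ALLOWED_HOSTS = frozenset({"api.internal.example"})  # hypothetical declared boundary


class BoundaryViolation(Exception):
    """Raised when a proposed action falls outside the declared boundary."""


def execute(action: dict) -> str:
    """Carry out a proposed action only if it stays inside the boundary.

    The check lives in the executor, not in the prompt, so the generator
    cannot talk its way past it.
    """
    host = action.get("host")
    if host not in ALLOWED_HOSTS:
        raise BoundaryViolation(f"egress to {host!r} is not permitted")
    return f"sent {action.get('payload', '')!r} to {host}"


ok = execute({"host": "api.internal.example", "payload": "ping"})

try:
    execute({"host": "attacker.example", "payload": "secrets"})
    blocked = False
except BoundaryViolation:
    blocked = True
```

A real system would enforce this at the network or permission layer rather than in application code, but the shape is the same: the boundary is a property of the structure, not a request made to the model.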

What the study gets right that the industry ignores

Three implications from the study that directly challenge current practice:

"Significant risks require substantial mitigations." Under most established risk-governance frameworks, a 10% probability of catastrophic outcome over five years is intolerable — triggering mandatory mitigation. The industry is treating these probabilities as acceptable background risk. They are not, by any established standard.

"Relying on developers' voluntary action alone is insufficient." The study says it plainly: "any individual developer that slows down to invest in safety bears competitive cost. Absent external constraints, AI companies have structural reasons not to act on risks." This is the competitive-dynamics risk stated as an incentive problem. The solution the study prescribes, "rules and people to enforce them," is the same architectural answer the engineering discussion arrives at: mechanical enforcement of declared specifications, not voluntary guardrails. (See "Cloud Security Has a Cynefin Problem.")

"Some risks may be difficult to address through guardrails on AI models alone." The study says structural dynamics like competition and market concentration "may call for measures like competition policy, labour protections, and governance arrangements alongside technical solutions." This is the systems-thinking argument applied at the societal level: the model is one subsystem, and wrapping it in better guardrails doesn't address risks that live in the interactions between subsystems.

The bridge that doesn't exist

The MIT study is the risk side. The engineering blog posts and framework architectures are the building side. The gap between "272 experts say this is catastrophic" and "the leading framework's architecture contradicts the research" is structural. Nobody's job is to carry the finding from one side to the other in language the other side acts on.

The study will be read by researchers and policymakers. The framework blog posts will be read by engineers. Neither group will read the other's output. The risks the study warns about will materialize through the architectural decisions the blog posts describe, and the teams making those decisions will not have seen the study that predicted the outcome.

That's the gap. It costs more than any bug, because bugs are found and fixed. A structural gap between risk research and engineering practice produces failures that were predicted, preventable, and repeated. It is the same pattern that played out with Lamport and Paxos, with microservices and distributed systems, and now with AI agents and the research that already exists to prevent the coming failures. (See "The Gap That Costs More Than Any Bug.")

The question is whether the engineering teams building the systems the study warns about will ever see the findings in a form they can act on before the failures arrive.

References: MIT AI Risk Repository, "Prioritizing the risks from Artificial Intelligence" (2026), Delphi study of 272 international experts.
