Cloud Security Has a Cynefin Problem

This article offers buyers a framework for stopping tool sprawl: recognizing that they are not just buying features but buying modes of reasoning.

References to Stave describe an open-source project I work on. Claims about it are scoped deliberately, and their limits are noted where they apply. Cynefin is Dave Snowden's framework.

There's a recurring confusion in how teams assemble cloud security tooling. It's about how we keep applying the wrong kind of method to a problem — probabilistic inference where a definite answer was available, and rigid rules where the answer couldn't be known in advance. Dave Snowden's Cynefin framework names this mismatch precisely, and once you see it, the question "which tool do I need?" mostly answers itself — including the parts where the honest answer is "not the one I'm selling."

Complicated vs Complex

Cynefin sorts problems by the nature of cause and effect. Two of its domains carry this whole discussion:

Complicated — cause and effect are knowable. There's a right answer; finding it takes analysis or expertise, but it's deducible from what's in front of you. An experienced engineer (or a good engine) can work it out. The method is analyze, then respond.

Complex — cause and effect are only coherent in retrospect. The system is a tangle of interacting parts, and no amount of staring at any single artifact yields the answer in advance. You can't deduce it; you can only probe, sense, and respond — run experiments, observe what happens, form hypotheses. The answer, when it comes, is a best current explanation, not a proof.

Cynefin has other domains — Clear, where the answer is obvious; Chaotic, where you act first and think later; and a central Confused state where you don't yet know which domain you're in. The last one is where most buyers stand.

The mistake that wrecks tooling decisions is treating a Complex problem as if it were Complicated, or a Complicated one as if it were Complex.

Runtime incidents are Complex. Treat them that way.

A production incident in a live distributed system is the textbook Complex problem. Why did latency spike at 2am? The cause is some interaction of load, a deploy, a slow dependency, a cache change, and a traffic pattern — coherent only after you've reconstructed it. You cannot deduce it from any single source; you have to investigate.

This is the domain of AI investigation agents: tools that connect to your observability, code, and infrastructure and reason across them to find root cause and warn about brewing failures. They build models, evaluate competing hypotheses, and land on a best explanation. One such agent reports something like 70%+ accuracy on novel incidents.

Cynefin point: that 70% is not a weakness. It's the honest ceiling of the Complex domain. When cause and effect are only knowable in retrospect, certainty isn't on the menu, because the problem doesn't have a deducible answer in advance. The logic these agents run is abduction — inference to the best explanation: "these symptoms are best explained by this cause." Abduction is definitionally uncertain; more than one cause can fit the same evidence, which is why the output is a ranked set of hypotheses with a confidence, not a verdict. The anomaly models lean on induction too — generalizing from many past data points — but it's the abductive step at incident time that sets the ceiling. Probabilistic inference is the correct instrument here. A tool that claimed 100% certainty about a novel incident's cause would be lying about the domain it's in. Probe, sense, respond — done well — is right for Complex.
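To make "a ranked set of hypotheses with a confidence" concrete, here is a minimal sketch of abductive ranking. It is not how any particular investigation agent works. The candidate causes, priors, and likelihoods are invented for illustration, and the scoring assumes symptoms are independent given a cause (a naive Bayes simplification).

```python
from math import prod

# Hypothetical numbers, invented for illustration only.
# PRIORS: how plausible each cause is before looking at symptoms.
PRIORS = {"bad_deploy": 0.3, "slow_dependency": 0.5, "cache_change": 0.2}

# LIKELIHOODS[cause][symptom]: how often that cause produces that symptom.
LIKELIHOODS = {
    "bad_deploy": {"latency_spike": 0.7, "error_rate_up": 0.8},
    "slow_dependency": {"latency_spike": 0.9, "error_rate_up": 0.2},
    "cache_change": {"latency_spike": 0.6, "error_rate_up": 0.1},
}


def rank_hypotheses(observed):
    """Return (cause, confidence) pairs, best explanation first."""
    # Score each cause by prior times the likelihood of every observed symptom.
    scores = {
        cause: PRIORS[cause] * prod(LIKELIHOODS[cause][s] for s in observed)
        for cause in PRIORS
    }
    total = sum(scores.values())
    # Normalize so the confidences sum to 1, then sort descending.
    ranked = [(cause, score / total) for cause, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for cause, confidence in rank_hypotheses(["latency_spike", "error_rate_up"]):
        print(f"{cause}: {confidence:.2f}")
```

With both symptoms observed, the deploy ranks first, but the other causes keep nonzero confidence. With only the latency spike observed, the slow dependency ranks first instead. The same evidence fits more than one cause, so the output is a ranking and never a proof.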

The failure mode in this domain is the opposite one: applying rigid, pre-written rules to it. Static thresholds and runbooks are Complicated-domain tools — "if metric > X, alert" — and they fail in Complex environments precisely because you can't enumerate in advance the conditions that will matter. That's why threshold tuning is endless and runbooks are always one incident behind. It's a domain mismatch.

Config posture is Complicated. Treat it that way.

Now a different question: does this configuration satisfy a stated safety rule? Does this IAM policy grant a permission it shouldn't? Is this storage exposed in a way our policy forbids?

These questions have definite answers, and the answers are deducible from the configuration itself. This is Complicated, not Complex. Where the investigation agent reasons by abduction — best guess from evidence — config evaluation reasons by deduction: the conclusion follows necessarily from the configuration and the rule. There's no "probably." The config either satisfies the rule or it doesn't, and a deterministic engine can establish which — the same way, every time, reproducibly.

In this domain, reaching for probabilistic inference is the mismatch. A tool that tells you your config is "87% likely compliant" has thrown away certainty that was available. You don't want a hypothesis about whether your S3 bucket policy violates a rule; you want the verdict, and you want it to be the same verdict tomorrow so an auditor can re-run it. Determinism here isn't a limitation or a conservative choice. It's the method the domain demands. Using a model to answer a deducible question adds cost, latency, and doubt where none needed to exist.
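As a sketch of what "the verdict, the same tomorrow" looks like, the check below evaluates a simplified, S3-style bucket policy against one rule: no unconditioned Allow for a wildcard principal. It is an illustration under stated assumptions, not Stave's API. The function names are hypothetical, and real policy semantics (NotPrincipal, condition contents, account-level public access settings) are richer than this.

```python
def allows_public_access(policy: dict) -> bool:
    """True if any statement allows a wildcard principal with no condition.

    Simplified on purpose: it only inspects Effect, Principal, and the
    presence of a Condition key.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        principal = stmt.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and is_wildcard and "Condition" not in stmt:
            return True
    return False


def evaluate(policy: dict) -> str:
    """Return a definite verdict; the same input always yields the same output."""
    return "FAIL" if allows_public_access(policy) else "PASS"


if __name__ == "__main__":
    public_policy = {
        "Statement": [
            {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject"}
        ]
    }
    print(evaluate(public_policy))
```

There is no confidence score to report. The function is pure, so an auditor re-running it on the same snapshot gets the same verdict.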

This is the domain the tool I work on, Stave, is built for: deterministic evaluation of a configuration snapshot against a catalog of safety invariants, with no model in the loop. The point is that for a Complicated-domain question, deterministic evaluation is the correct method, and inference would be the error.

Compound risk straddles the border

Here's where the framework forces an admission.

The most valuable risks in cloud security are often compound. They emerge from how resources combine, not from any single misconfiguration. A scanner checking one resource at a time can't see them. It's tempting to plant a flag and say "compound risk is our territory, deterministically." But Cynefin won't let me say it, because compound risk lives on both sides of the Complicated/Complex border, and which side a given risk sits on changes which tool is right.

Config-deducible compound risk is Complicated, and deterministic evaluation owns it. If a path's existence is fully determined by the configuration (this function has this role, this role can read this bucket, this bucket holds sensitive data), then the path is deducible from the configuration graph. No runtime observation needed. You can prove it exists in the snapshot by traversing the graph, and graph traversal is a deterministic computation, not a guess. This is legitimately the deterministic tool's domain, and it's the part scanners miss because they look at nodes, not edges.
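The function-to-role-to-bucket path described above can be sketched as a breadth-first traversal over a configuration graph. This is a minimal illustration, not Stave's implementation. The resource names, the edge map, and the sensitive-data label are all hypothetical, and a real snapshot would derive the edges from parsed configuration.

```python
from collections import deque

# Hypothetical configuration graph: an edge means "can assume" or "can read",
# as stated in the configuration snapshot.
EDGES = {
    "function:checkout": ["role:checkout-exec"],
    "role:checkout-exec": ["bucket:orders", "bucket:customer-pii"],
    "function:thumbnailer": ["role:thumb-exec"],
    "role:thumb-exec": ["bucket:images"],
}
SENSITIVE = {"bucket:customer-pii"}


def find_path(edges, start, targets):
    """Breadth-first search; returns a shortest path to any target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in targets:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(find_path(EDGES, "function:checkout", SENSITIVE))
    print(find_path(EDGES, "function:thumbnailer", SENSITIVE))
```

No single node here is misconfigured on its own; the finding exists only in the edges. The result says the path is present in the snapshot. It says nothing about whether the function ever reads the bucket at runtime.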

Behavior-dependent compound risk is Complex, and it is not the deterministic tool's domain. If a path's existence depends on runtime state (does this service invoke that one, is this code path reachable under real traffic, does this permission ever get exercised), then it is not deducible from configuration alone. It only becomes visible by observing the running system. That is the Complex domain, and the deterministic engine is blind to it by construction. Here the investigation agent's probe-sense-respond beats deterministic evaluation, because there is nothing to deduce, only something to observe.

So the honest claim a deterministic config tool can make is narrower than "we catch compound risk." It is: we catch compound risk that is deducible from the configuration graph. The moment a risk's existence depends on behavior, it has crossed into a domain where deduction has nothing to work with and inference is the right method. A config snapshot can prove a path is present in the configuration; it cannot prove the path is exercised in production, and it cannot prove the path is exploitable. There may be controls the snapshot doesn't capture. Those are different questions in a different domain.

The compound-risk story should not imply a reach the tool doesn't have. The reach it does have (config-deducible cross-resource paths, evaluated deterministically) is real and is missed by per-resource scanners. That claim is strong enough without inflation.

The window closes when chaos hits

There's a fourth domain that sharpens the question of when deterministic evaluation is even possible. Cynefin's Chaotic domain is where cause and effect are unknowable in the moment: no time to analyze, no stable state to reason about. An active breach can drag an otherwise-Complicated system into it: things are moving too fast and the system is too compromised to deduce anything. Snowden's prescription for Chaotic is blunt: act first (contain, isolate, shut down) to force the system back into a domain you can reason in.

Deterministic config evaluation is not a Chaotic-domain tool. It does nothing for you mid-breach, when you're containing rather than deducing. Its window is before the slip. The value of evaluating config posture while everything is calm is that deduction is only available while the system is stable; once you're in chaos, the time analysis takes costs more than the answer is worth. The certainty (the configuration) still exists, but you can't afford to look at it while the house is burning, and you're down to acting. So config hygiene isn't "what saves you during the breach"; that's incident response, a different domain with different tools. What config hygiene does is shrink the deducible attack surface beforehand, so fewer paths are standing to be exploited into a chaos event in the first place. The tool that operates inside the Complex-to-Chaotic moment is the investigation agent, not the config evaluator, which, again, is the complementarity.

Why this isn't "which tool wins"

If you map the tools to the domains, the supposed competition dissolves:

Investigation agents are Complex-domain tools. Probabilistic by necessity. Right for "what's happening, what caused it, what will probably break."
Deterministic config evaluation is a Complicated-domain tool. Definite by construction. Right for "does our configuration satisfy the rules, including the deducible cross-resource paths."
Per-resource scanners are also Complicated-domain tools, but they only see one node at a time. They handle the simple deducible questions and miss the deducible combinations.

None of these substitutes for another, because they answer questions in different domains. The runtime risk that only manifests under load is invisible to the config tool. The config violation that's quietly out of policy while nothing's breaking is invisible to the runtime agent. Demanding that one tool cover both domains is the original Cynefin error wearing a procurement hat.

The buyer is in the Confused domain

Cynefin's central state — sometimes called Confused or Aporetic — is where you haven't yet figured out which domain your problem belongs to. That is where a security buyer stands when they say "we already run an investigation agent, why would we need anything else?" The question conflates two domains. They've covered Complex (runtime investigation) and assume it covers Complicated (config posture), or vice versa.

The useful move isn't to argue your tool is better. It's to disambiguate the domain — to help them see they're holding two different problems that demand two different methods. Once the Complex problem and the Complicated problem are named as distinct, the tooling question stops being a contest and becomes an inventory: which of these two problems do I have covered, and with the right kind of method for each?

That's the whole value of dragging Cynefin into a security conversation. It doesn't tell you which vendor to buy. It tells you which kind of answer a given question can even have, and that, more than any feature comparison, stops you from buying a probabilistic answer to a question that deserved a definite one, or a rigid one to a question that was never going to sit still.

Stave is an open-source, deterministic config-evaluation tool I work on (Apache 2.0, early-stage). I've tried to keep its claims inside the Complicated domain where they belong; if you think I've drawn the Complicated/Complex border in the wrong place — a config risk I've called deducible that really needs runtime observation, or vice versa — that's the disagreement worth having. Cynefin is Dave Snowden's; errors in applying it are mine.