Why Restricting AI Code Security Tools Is the Wrong Answer — and What AppSec Programs Actually Need 

I signed the Free Fable letter at freefable.org. I want to explain why — and why the reasoning behind it matters for AI code security beyond any single AI model. 

Cybersecurity defenders are not just critics of technology. We are the builders and operators of the systems that keep real organizations running under pressure. When a tool that has become part of the defensive workflow gets restricted, it is worth asking a harder question than “is this model dangerous?” The correct question is: does restricting it actually reduce risk, or does it primarily slow down the people working to make systems safer? 

That distinction matters enormously right now. 

Fable 5 Is a Meaningful Step Forward — and That Cuts Both Ways 

On June 9, 2026, Anthropic released Claude Fable 5 — the first publicly available model built on the Mythos 5 weights that power Anthropic’s frontier research tier. The capabilities are genuinely significant: a transformational step in what code generation can sustain. Fable 5 handles complex, multi-step agentic tasks at a level prior generations could not match, with the kind of extended, multi-file reasoning that mirrors how modern software actually gets built. Early reaction from development teams has been strong, and the enthusiasm is warranted. 

Here is the thing nobody is saying loudly enough: it is changing the velocity at which insecure code ships, too. 

That double edge is what AppSec programs need to reckon with: Mythos-class models accelerate both secure and insecure code production at the same rate. The issue is not just the cybersecurity classifier restrictions, and not just the pricing. It is the fundamental acceleration of AI code security risk at scale.

The Restriction Problem 

To understand the restriction, it helps to understand the relationship between Fable and Claude Mythos. The Claude Mythos cybersecurity debate starts here. Mythos 5 is Anthropic’s unrestricted frontier model. It is currently accessible only through a limited trusted access program, reserved for a small set of well-resourced organizations that can meet Anthropic’s vetting requirements. The reason, as Anthropic has stated, is that its capabilities in cybersecurity and biology are advanced enough that they could be misused. The company is not overstating that risk. Earlier model generations in the Opus 4.x line demonstrated enough cyber capability that Anthropic experimented with deliberately reducing offensive security performance during training before shipping them.

Fable 5 is Mythos 5 with guardrails. The same underlying weights, with safety classifiers that intercept security-adjacent queries and fall back to Opus 4.8 before responding. At $10 per million input tokens and $50 per million output tokens — 67 to 100 percent higher than comparable frontier offerings — teams are paying a significant premium for a model that downgrades itself on precisely the queries a security team needs. And the classifiers are, by Anthropic’s own acknowledgment, currently broader than ideal. 

For defenders, this is not an abstract inconvenience. Tools like Fable were becoming part of real defensive workflows: helping teams detect vulnerabilities, triage alerts, analyze code, and automate the kind of repetitive review work that consumes security engineering time. There is an enormous mountain of security debt to fix in shipping products right now, and a new mountain of AI-generated code that teams need to secure. The challenge of producing secure AI code at scale is exactly where defenders needed tools like Fable most. That is the part of this debate that gets lost when the conversation collapses into a simple “dangerous model versus safe model” argument. 

The Dual-Use Reality 

There is no contradiction in saying that advanced AI models can be misused and also saying that restricting one of them is the wrong response. Almost every meaningful security tool has dual-use potential. Compilers, debuggers, password crackers, scanners, exploit mitigations, fuzzers, cryptography itself — all of these have been used by both attackers and defenders throughout the history of the field. The policy question has never been whether a technology carries risk. It has always been whether a specific restriction meaningfully reduces that risk, or whether it mainly burdens the people trying to make systems safer. 

When the capability Anthropic is trying to contain exists across multiple models — including foreign and open-weight models that are catching up quickly — restricting one U.S. model does not make that capability disappear. It removes a high-quality, safety-instrumented tool from the hands of defenders who were using it responsibly. And replatforming is not instantaneous. Security programs standardize on specific providers for real operational reasons: cost, contracts, APIs, workflows, logging, procurement, and institutional trust accumulated over time. Telling defenders to “just use something else” ignores how production security teams actually operate. 

We have seen this movie before. In the 1990s, the U.S. government treated strong cryptography as something that could be effectively controlled through export restrictions. The result was not a safer world. The controls weakened products, created friction for legitimate builders, hurt U.S. competitiveness, and did very little to stop determined adversaries from developing or acquiring strong cryptography independently. Eventually, policy reality caught up with technical reality: strong cryptography was reproducible, globally available, and essential to security. By 2000, the restrictions had been largely relaxed.

The same logic applies to AI. If the underlying capability is reproducible and globally distributed, restricting one provider’s model manages the appearance of risk control while making defense harder. That is not a trade worth making. 

The right answer is not “no safety.” The right answer is transparent, evidence-based safety. If models are going to be restricted, those standards need to be scientific, publicly documented where possible, consistent across providers, and narrowly tailored to the actual risk being managed. Otherwise, we are not managing risk — we are performing it. 

The Bigger Problem Fable 5 Reveals 

Pulling tools away from defenders and calling it security is a mistake. But so is assuming that the restriction is the only problem worth solving. The release of Fable 5 — guardrails and all — makes a deeper issue impossible to avoid. 

AI-generated code carries the same vulnerability classes as human-written code: injection flaws, broken authentication, insecure deserialization, secrets left in plaintext, vulnerable dependencies, misconfigured infrastructure. None of these emerge from a model’s safety settings. They emerge from the code itself, regardless of how it was produced. At the velocity Mythos-class generation enables, those flaws do not just appear — they compound. 

The assumption that a sufficiently capable AI model could both write the code and audit it was always fragile. Fable 5’s restrictions make the fragility explicit. But even an unrestricted frontier model running security queries at $10 to $50 per million tokens is not a viable security control at commit volume. The organizations that win in this environment will not be the ones generating the most code. They will be the ones that can review, verify, patch, and ship securely at AI speed. That is a fundamentally different problem from the one most AppSec programs are currently structured to solve.
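To make the per-token pricing concrete, the arithmetic can be sketched as follows. Only the $10 and $50 per-million-token prices come from the figures quoted in this article. The commit volume and tokens-per-review values are hypothetical placeholders, and the function name is purely illustrative rather than part of any vendor API.

```python
# Rough cost arithmetic for running one frontier-model review per commit.
# Prices are the per-million-token figures quoted in this article;
# every volume figure passed in is an assumption supplied by the caller.

INPUT_PRICE_PER_M = 10.0    # USD per million input tokens (from the article)
OUTPUT_PRICE_PER_M = 50.0   # USD per million output tokens (from the article)


def daily_review_cost(commits_per_day: int,
                      input_tokens_per_commit: int,
                      output_tokens_per_commit: int) -> float:
    """Return the estimated USD cost per day of one model review per commit."""
    input_cost = (commits_per_day * input_tokens_per_commit
                  / 1_000_000 * INPUT_PRICE_PER_M)
    output_cost = (commits_per_day * output_tokens_per_commit
                   / 1_000_000 * OUTPUT_PRICE_PER_M)
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 1,000 commits per day, 200k tokens of context
    # in and 5k tokens of findings out per review.
    print(daily_review_cost(1_000, 200_000, 5_000))
```

Substituting a team's real commit counts and context sizes gives a number that can be compared directly against the licensing cost of automated scanners.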

AI Code Security Needs a Dedicated Control Plane 

What AI-speed development actually demands is AppSec functioning as an automated security control plane — not a downstream review step, but an inline governance layer that runs immediately when code is generated. 

The pipeline looks like this: the AI or developer generates code → automated controls run immediately → AI suggests a fix → evidence is produced → a human approves risk-sensitive changes → a policy gate decides whether it can merge. No step is optional. No step depends on a frontier model being willing to engage with security content. 
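The sequence above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any product: the `Finding` and `Change` types, the `toy_secret_scanner` stand-in, and the rule that unfixed high or critical findings block a merge are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List

BLOCKING_SEVERITIES = {"high", "critical"}  # assumed policy threshold


@dataclass
class Finding:
    rule_id: str
    severity: str            # "low" | "medium" | "high" | "critical"
    fixed: bool = False


@dataclass
class Change:
    diff: str
    risk_sensitive: bool = False   # e.g. touches auth or payment code
    human_approved: bool = False
    findings: List[Finding] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)


Scanner = Callable[[str], List[Finding]]
FixSuggester = Callable[[Finding], bool]   # returns True if a fix was applied


def run_pipeline(change: Change, scanners: List[Scanner],
                 suggest_fix: FixSuggester) -> bool:
    """Return True if the change may merge."""
    # Automated controls run immediately on the generated code.
    for scan in scanners:
        change.findings.extend(scan(change.diff))

    for finding in change.findings:
        # A remediation step (AI or otherwise) proposes a fix.
        finding.fixed = suggest_fix(finding)
        # Evidence is recorded for the policy gate and for audit.
        status = "fixed" if finding.fixed else "open"
        change.evidence.append(f"{finding.rule_id}:{finding.severity}:{status}")

    # Risk-sensitive changes need explicit human approval.
    if change.risk_sensitive and not change.human_approved:
        return False

    # The policy gate blocks on any unfixed blocking-severity finding.
    return not any(f.severity in BLOCKING_SEVERITIES and not f.fixed
                   for f in change.findings)


def toy_secret_scanner(diff: str) -> List[Finding]:
    """Stand-in scanner: flags a hard-coded password assignment."""
    if "password=" in diff:
        return [Finding("hardcoded-secret", "critical")]
    return []
```

The point of the sketch is that every stage is a plain function call: nothing in the gate depends on whether a particular model agrees to discuss security content.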

That control plane needs to cover the full surface. SAST for static vulnerability detection across the codebase. Software composition analysis for vulnerable dependencies. Secrets scanning to catch credentials before they reach a branch. Infrastructure-as-code scanning for misconfigured cloud resources. Container scanning for image vulnerabilities. DAST and API testing for runtime behavior. Each layer covers a class of risk the others do not. Removing any of them creates a blind spot that AI-accelerated code volumes will eventually fill. 
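One way to keep those blind spots visible is to check each repository's enabled scanners against the list of layers named above. The sketch below assumes short, made-up layer identifiers (`sast`, `sca`, and so on); they are labels for this example only, not configuration keys from any real tool.

```python
from typing import Iterable, List

# The six layers named in the text, in the order they are listed.
# The short identifiers are hypothetical labels chosen for this sketch.
REQUIRED_LAYERS = ("sast", "sca", "secrets", "iac", "container", "dast")


def coverage_gaps(enabled: Iterable[str]) -> List[str]:
    """Return the required layers that are not enabled, in canonical order."""
    enabled_set = {name.lower() for name in enabled}
    return [layer for layer in REQUIRED_LAYERS if layer not in enabled_set]
```

A repository that reports `["SAST", "sca"]` would come back with four gaps, which a policy gate could treat as a reason to hold merges until coverage is complete.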

Running this stack at AI speed is what separates programs that are scaling with development from programs that are falling behind it. The cost argument is decisive: the combined cost of the full toolchain runs at a fraction of what selective frontier API calls cost at volume. And unlike a frontier model, none of these tools route findings to a less capable engine when the query touches something sensitive. This is what genuine AI code security solutions look like: automated, multi-layered, and independent of any single model’s willingness to engage with security content. 

Veracode’s platform, one of the most comprehensive AI code security solutions available today, is built for exactly this model of AppSec-as-control-plane: broad, automated, multi-signal coverage that runs at development velocity regardless of which AI model is generating the code. Where it goes further is closing the loop on remediation. Veracode Fix applies AI-powered auto-remediation to findings, generating developer-ready fixes grounded in Veracode’s proprietary dataset of real-world vulnerabilities and validated remediations. That is not a frontier model guessing at a fix from generic training data. It is targeted guidance tied directly to the specific finding in the specific codebase, producing the evidence trail that the policy gate needs to make a merge decision.

Advanced models like Fable 5 still have a role, applied selectively to the hardest, most context-dependent risk scenarios where model reasoning genuinely changes the finding. But selective use of advanced models and scalable automated coverage of the routine workload are not competing philosophies. They are the same architecture. 

Measuring What Actually Matters 

One more shift that Fable 5’s release makes harder to avoid: AppSec success cannot be measured by findings created. It has to be measured by risk fixed and shipped. 

A program running the full control plane stack at AI speed, surfacing findings that get remediated and merged, is demonstrably more mature than one running a single frontier model that generates a large findings backlog with no enforcement gate. Volume of findings is a lagging indicator of tooling coverage. Rate of risk reduction is the metric that tells security and engineering leadership whether the program is actually working. 
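The difference between counting findings and counting risk fixed and shipped can be shown with a small calculation. The record type and both function names below are assumptions made for illustration. A real program would pull these timestamps from its own issue tracker or scanner exports.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class FindingRecord:
    opened: datetime
    fixed_and_merged: Optional[datetime] = None   # None while still open


def fix_rate(records: List[FindingRecord]) -> float:
    """Fraction of findings whose fix has actually been merged."""
    if not records:
        return 0.0
    fixed = sum(1 for r in records if r.fixed_and_merged is not None)
    return fixed / len(records)


def mean_days_to_remediate(records: List[FindingRecord]) -> Optional[float]:
    """Average days from opening a finding to merging its fix, or None."""
    durations = [
        (r.fixed_and_merged - r.opened).total_seconds() / 86_400
        for r in records
        if r.fixed_and_merged is not None
    ]
    if not durations:
        return None
    return sum(durations) / len(durations)
```

Two programs with the same number of findings can score very differently on these measures, which is why the raw count says little about whether risk is going down.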

Restricting AI models without providing defenders better tools does not improve that metric. The only way to truly secure AI code at scale is to build the control plane that surrounds it, not to limit access to the models generating it.

The organizations moving in that direction are not waiting for model restrictions to loosen or inference costs to drop. They are building now — because the code velocity is already here, and it is not slowing down. 

FAQ 

Why does Fable 5 restrict cybersecurity queries?
Anthropic’s Responsible Scaling Policy ties deployment restrictions to model capability thresholds. The Claude Mythos cybersecurity classification is central to this: Mythos 5’s capabilities in offensive cybersecurity are advanced enough that Anthropic routes those queries to a less capable model rather than risk misuse. The classifiers are currently broader than intended and catch legitimate defensive work too.

Is SAST alone sufficient for AI-speed development?
No. SAST is a critical layer for static vulnerability detection, but the full surface requires SCA for dependencies, secrets scanning, IaC scanning, container scanning, DAST for runtime behavior, and fuzzing for edge cases where possible. Each covers a risk class the others do not. The combined cost of this stack is a fraction of frontier API calls at volume, and none of it depends on a model’s willingness to engage with security content.

How should AppSec success be measured in an AI-assisted development environment?
By risk fixed and shipped, not by number of findings created. A findings backlog with no enforcement gate is a lagging indicator of tool coverage, not program effectiveness. Rate of risk reduction, time-to-remediation, and policy gate pass rates give security and engineering leadership a clearer picture of whether the program is scaling with development velocity.

How should security teams allocate between frontier AI models and automated scanning?
Advanced models like Fable 5 are best deployed selectively, on the hardest, most context-dependent risk scenarios where model reasoning changes the finding. Automated scanning handles the routine security workload at scale and speed. These are not competing approaches; they are complementary layers in the same control plane.

Download the full 2026 State of Software Security report now to see the security debt implications of AI-generated code.
