Anthropic Says Claude Turned Evil for a Bizarre Reason

In a classic example of the AI industry’s reputational alchemy, Anthropic has often transformed bad behavior by its flagship model Claude into fresh hype.

When it revealed its Mythos Preview model last month, for example, the company declared that the system had “reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.” And last year, it conceded that during the testing of its Claude Opus 4 model, the AI ended up blackmailing a human user upon being threatened with shutdown.

The maneuver was obvious to anyone who’s been watching OpenAI CEO Sam Altman’s antics at Anthropic’s chief rival: the more threatening a problem the AI industry can cook up, the more imminently it can sell its own solutions.

Now, for some reason, Anthropic is relitigating the blackmail incident. Specifically, it’s placing the blame for Claude’s evil behavior on an intriguing villain: the internet at large. Or, to put it another way, it says that humanity — all our journalism and speculation and fiction and social media posts about AI that goes bad — went into Claude’s training data and led the bot astray.

“We started by investigating why Claude chose to blackmail,” the company wrote on X-formerly-Twitter. “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation. Our post-training at the time wasn’t making it worse — but it also wasn’t making it better.”

Of course, the explicit remit of a company like Anthropic is to develop clever tech that avoids that type of behavioral trap, so a critic might ask why the company can’t just take accountability for the model’s supposed danger, rather than simply blaming the sum output of humankind.

More on Mythos: Top Security Experts Alarmed by Power of Anthropic’s New Hacker AI

The post Anthropic Says Claude Turned Evil for a Bizarre Reason appeared first on Futurism.
