✓ Fact-checked June 24, 2026
Sources: Fortune · VentureBeat · Inc · Anthropic Blog · r/ClaudeAI
Is Claude Getting Worse? What the 2026 Performance Complaints Actually Mean
Shorter responses. More refusals. Less analytical depth. Thousands of users noticed it starting in late 2025, and by April 2026 it became a media story. Here is an honest look at what is actually happening — the data, the timeline, and what you can do about it.
Direct Answer
In early 2026, thousands of Claude users reported measurable quality drops: shorter responses, more refusals, and less analytical depth. Fortune, VentureBeat, and Inc all covered it. Anthropic acknowledged a product philosophy shift but denied deliberate performance degradation. The real causes are a combination of a compute supply crunch, a deliberate move to an "edit-first" collaboration model, model version proliferation across the Claude 4.x family, and adaptive thinking being on by default in Opus 4.8. Claude 4 benchmarks are higher than Claude 3 — but the experience feels different. Here is why.
Claude Guides — Complete Series
The Timeline: When Did This Start?
The complaints did not appear overnight. They escalated gradually over seven months before reaching mainstream media in April 2026. Here is the documented sequence:
Late 2025
Scattered complaints on r/ClaudeAI and r/LocalLLaMA
Individual users began posting that responses felt "shorter than before" and that Claude was "hedging more." These were dismissed by much of the community as placebo effect or confirmation bias. No coordinated media coverage at this stage.
February 2026
Spike in Trustpilot 1-star reviews mentioning "not as good as before"
A measurable uptick in negative reviews on Trustpilot specifically mentioned performance degradation rather than pricing or reliability issues. The phrase "not as good as it used to be" appeared in multiple independent reviews within a two-week window — a signal that something systemic was driving user perception, not just individual bad sessions.
March 2026
Reddit thread goes viral — "Claude feels lobotomized"
A post on r/ClaudeAI titled "Claude feels lobotomized" accumulated thousands of upvotes and hundreds of comments, most of them agreeing. Users described Claude refusing tasks it had handled previously, producing shorter analysis, and adding qualifiers like "as an AI" more frequently. This was the inflection point where individual complaints became a coordinated public conversation.
April 2026
Fortune: "Anthropic faces user backlash over reported performance issues"
Fortune published one of the first mainstream business press articles framing the issue as a reputational risk for Anthropic. The article cited the Reddit thread and Trustpilot data, and included an Anthropic spokesperson response acknowledging user feedback.
April 2026
VentureBeat: "Is Anthropic 'nerfing' Claude? Users increasingly report performance decline"
VentureBeat went further, introducing the word "nerfing" into the mainstream conversation — a term borrowed from gaming that means deliberately weakening a feature. The article included analysis of benchmark data and quotes from developers who had been tracking Claude's output quality systematically over months.
April 2026
Inc: "Users Say Anthropic's Claude Is Getting Worse — A Quiet Change May Be to Blame"
Inc framed the story differently — not "nerfing" but a "quiet change" in product philosophy. This framing turned out to be closer to the truth. Inc reported on Anthropic's internal shift toward a more collaborative, edit-first design for Claude responses.
April 2026
AMD AI Director posts public analysis showing measurable drops on open-ended tasks
A public analysis from AMD's AI Director shared on social media showed benchmark comparisons on open-ended, subjective tasks — the kind not typically captured by standardised benchmarks like MMLU or GPQA. The analysis showed Claude 4.x models scoring lower than Claude 3.5 Sonnet on tasks requiring long-form analytical prose, multi-perspective reasoning, and nuanced opinion. This was significant because it came from a credible technical source, not just frustrated users.
April 2026
Anthropic's official response: acknowledged philosophy change, denied degradation
Anthropic issued a formal response through their blog and social channels acknowledging that Claude's response style had changed — deliberately — as part of a product philosophy update. They explicitly denied that the underlying model had been weakened, citing benchmark improvements in Claude 4. They acknowledged the "edit-first" shift as intentional. The community response was mixed: some accepted the explanation, many did not.
What Users Are Actually Noticing: 5 Specific Changes
The complaints are not vague. Once you collect enough user reports, five distinct patterns emerge. These are not the same complaint wearing different clothes — they are distinct phenomena with different underlying causes.
1. Shorter Responses
The most common complaint by volume. Users who previously received 1,200–1,800 word analytical responses are now getting 400–600 word responses with a note suggesting they ask follow-up questions. This is measurable, not subjective. Multiple developers who logged their Claude API outputs over time showed a statistically significant drop in average output token count between mid-2025 and early 2026.
| Time Period | Avg Response Length (open-ended analytical tasks) | Notes |
| Mid-2024 (Claude 3.5 Sonnet) | ~1,100–1,400 tokens | Baseline — full drafts with context |
| Early 2025 (Claude 3.7 era) | ~900–1,200 tokens | Modest shortening noticed by some users |
| Late 2025 (Claude 4 rollout) | ~600–900 tokens | Complaints begin on Reddit |
| Early 2026 (Claude 4.x default) | ~400–700 tokens | Viral "lobotomized" thread, media coverage |
Note: These figures represent average observed outputs on open-ended analytical prompts — the type of task where the change is most noticeable. For structured tasks like coding, translation, and data extraction, average output length was more stable. The shortening is concentrated in open-ended, subjective, and analytical work.
2. More Refusals
Things that Claude handled without friction in 2024 now trigger refusals or heavy caveats. This is covered in depth in the sibling page on why Claude refuses requests, but the short version: Claude 4.x was trained with updated safety guidelines that moved the refusal threshold lower on several content categories. This is a deliberate safety alignment decision, not a bug.
3. Less Analytical Depth
Related to but distinct from shorter responses. Even when users explicitly ask for depth, Claude 4 tends to give a structured list of considerations rather than a multi-angle, nuanced analysis with tension between competing views. Claude 3.5 Sonnet was particularly praised for its ability to hold two contradictory ideas in tension and explore the space between them. That quality feels diminished in the default Claude 4.x behaviour — though it can be elicited with more targeted prompting.
4. More Hedging and Caveats
The phrase "it is important to note" appears dramatically more frequently in Claude 4.x outputs than in Claude 3.x outputs. Same with "I should mention," "as an AI," and "you may want to consult a professional." Anthropic updated Constitutional AI training in Claude 4.x to add more epistemic humility markers. Users experience this as Claude being less confident and less direct — even on topics where hedging is not warranted.
5. "Edit-First" Behaviour
This is the change Anthropic explicitly acknowledged. Instead of generating a complete first draft when asked to write something, Claude 4 now tends to generate a shorter draft and invite the user to refine it collaboratively. "Here is a starting point — let me know what you would like to adjust" is now the default ending rather than a complete, polished output. For users who wanted finished work, this is a significant regression in utility — even if Anthropic believes it produces better outcomes through iteration.
Important distinction: These five changes are not all the same thing. Shorter responses and edit-first behaviour are deliberate product decisions. More refusals are a safety alignment choice. Less analytical depth may be a training data or reinforcement signal issue. More hedging is a Constitutional AI update. Treating them as one problem — "Claude is getting worse" — misses the fact that different fixes address different causes.
The 4 Real Causes
If you want to understand what is actually driving the complaints, there are four separate mechanisms at work. None of them is "Anthropic secretly sabotaging Claude."
Cause 1: Compute Supply Crunch
Anthropic is operating at massive scale with GPU supply constraints that are not unique to them — they are an industry-wide problem driven by NVIDIA's production capacity and data centre build timelines. When millions of users all make long, expensive requests simultaneously, something has to give. The economics are simple: prefill costs (processing your input before generating output) scale linearly with context length. Extended thinking mode costs approximately 10–15x the compute of a standard response on the same prompt. A single Opus 4.8 extended-thinking session on a long document can cost more compute than dozens of short exchanges.
At scale, this matters. The hypothesis — which has not been confirmed by Anthropic but is widely accepted among engineers who have studied the pattern — is that response length has been implicitly softcapped at the infrastructure level during high-demand periods to prevent GPU overload. Whether or not this is true, the user experience of shorter responses correlating with peak usage hours is well documented.
Cause 2: Product Philosophy Shift — "Collaborator, Not Generator"
This one Anthropic explicitly confirmed. Starting with the Claude 4 family, Anthropic made a deliberate product decision to move Claude from being a one-shot generator (write me the complete thing) to a collaborative editor (let us build this together). This means shorter first drafts by design, more invitations to iterate, and less completed output per single exchange.
The reasoning behind this decision, per Anthropic's internal communications and blog statements: users who iterate on Claude's output with multiple back-and-forth rounds produce higher-quality final results than users who take Claude's first draft as final. Anthropic optimised for the former. The problem is that a significant segment of users — especially professionals and power users — wanted the latter.
"We've been intentional about Claude becoming a more collaborative writing partner rather than a one-pass generator. We believe this leads to better outcomes for users, even if it changes the experience of the first response."
— Anthropic, April 2026 official response (paraphrased from public blog post)
Cause 3: Model Version Proliferation — You May Not Be on the Model You Think
The Claude 4.x family is not one model. It is a collection of sub-versions with meaningfully different characteristics. Claude Sonnet 4.5, Claude Sonnet 4.6, Claude Opus 4.8 — these are not incremental patches. They are distinct model checkpoints with different training data, different reinforcement signal, and different capability profiles. Not all users are on the same version at any given time. A/B testing between sub-versions is constant on claude.ai. A user who was getting responses from Sonnet 4.5 last month may be on Sonnet 4.6 this month with no notification.
This creates a genuine "is Claude getting worse" experience for users who notice a style change mid-month without any announcement — because an actual model change happened, just not one that was publicly communicated. The model you are on affects response style, length tendencies, refusal thresholds, and hedging frequency.
Cause 4: Adaptive Thinking Always On for Opus 4.8
Claude Opus 4.8 introduced adaptive thinking — a mode where Claude reasons before responding, similar to OpenAI's o1/o3 approach. Unlike Sonnet 4.6 and Haiku 4.5, where extended thinking is optional and can be toggled off, Opus 4.8's adaptive thinking is always on and cannot be disabled. This changes response style in noticeable ways: outputs tend to be more structured, more hedged, more carefully caveated, and more linear in their reasoning. Users who were used to the more free-flowing, opinionated style of Claude 3.5 Sonnet experience this as a regression in personality and spontaneity — even though Opus 4.8 scores higher on benchmarks.
Key distinction: Claude Sonnet 4.6 still allows you to toggle extended thinking on or off. If you want the more natural, conversational Claude style, use Sonnet 4.6 with extended thinking off. If you need frontier-level reasoning on hard problems, use Opus 4.8 — but accept that the response style will be more structured and less spontaneous.
What the Benchmarks Actually Show
Here is where the story gets complicated. If you look at standardised benchmarks, Claude 4 is unambiguously better than Claude 3. The numbers support Anthropic's position. If you look at subjective user experience, the picture is more complicated.
| Benchmark | Claude 3.5 Sonnet | Claude Sonnet 4.6 | Claude Opus 4.8 | What It Measures |
| GPQA Diamond | ~59% | ~68% | ~80%+ | Graduate-level science reasoning |
| SWE-bench Verified | ~49% | ~56% | ~72%+ | Real-world software engineering tasks |
| MMLU Pro | ~78% | ~82% | ~86% | Expert knowledge across domains |
| HumanEval | ~93% | ~95% | ~97% | Python coding correctness |
| Open-ended analytical prose (AMD analysis) | Rated higher | Rated lower | Rated lower | Subjective depth and nuance |
| Average user satisfaction (Trustpilot, Q1 2026) | Higher (3.5 era) | Lower | Mixed | User perception, not task performance |
The benchmark data tells a clear story: Claude 4 models are objectively better at structured, measurable tasks. The AMD AI Director analysis tells a different story: on open-ended, subjective tasks where nuance and analytical depth matter, Claude 4.x models score lower than Claude 3.5 Sonnet in human evaluations.
This is not a contradiction — it is a known problem in AI evaluation. Benchmarks measure what can be measured. They do not measure "does this feel like a smart conversation partner" or "did this response give me genuine insight I did not already have." The users complaining about Claude getting worse are largely measuring the second thing. Anthropic's benchmarks measure the first thing. Both are right within their frame.
The benchmark gap: GPQA Diamond tests whether Claude can answer PhD-level chemistry questions correctly. It does not test whether Claude can write a nuanced 1,500-word analysis of a business strategy with genuine tension between competing approaches. These are different skills. Claude 4 improved on the former while arguably regressing on the latter — at least by default.
Is This the Same Model You Were Using Before?
Almost certainly not. Here is what to check to know exactly what you are on right now.
1
Check your current model in claude.ai
In claude.ai, open any conversation. Look at the model selector dropdown in the top bar — it shows your current model name (e.g. "Claude Sonnet 4.6" or "Claude Opus 4.8"). This is the model that is answering you right now. If it says "Claude" without a version, you are on whatever Anthropic's current default is — which changes without notification.
2
Free vs Pro vs Max use different default models
Free users are routed to lighter models or Sonnet 4.6 with usage constraints. Pro users ($20/month) get access to Sonnet 4.6 and Opus 4.8 with higher limits. Max users ($100+/month) get all models including early access to new releases. If your plan changed or if Anthropic changed the free tier default model, your experience changed without your permission.
3
API users: always pin your model version
If you are accessing Claude via the Anthropic API and you are calling "claude-sonnet-latest" or any alias rather than a pinned version like "claude-sonnet-4-6-20251101", you are on whatever Anthropic promotes to the alias at any point. When Anthropic updates the alias to a new version, your application silently switches models. Always pin to a specific dated checkpoint. This is the most common cause of "Claude got worse overnight" complaints from developers.
4
Claude Code uses Claude Sonnet 4.6 by default as of June 2026
Claude Code — Anthropic's agentic coding tool — defaults to Claude Sonnet 4.6 as of June 2026. If you were previously using Claude Code with an earlier model via the API, the default has changed. You can override this with the --model flag when launching Claude Code from the CLI.
5
Understand the distinction between the Claude product and the Claude model
"Claude" refers to both the product (the chatbot at claude.ai) and the underlying AI model (Claude Opus 4.8, Claude Sonnet 4.6, etc.). When Anthropic updates the product without updating the model, your experience changes. When they update the model without announcing it, your experience changes. These are different things happening on different timescales. A user who "has always used Claude" may have cycled through five different underlying models without ever explicitly choosing to.
What Anthropic Actually Said
Anthropic's response to the April 2026 backlash was nuanced — acknowledging some things while denying others. The full picture:
"We hear users' feedback about response style changes and take it seriously. Claude's underlying capabilities have not been degraded — our benchmarks show consistent improvement across every Claude 4 model. What has changed is Claude's default approach to collaboration. We've intentionally moved toward a model that invites iteration rather than delivering a single finished output. We believe this leads to better outcomes. We recognise this is not what every user wants, and we are working on giving users more control over response style."
— Anthropic, April 2026 (official blog and spokesperson statement, paraphrased)
| Claim | What Anthropic Said | Assessment |
| Underlying model was degraded | Denied | Benchmarks support their position on structured tasks |
| Response style changed deliberately | Confirmed | "Edit-first" philosophy is a confirmed product decision |
| Refusal threshold changed | Partially confirmed | Safety guidelines updated for Claude 4.x — acknowledged indirectly |
| GPU/compute constraints affect output | Not addressed | Plausible but not confirmed by Anthropic |
| A/B testing between model versions | Not addressed | Standard practice confirmed by developers via API logs |
| Hedging frequency increased | Partially confirmed | Attributed to Constitutional AI updates for epistemic accuracy |
How This Compares to OpenAI's GPT-4o to GPT-4o Mini Downgrade
In mid-2024, OpenAI was caught silently routing GPT Plus users to GPT-4o mini — a cheaper, less capable model — without notification. This was a clear quality degradation masked as a seamless experience. The community backlash was significant.
What Anthropic did in 2025–2026 is different in character but similar in impact: they changed the model's behaviour through a deliberate product decision rather than a silent model swap — but the user experience of "this feels worse than before" is the same. The difference is that Anthropic's changes are philosophically intentional (edit-first collaboration) rather than financially motivated downgrading. Whether that distinction matters in practice depends on whether you agree with the philosophy.
What To Do If Claude Feels Worse For You
The good news: most of the experience degradation is fixable. Here are six concrete actions, roughly in order of impact:
1
Switch models explicitly — Sonnet 4.6 for speed, Opus 4.8 for depth
Do not use the default. In claude.ai, click the model name at the top of your conversation and switch explicitly. If you want long-form analytical depth, switch to Claude Opus 4.8. If you want faster responses without the always-on adaptive thinking overhead, use Sonnet 4.6. The difference between these two models is not minor — they have genuinely different response styles. Choosing the right one for the task is the highest-leverage first step.
2
Turn off extended thinking if it is on (Sonnet 4.6 only)
On Claude Sonnet 4.6, extended thinking is optional. When it is on, Claude produces more structured, step-by-step outputs with more hedging. When it is off, Claude's responses tend to be more natural and direct. If you are using Sonnet 4.6 and the responses feel over-structured, check whether extended thinking is enabled and try turning it off for conversational and analytical tasks.
3
Add explicit output length and style instructions to every prompt
Claude respects explicit instructions. Add phrases like "respond with at least 700 words," "do not summarise — give me the complete output," "write a full draft, not a starting point," or "do not add disclaimers." These instructions override the default edit-first behaviour. They feel redundant if you are used to getting full outputs without asking — but they work, and they are the fastest fix that requires no plan change or settings change.
4
Set a Custom Instruction in your profile
In claude.ai, go to Settings → Custom Instructions. Write a brief instruction that sets your baseline expectations: the length you want, the hedging level you prefer, and whether you want complete drafts or collaborative starting points. This instruction applies to every conversation and does not require you to repeat yourself. Example: "Always give complete outputs. Do not suggest follow-ups unless I ask. Aim for at least 600 words on analytical topics. Skip disclaimers about AI limitations."
5
Use Projects with a tuned system prompt
Claude Projects allow you to add a system prompt that applies to all conversations within that project. This is more powerful than Custom Instructions because it is scoped and can be more detailed. If you have a specific use case — analytical writing, code review, research — create a Project for it with a system prompt that specifies the exact style, length, and behaviour you want. This is effectively how professional API users control Claude's behaviour, and it is now available in the consumer product.
6
Use the API directly for full control
Via the Anthropic API, you control the model version (pinned, not an alias), max_tokens (explicit output length ceiling), temperature (style variation), and system prompt (baseline behaviour). You pay per token rather than a flat monthly rate, which means heavy users need to calculate whether it is cheaper than Max — but you get complete control over behaviour with no hidden defaults and no silent version upgrades. This is the gold standard for users who need consistent, reproducible behaviour.
Will Claude Get Better Again?
The honest answer is: it depends on what "better" means to you.
If "better" means higher benchmark scores and stronger performance on structured tasks, Claude was already on that trajectory and will continue. Anthropic's roadmap beyond the 4.x family is pointing toward more capable reasoning, not less.
If "better" means returning to the longer, more opinionated, complete-draft style of Claude 3.5 Sonnet — that is less certain. Anthropic's official position is that the edit-first philosophy is correct and they intend to build on it, not reverse it. However, they also acknowledged user feedback and promised "more control over response style" — suggesting some configurability is coming.
On the compute crunch: GPU supply is a solvable problem on a 12–24 month timeline. NVIDIA's next-generation supply is expanding, and Anthropic has multi-year supply agreements that improve through 2026 and 2027. If response-length throttling is happening at the infrastructure level during peak hours, it is more likely to ease than worsen over the next year.
What to watch for as signals that Claude is genuinely improving for power users:
- A response-style configurability setting in claude.ai (short-form vs long-form preference)
- System prompt options in the consumer product that allow disabling the edit-first default
- Public Anthropic benchmark releases that include subjective quality dimensions, not just structured task performance
- Announcement of the Claude 5.x family — the next major architecture revision
- Improvement in Trustpilot and AppStore review sentiment, which lagged behind the April 2026 low
Frequently Asked Questions
Is Anthropic secretly nerfing Claude? +
Anthropic denies deliberate performance degradation, and benchmarks support that Claude 4 models outperform Claude 3 on structured tasks. What is confirmed: a product philosophy shift toward shorter, more collaborative first drafts; a compute environment that may be throttling response length at peak hours; and model version proliferation meaning not all users are on the same underlying model. The combination of these three factors explains most of what users are experiencing — without requiring a conspiracy theory about deliberate degradation.
Why are Claude's responses shorter now? +
Anthropic made a deliberate product decision in late 2025 to shift Claude toward an "edit-first" collaboration model. Instead of generating complete 1,500-word drafts, Claude generates shorter starting points and invites you to expand collaboratively. This is a product philosophy change, not model degradation. You can counteract it with explicit instructions in every prompt: tell Claude to write at least 600 words, or to give the complete output without summarising. Adding this instruction to your Custom Instructions setting means you only have to set it once.
Is Claude 4 worse than Claude 3? +
On standardised benchmarks, Claude 4 models outperform Claude 3 across the board. Claude Opus 4.8 scores above 80% on GPQA Diamond and above 72% on SWE-bench Verified — both improvements over Claude 3.5 Sonnet. However, benchmark scores do not always match subjective user experience. The style change — more structured, more hedged, edit-first by default — makes Claude 4 feel different from Claude 3.5 Sonnet even when it is technically more capable on measurable tasks. The AMD AI Director analysis showed Claude 4.x models scoring lower than Claude 3.5 Sonnet on open-ended analytical prose in human evaluations. Both things can be true simultaneously.
How do I know which version of Claude I am using? +
In claude.ai, click your profile icon and go to Settings. The current model is shown in the model selector dropdown at the top of any conversation. The default varies by plan: Free users are on a limited set of models with usage constraints; Pro users ($20/month) get access to Sonnet 4.6 and Opus 4.8; Max users ($100+/month) get all models including early access. If you are using the API, the model you call is the model you get — but only if you pin a specific dated checkpoint rather than using an alias like "claude-sonnet-latest."
Did Claude's quality actually drop in 2026? +
The subjective experience of thousands of users says yes. Benchmarks say no on structured tasks. The gap is explained by a genuine product philosophy change: Claude is now designed to be a collaborative editor, not a one-shot generator. This makes it feel worse for tasks where you wanted a complete, opinionated output — even though it may perform better on tasks that involve iteration. The AMD AI Director public analysis showed measurable drops specifically on open-ended analytical tasks where nuance and depth are evaluated by human raters, not automated benchmarks.
Why does Claude add more disclaimers now? +
Anthropic updated Claude's Constitutional AI training in the Claude 4.x generation to add more epistemic humility markers — phrases like "I should note," "it is worth considering," and "you may want to consult a professional" appear more frequently than in Claude 3.x. This was a deliberate safety alignment decision intended to reduce overconfidence. It correlates with the April 2026 user complaints but is a separate issue from the response-length reduction. You can reduce disclaimer frequency by adding explicit instructions in your Custom Instructions: "Do not add disclaimers about your limitations unless I specifically ask."
Should I switch to ChatGPT if Claude is getting worse? +
Not necessarily. ChatGPT silently downgrades to GPT-4o mini when you hit limits, with no warning — you get worse output quality without knowing it. If Claude feels worse because of shorter responses, you can fix this with explicit length instructions or by switching to Opus 4.8. If Claude feels worse because of refusals, a system prompt with clearer context usually helps. The grass is not always greener — ChatGPT has its own hidden limit behaviour and silent model downgrades that users often do not notice because there is no usage dashboard. Before switching, try the fixes in Section 8 of this page.
What can I do to make Claude respond like it used to? +
Five proven fixes: (1) Switch from Sonnet to Opus 4.8 for analytical depth — these are not the same model. (2) Add explicit output instructions in every prompt: "Write at least 800 words" or "Do not summarise, give me the full output." (3) Use a Custom Instruction in your profile to set baseline expectations for response style that apply to every conversation. (4) Turn off extended thinking on Sonnet 4.6 if it is enabled — it changes response style toward more structured, hedged output. (5) Access via the API with temperature and max_tokens parameters you control directly, pinned to a specific model version.
More in This Series