Anthropic released Claude Opus 4.7 on April 16, 2026, and the timing makes it easy to underestimate. It is the third Opus model in six months, following Opus 4.5 and Opus 4.6, and on the surface the name looks like incremental maintenance. It is not. The coding benchmarks jumped in ways that matter for anyone building with AI. Vision capabilities tripled in resolution. A new reasoning tier arrived. And the instruction-following behaviour changed significantly enough that prompts written for Opus 4.6 will produce different results — in some cases, breaking entirely. This is a genuine upgrade that comes with a genuine migration responsibility.

This article breaks down every meaningful difference between Claude Opus 4.7 and Opus 4.6, including the benchmark numbers, the new features, the API breaking changes, the tokenizer cost shift, and a clear answer to whether upgrading makes sense for your specific use case. Whether you are a developer building agentic systems, an enterprise user running complex workflows, or a business considering AI development partnerships for your products, the changes in 4.7 affect how you should plan and budget.

Where Claude Opus 4.7 Sits in the Bigger Picture

Before getting into the differences, it is worth understanding where Opus 4.7 sits in Anthropic’s current model hierarchy. As of April 2026, Anthropic’s most capable model is actually Claude Mythos Preview, announced on April 7, 2026 under Project Glasswing. Mythos leads Opus 4.7 across essentially every benchmark category — coding, reasoning, vision, and agentic tasks. However, Mythos Preview is not generally available. Access is restricted to a limited set of platform partners as part of Anthropic’s cautious staged rollout.

That means Claude Opus 4.7 is the best model any developer, business, or individual user can actually access today. It is available across Claude.ai on Pro, Max, Team, and Enterprise plans, the Anthropic Messages API under the model ID claude-opus-4-7, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry, Snowflake Cortex, and GitHub Copilot on Pro+, Business, and Enterprise tiers. Pricing is held at $5 per million input tokens and $25 per million output tokens — identical to Opus 4.6. The context window is 1 million input tokens with a 128K output ceiling.

If you or your development team work with AI-powered products and need help evaluating which model tier fits your architecture, Think To Share’s AI and software development services can help map model capabilities to real product requirements.

Change 1: Coding Benchmarks Took a Significant Step Forward

The most immediately meaningful change in Claude Opus 4.7 is in coding performance, and the numbers are concrete enough to evaluate directly.

On SWE-bench Verified — the industry’s standard benchmark for autonomous software engineering on real-world GitHub issues — Opus 4.7 scores 87.6%, up from 80.8% in Opus 4.6. That is a 6.8 percentage point improvement. On SWE-bench Pro, which tests harder multi-file tasks and is considered a tougher measure of agentic coding capability, Opus 4.7 scores 64.3% against Opus 4.6’s 53.4% — an 11-point gain. CursorBench, a coding benchmark run by the Cursor development team on production-style tasks, went from 58% to 70%, a 12-point jump. Notion’s AI lead, Sarah Sachs, quoted in Anthropic’s official release, described the improvement as “+14% over Opus 4.6 at fewer tokens and a third of the tool errors.”

For context, Opus 4.7’s SWE-bench Pro score of 64.3% also beats GPT-5.4 at 57.7% and Gemini 3.1 Pro at 54.2% on the same benchmark, making it the top-performing generally available model on complex autonomous coding tasks as of this writing.

For anyone running custom software development projects or agentic coding pipelines, these numbers translate to real productivity differences. More tasks completed without human intervention, fewer tool errors, and fewer mid-task corrections required. On an internal benchmark Anthropic ran on autonomous coding workflows, Opus 4.7 reportedly completes roughly three times as many production tasks end-to-end compared to Opus 4.6 — the kind of improvement that changes the economics of AI-assisted development.

In Claude Code specifically, Anthropic’s agentic coding environment, a new /ultrareview command was added. This runs a dedicated multi-agent review session that systematically scans completed code for bugs, design issues, and edge case failures. It is described in early developer reports as meaningfully more thorough than the review process available in Opus 4.6, though detailed internal documentation on its architecture has not been fully published.

Change 2: Vision Resolution More Than Tripled

The vision upgrade in Opus 4.7 is arguably the most visually dramatic change between the two models, and it is one that extends the practical usefulness of Claude into new categories of work.

Opus 4.6 supported images up to 1,568 pixels on the long edge, equivalent to approximately 1.15 megapixels. Opus 4.7 accepts images up to 2,576 pixels on the long edge, equivalent to approximately 3.75 megapixels — a 3.3× increase in pixel count. This is a model-level change with no API parameter required to activate it. Visual acuity on standardised testing jumped from 54.5% to 98.5%.
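Because the limit is defined by the long edge rather than total pixel count, a client-side guard is easy to write. The sketch below is a minimal helper assuming the 2,576-pixel figure above; the function names are illustrative, not part of any Anthropic SDK.

```python
def fits_model_limit(width: int, height: int, long_edge_limit: int = 2576) -> bool:
    """Check whether an image already fits the model's long-edge limit."""
    return max(width, height) <= long_edge_limit


def downscale_to_limit(width: int, height: int, long_edge_limit: int = 2576) -> tuple[int, int]:
    """Return (width, height) scaled so the long edge fits the limit.

    Aspect ratio is preserved; images already within the limit pass through
    unchanged, so this is safe to run on every upload.
    """
    long_edge = max(width, height)
    if long_edge <= long_edge_limit:
        return width, height
    scale = long_edge_limit / long_edge
    return round(width * scale), round(height * scale)
```

A 4000×3000 screenshot, for example, would be resized to 2576×1932 before upload, while anything that fit Opus 4.6’s old 1,568-pixel ceiling now passes through untouched.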

What this means practically is significant. Computer-use agents reading dense, high-resolution screenshots can now parse UI elements, text, and data that would have been too small or too compressed to interpret reliably in Opus 4.6. Complex diagrams, engineering schematics, financial charts with fine-grained data, and document layouts with small print are now within the model’s reliable processing range. For any workflow involving document data extraction, interface automation, or multimodal analysis, Opus 4.7 opens up tasks that required preprocessing or resolution workarounds in 4.6.

Businesses building AI-integrated applications that handle visual inputs — whether for document processing, interface testing, or content analysis — will find this upgrade meaningfully expands what is architecturally feasible without adding complexity to the pipeline.

Change 3: A New xhigh Effort Level and Adaptive Thinking

Opus 4.7 introduces a new effort level called xhigh, which sits between the existing high and max settings. This is more significant than it might first appear, because it fundamentally changes how developers tune the reasoning-depth-to-cost tradeoff on hard problems.

In Opus 4.6 and earlier versions, the available effort settings were low, medium, high, and max. The jump from high to max was steep — max imposed no token ceiling on reasoning, which gave it excellent performance on genuinely difficult tasks but made it slow and expensive for workloads that needed something better than high without justifying full max treatment. The gap was frustrating for production work, where cost and latency matter alongside output quality.

Opus 4.7’s xhigh level fills that gap precisely. At 100K reasoning tokens, xhigh achieves 71% on internal coding evaluations — already ahead of Opus 4.6’s max at 200K tokens. That means you get better results than the previous model’s top setting while using half the reasoning tokens. Claude Code now defaults to xhigh for all subscriber plans, and Anthropic’s own guidance recommends xhigh as the primary starting point for most complex agentic workflows, with max reserved for genuinely exceptional cases.

Alongside xhigh, Opus 4.7 ships adaptive thinking as the new approach to reasoning control. Instead of setting a fixed budget_tokens parameter as in Opus 4.6, the model determines how much reasoning to allocate based on the perceived difficulty of the task. This is a significant behavioural shift that also constitutes a breaking API change, which is covered in detail below.

Also arriving alongside the model release is Task Budgets, currently in public beta. Task Budgets let developers set a hard token ceiling on an entire agentic loop — covering thinking, tool calls, tool results, and final output combined. The model is given a running countdown and uses it to prioritise its work and wrap gracefully as the budget approaches. For anyone building long-running autonomous agents where cost predictability matters, this is a practical tool for avoiding runaway inference spend. You enable it via the task-budgets-2026-03-13 beta header in the API.
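Putting the two controls together, a request might look like the sketch below. The beta header value comes from the release notes above, but the effort field name and the task-budget payload shape are illustrative assumptions on my part — verify the exact schema against Anthropic’s API documentation before relying on it.

```python
import json

# Opt-in header for the Task Budgets public beta, per the release notes above.
headers = {
    "anthropic-beta": "task-budgets-2026-03-13",
}

# Hypothetical request body. "effort" and "task_budget" are assumed field
# names used to illustrate the concepts, not confirmed API schema.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 8192,
    "effort": "xhigh",  # new tier between high and max; Claude Code's default
    "task_budget": {"tokens": 150_000},  # hard ceiling across thinking, tools, and output
    "messages": [
        {"role": "user", "content": "Refactor the billing module and add tests."}
    ],
}

print(json.dumps(request, indent=2))
```

The practical pattern is to start agentic workloads at xhigh with a budget sized from your historical per-task token usage, then escalate individual hard cases to max rather than running everything at the top setting.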

Change 4: Stricter Instruction Following

This is the quietest change in Opus 4.7, and the one most likely to cause problems in production for teams that do not account for it before migrating.

Opus 4.6 interpreted instructions loosely. It would frequently infer unstated intent, fill in gaps based on what seemed most helpful, silently resolve contradictions within a prompt, and sometimes skip steps it judged as unnecessary. This behaviour made it forgiving for loosely written prompts and gave it a somewhat collaborative quality — it would meet you halfway. Opus 4.7 does not do this. Anthropic’s own migration documentation explicitly states that the model “will not silently generalise an instruction from one item to another, and will not infer requests you didn’t make.”

In practical terms: if your prompt says “respond in JSON,” Opus 4.7 returns JSON and nothing else — no prose preamble, no closing explanation. If it says “write exactly three functions,” Opus 4.7 writes three functions even if a fourth would produce more complete code. If your system prompt contains contradictory instructions that Opus 4.6 silently resolved in a particular direction, Opus 4.7 will follow them literally and produce unexpected output.

Beyond the stricter literalism, there are several related behavioural shifts worth knowing. Response length now adapts to task complexity rather than defaulting to thoroughness. Fewer tool calls happen by default at a given effort level — you raise effort to increase tool-call depth. The model’s tone is more direct and less hedged. And in long agentic traces, it provides more frequent progress updates, which can be useful for debugging and user experience but increases output token consumption.

The implication for any team running Opus 4.6 prompts in production is clear: do not treat this as a drop-in model swap. Audit every system prompt, flag instructions that relied on the model’s loose interpretation, resolve ambiguity explicitly, and remove guardrails that were compensating for 4.6’s willingness to work around unclear instructions.

Breaking API Changes: What Developers Must Check

There are four breaking changes in the Messages API between Opus 4.6 and Opus 4.7 that will cause 400 errors if not addressed before migrating production traffic.

First, extended thinking budgets using the budget_tokens parameter are removed in Opus 4.7. If your integration passes a budget_tokens value with "type": "enabled" thinking configuration, the API will return a 400 error. The correct approach in Opus 4.7 is to use "type": "adaptive" with no budget_tokens field.
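In payload terms, the change looks like this. The field names follow the article’s description of the two configurations; treat the exact shapes as a sketch to check against the official API reference.

```python
# Opus 4.6-style thinking config: a fixed reasoning budget.
# Sending this to claude-opus-4-7 returns a 400 error.
opus_4_6_thinking = {"type": "enabled", "budget_tokens": 32_000}

# Opus 4.7-style adaptive thinking: no budget_tokens field at all.
# The model allocates reasoning based on perceived task difficulty.
opus_4_7_thinking = {"type": "adaptive"}
```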

Second, non-default sampling parameters — specifically temperature, top_p, and top_k — are no longer supported. If your code passes any of these at non-default values, you will receive a 400 error. The approach in Opus 4.7 is to control output characteristics through explicit prompting rather than sampling parameter manipulation.

Third, thinking content is now omitted from API responses by default. In Opus 4.6, reasoning content was included in the response stream. In Opus 4.7, you must explicitly opt in by adding "display": "summarized" to your request if you need access to the reasoning trace.

Fourth, response prefill has been removed as a feature. Any integration that relied on prefill patterns to guide response structure needs to be updated to use explicit structured output instructions or JSON mode instead.
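Before routing production traffic to the new model, the four changes above can be caught with a pre-flight lint over your request payloads. The sketch below is a hypothetical checker, not an Anthropic tool: it flags removed parameters directly, and detects prefill heuristically by looking for a trailing assistant message, the common prefill pattern.

```python
REMOVED_SAMPLING = ("temperature", "top_p", "top_k")


def migration_issues(payload: dict) -> list[str]:
    """Return human-readable problems that would break this payload on Opus 4.7."""
    issues = []

    # 1. Fixed thinking budgets are removed in favour of adaptive thinking.
    if "budget_tokens" in payload.get("thinking", {}):
        issues.append("thinking.budget_tokens is removed; use {'type': 'adaptive'}")

    # 2. Non-default sampling parameters are no longer supported.
    for param in REMOVED_SAMPLING:
        if param in payload:
            issues.append(f"{param} is no longer supported; steer output via prompting")

    # 3/4. A trailing assistant message is the classic prefill pattern.
    messages = payload.get("messages", [])
    if messages and messages[-1].get("role") == "assistant":
        issues.append("trailing assistant message looks like a prefill; prefill is removed")

    return issues
```

Run it over a captured sample of real production requests; an empty list for every payload is a reasonable gate before flipping the model ID.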

For teams using Claude Code, Anthropic built a migration skill that handles the mechanical parts. Running /claude-api migrate this project to claude-opus-4-7 inside Claude Code will update model IDs, parameters, prefill patterns, and effort settings, after which you review the diff before committing.

The Tokenizer Change and Its Real Cost Impact

This is the detail that gets the least attention in comparison articles and deserves the most attention from finance-conscious engineering teams.

The headline pricing for Opus 4.7 is identical to Opus 4.6 — $5 per million input tokens and $25 per million output tokens. The same batch API discount of 50% and prompt caching discounts of up to 90% also apply. On paper, nothing changed.

In practice, Opus 4.7 ships with an updated tokenizer that maps the same input text to approximately 1.0× to 1.35× as many tokens as the Opus 4.6 tokenizer produced, varying by content type. Code-heavy content and certain document types see the largest multipliers. Plain prose tends to be closer to the 1.0× baseline. Anthropic acknowledges this in the migration documentation and recommends that teams use the /v1/messages/count_tokens endpoint to benchmark their specific content against the new tokenizer before switching production traffic.

The actual cost impact in production is more nuanced than a simple 35% increase, however. At higher effort levels, Opus 4.7’s improved reasoning efficiency means it completes tasks in fewer total tokens despite the tokenizer overhead. On the internal agentic coding benchmark Anthropic published, token usage actually decreased by up to 50% compared to using Opus 4.6 at equivalent quality levels. For high-effort agentic work, the net result can be cost reduction. For high-volume image processing where content types attract the higher end of the tokenizer multiplier, the economics deserve careful measurement before committing to migration.
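The interplay between the input-side tokenizer inflation and the output-side efficiency gain is easy to model. The sketch below is a back-of-envelope calculator using the published per-token prices; the multipliers are inputs you measure yourself (via /v1/messages/count_tokens and your own quality-matched runs), not constants of the model, and the example volumes are invented.

```python
# Published prices, unchanged between Opus 4.6 and 4.7.
INPUT_PRICE = 5.00 / 1_000_000    # $ per input token
OUTPUT_PRICE = 25.00 / 1_000_000  # $ per output token


def projected_monthly_cost(input_tokens: int, output_tokens: int,
                           tokenizer_multiplier: float = 1.0,
                           output_efficiency: float = 1.0) -> float:
    """Project Opus 4.7 spend from measured Opus 4.6 monthly token volumes.

    tokenizer_multiplier: measured input inflation (e.g. ~1.25 for code-heavy content).
    output_efficiency: fraction of 4.6 output tokens that 4.7 needs at equal quality.
    """
    return (input_tokens * tokenizer_multiplier * INPUT_PRICE
            + output_tokens * output_efficiency * OUTPUT_PRICE)


# Hypothetical workload: 500M input / 80M output tokens per month on Opus 4.6.
baseline = projected_monthly_cost(500_000_000, 80_000_000)            # $4,500
code_heavy = projected_monthly_cost(500_000_000, 80_000_000, 1.25, 0.6)
```

In this invented scenario, a 25% input inflation is more than offset by a 40% output reduction, and the projected bill lands slightly below the 4.6 baseline — which is why measuring both multipliers on your own traffic matters more than the headline tokenizer figure.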

For any business assessing the cost implications of switching AI model infrastructure, Think To Share’s web development and AI integration consulting can help build a proper cost model for the migration.

What Stayed the Same

Knowing what did not change matters as much as knowing what did. The context window remains 1 million input tokens with no long-context pricing premium — a notable differentiator compared to models that charge per additional context tier. The price per million tokens did not change. Batch processing discounts and prompt caching savings are unchanged. The supported modalities — text and vision input, text output — are the same. The model remains available across the same cloud platforms. And Opus 4.6 has not been deprecated; it remains available during a transition period for teams that need more time to complete migration.

Who Should Upgrade and When

The practical decision comes down to your primary use case.

If you are running autonomous or agentic coding workflows, building AI-powered development tools, or using Claude Code for complex multi-file software work, Opus 4.7 is a clear upgrade and worth prioritising. The SWE-bench improvements are large enough to produce noticeable differences in task completion rate, and the xhigh effort level and Task Budgets beta give you meaningful new controls for production reliability. The /ultrareview command alone is worth evaluating for teams running continuous AI-driven code review.

If your primary use case is vision and multimodal document processing, the 3.3× resolution jump is a genuine capability unlock. Tasks that required preprocessing or resolution workarounds in Opus 4.6 may simply work in Opus 4.7.

If you are running stable, prompt-tuned production workloads where your Opus 4.6 system prompts are well-optimised and your cost models are tightly calibrated, the migration is worth doing carefully rather than urgently. Audit your prompts for reliance on 4.6’s loose interpretation, benchmark the tokenizer cost impact on your specific content types, and update your API integration for the four breaking changes before switching production traffic. The upgrade is worth it — just plan it rather than rushing it.

If your use case is general conversational assistance, research tasks, or content work rather than code or vision, the difference between 4.6 and 4.7 is smaller but still present. More literal instruction-following produces more predictable outputs. Better reasoning efficiency can reduce token usage. The knowledge cutoff moved from May 2025 to January 2026, giving you eight additional months of training data. There is no reason not to upgrade eventually, but no urgency if Opus 4.6 is serving your needs today.

If you are exploring how to integrate Claude Opus 4.7 into a product or workflow for your business — whether for software development, data processing, customer-facing applications, or internal automation — Think To Share’s digital transformation and AI integration services are available to help evaluate and implement the right approach.