1. Opus 4.7 — What's Different

Opus 4.7 performs well on existing Opus 4.6 prompts out of the box, but several behaviors shifted enough that Anthropic dedicates the first section of the guide to tuning. The most important changes:

Response length is now task-calibrated

Opus 4.7 decides its own response length based on perceived task complexity — not a fixed verbosity default. Simple lookups get short answers; open-ended analysis gets long ones. If your product requires consistent length, you now need to specify it explicitly.

Verbosity control
Provide concise, focused responses. Skip non-essential context, and keep examples minimal.

More literal instruction following

Opus 4.7 takes instructions at face value more than 4.6. It won't silently generalize from one item to another or infer requests you didn't make. If an instruction should apply broadly, say so explicitly.

❌ 4.6 behavior (infers scope)
"Format the first section as a table" → Model may format all sections
✓ 4.7 needs explicit scope
"Format every section as a table, not just the first one" → Model applies instruction everywhere

Tone shifted: more direct, fewer emoji

Opus 4.7 is more direct and opinionated by default. It uses less validation-forward phrasing ("Great question!") and fewer emoji than 4.6's warmer style. If your product voice requires warmth:

Warmth prompt
Use a warm, collaborative tone. Acknowledge the user's framing before answering.

Design defaults: Opus 4.7 has an opinionated house style

For frontend/UI generation, Opus 4.7 defaults to warm cream/off-white backgrounds (~#F4F1EA), serif display fonts (Georgia, Fraunces, Playfair), italic accents, and terracotta/amber tones. This suits editorial and hospitality products, but will look wrong for dashboards, fintech, dev tools, or enterprise apps.

Generic instructions like "don't use cream" just switch it to a different fixed palette. Two things that actually work:

Option A — Specify design direction concretely
Design a [type] application. Visual direction: [cold monochrome / dark editorial / minimal enterprise / etc]. Color palette: [exact hex values]. Typography: [font choices]. Corner radius: [value] consistently across cards, buttons, inputs.
Option B — Have Claude propose options first
Before building, propose 4 distinct visual directions tailored to this brief (each as: bg hex / accent hex / typeface — one-line rationale). Ask the user to pick one, then implement only that direction.
Anti-"AI slop" frontend snippet
<frontend_aesthetics> NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white or dark backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character. Use unique fonts, cohesive colors and themes, and animations for effects and micro-interactions. </frontend_aesthetics>

Code review: more literal = apparently lower recall

If your code review harness uses language like "only report high-severity issues" or "be conservative," Opus 4.7 will follow that more faithfully — which can make it appear to miss bugs that it actually found but filtered out. Fix with:

Code review coverage prompt
Report every issue you find, including ones you are uncertain about or consider low-severity. Do not filter for importance or confidence at this stage — a separate verification step will do that. Your goal here is coverage: it is better to surface a finding that later gets filtered out than to silently drop a real bug. For each finding, include your confidence level and an estimated severity so a downstream filter can rank them.
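The prompt above assumes a downstream filter exists. A minimal sketch of what that ranking step might look like, using the severity and confidence fields the prompt asks the model to emit (the field names are assumptions, not from the guide):

```python
# Sketch: rank raw findings downstream instead of letting the model
# self-filter. Field names ("severity", "confidence") are assumptions.
SEVERITY_RANK = {"high": 3, "medium": 2, "low": 1}

def rank_findings(findings: list[dict]) -> list[dict]:
    # Highest severity first; break ties by the model's stated confidence.
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], f["confidence"]),
        reverse=True,
    )

findings = [
    {"id": "unchecked-input", "severity": "low", "confidence": 0.9},
    {"id": "race-condition", "severity": "high", "confidence": 0.4},
]
ranked = rank_findings(findings)
```

A threshold cut (e.g. drop anything below medium severity and 0.3 confidence) can then live in this step, where you can tune it without re-prompting.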

2. Effort Levels — The Most Important Parameter

The effort parameter is Anthropic's primary knob for tuning intelligence vs. cost vs. speed. Opus 4.7 respects effort more strictly than any prior model — especially at the low end, where it scopes work tightly rather than going above and beyond.

Level | Use for | Trade-off
max | Intelligence-demanding tasks, hard reasoning | Most tokens, can overthink
xhigh (new) | Coding and agentic use cases | Best for most coding/agent work
high | Most intelligence-sensitive tasks | Good balance, recommended minimum
medium | Cost-sensitive tasks, reduced token usage | Less intelligence
low | Short tasks, latency-sensitive, not reasoning-heavy | Fast, cheap, shallow reasoning
Key rule: If you see shallow reasoning on complex problems, raise effort — don't prompt around it. If you need low effort for latency but the task involves reasoning, add: "This task involves multi-step reasoning. Think carefully through the problem before responding."
Token budget for max/xhigh effort: Set max_tokens to at least 64k when running at max or xhigh. The model needs room to think across subagents and tool calls.
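A minimal sketch of wiring effort and token budget into a request. The parameter names (output_config, thinking) follow the adaptive-thinking migration example in section 6; verify them against the current SDK docs before relying on this:

```python
# Sketch: pick effort per task and size max_tokens accordingly.
# Parameter names are assumptions taken from the migration example
# in section 6 of this guide.
def build_request(task: str, reasoning_heavy: bool) -> dict:
    effort = "xhigh" if reasoning_heavy else "low"
    return {
        "model": "claude-opus-4-7",
        # max/xhigh effort needs room to think: at least 64k tokens.
        "max_tokens": 64000 if effort in ("max", "xhigh") else 8000,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": task}],
    }

req = build_request("Design a migration plan for the billing schema", True)
```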

3. General Principles

The golden rule of clarity

Anthropic frames it this way: think of Claude as a brilliant but new employee who lacks context on your norms. Show your prompt to a colleague with minimal context and ask them to follow it. If they'd be confused, Claude will be too.

❌ Vague
Create an analytics dashboard
✓ Specific with scope
Create an analytics dashboard. Include as many relevant features and interactions as possible. Go beyond the basics to create a fully-featured implementation.

Give context, not just rules

Explaining why a constraint exists lets Claude generalize correctly. Without context, rules get followed literally in ways you didn't intend.

❌ Rule without context
NEVER use ellipses
✓ Rule with explanation
Your response will be read aloud by a text-to-speech engine, so never use ellipses — the TTS engine won't know how to pronounce them.

Use examples (3–5 is the sweet spot)

Few-shot examples are one of the most reliable output steering techniques. Anthropic recommends wrapping them in <example> tags so Claude distinguishes them from instructions. Make examples diverse enough that Claude doesn't pick up unintended patterns from repetition.

Structure with XML tags

When your prompt mixes instructions, context, examples, and variable inputs, XML tags prevent misinterpretation. Consistent naming across prompts trains more reliable behavior.

XML structure pattern
<instructions>
[What Claude should do]
</instructions>
<context>
[Background information]
</context>
<examples>
<example>[Example 1]</example>
<example>[Example 2]</example>
</examples>
<input>
{{USER_INPUT}}
</input>

Long context (20k+ tokens): put documents first, query last

Placing long documents at the top of the prompt and your query/instructions at the bottom can improve response quality by up to 30% on complex multi-document inputs. Structure multiple documents with XML:

Multi-document structure
<documents>
<document index="1">
<source>annual_report_2025.pdf</source>
<document_content>{{ANNUAL_REPORT}}</document_content>
</document>
<document index="2">
<source>competitor_analysis.xlsx</source>
<document_content>{{COMPETITOR_ANALYSIS}}</document_content>
</document>
</documents>

Analyze both documents. Identify strategic advantages and recommend focus areas.
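Assembling that structure programmatically keeps the ordering rule (documents first, query last) out of hand-edited prompt strings. A sketch in Python (the function name is illustrative):

```python
# Sketch: build a long-context prompt with documents first and the
# query last, per the structure shown above.
def build_long_context_prompt(docs: list[tuple[str, str]], query: str) -> str:
    parts = ["<documents>"]
    for i, (source, content) in enumerate(docs, start=1):
        parts.append(
            f'<document index="{i}">'
            f"<source>{source}</source>"
            f"<document_content>{content}</document_content>"
            f"</document>"
        )
    parts.append("</documents>")
    parts.append(query)  # instructions go last on long inputs
    return "\n".join(parts)
```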

4. Output & Formatting Control

Tell Claude what to write, not what to avoid

Positive instructions ("write in flowing prose paragraphs") outperform negative ones ("do not use markdown"). The model generalizes better from affirmative directions.

❌ Negative instruction
Do not use markdown in your response
✓ Positive direction
Your response should be composed of smoothly flowing prose paragraphs.

Reduce bullet-point overuse

Claude models tend to over-bullet by default. For long-form content, use this snippet:

Prose over bullets
<avoid_excessive_markdown_and_bullet_points> Write in clear, flowing prose using complete paragraphs. Reserve markdown for inline code, code blocks, and simple headings. Avoid bold, italics, and bullet lists unless presenting truly discrete items or the user explicitly requests a list. Incorporate items naturally into sentences rather than fragmenting information into bullet points. </avoid_excessive_markdown_and_bullet_points>

Prefilled responses are deprecated

Starting with Claude 4.6, prefilled responses on the last assistant turn are deprecated; migrate away from them now.
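For illustration, one common migration is to move the prefilled text into an explicit output instruction. A sketch (the JSON scenario is hypothetical, not from Anthropic's guide):

```python
# Before (deprecated in 4.6+): the last assistant turn prefills the
# start of the reply to force an output shape.
prefill_style = [
    {"role": "user", "content": "List three colors as JSON."},
    {"role": "assistant", "content": "{"},  # prefill
]

# After: state the output contract explicitly in the instruction.
instruction_style = [
    {"role": "user",
     "content": "List three colors. Respond with a JSON object only, "
                "no prose before or after it."},
]
```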

5. Tool Use

Explicit action vs. suggestion

Claude models are trained for precise instruction following. "Can you suggest..." means suggest. If you want action, use imperative language.

❌ Gets suggestions only
Can you suggest some changes to improve this function?
✓ Gets actual edits
Change this function to improve its performance.
Proactive action default
<default_to_action> By default, implement changes rather than only suggesting them. If the user's intent is unclear, infer the most useful likely action and proceed, using tools to discover any missing details instead of guessing. </default_to_action>
Conservative action default (for cautious agents)
<do_not_act_before_instructions> Do not jump into implementation or change files unless clearly instructed. When the user's intent is ambiguous, default to providing information and recommendations rather than taking action. </do_not_act_before_instructions>

Maximize parallel tool calls

Claude excels at parallel execution — reading multiple files simultaneously, running parallel searches, batching independent operations. Add this to any agent system prompt:

Parallel tool calling snippet
<use_parallel_tool_calls> If you intend to call multiple tools and there are no dependencies between them, make all independent tool calls in parallel. When reading 3 files, run 3 tool calls simultaneously. Maximize parallel tool use for speed and efficiency. Only call tools sequentially when later calls depend on earlier results. </use_parallel_tool_calls>
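The same principle applies on the client side when executing the tool calls Claude returns: independent calls can be dispatched concurrently. A minimal sketch using asyncio (the tool names and dispatch function are hypothetical):

```python
import asyncio

# Sketch: execute independent tool calls from one assistant turn
# concurrently on the client side (tool names are hypothetical).
async def run_tool(name: str, args: dict) -> str:
    await asyncio.sleep(0)  # stand-in for real I/O (file read, search)
    return f"{name} done"

async def execute_tool_calls(calls: list[tuple[str, dict]]) -> list[str]:
    # No dependencies between calls, so gather them all at once.
    return await asyncio.gather(*(run_tool(n, a) for n, a in calls))

results = asyncio.run(execute_tool_calls([
    ("read_file", {"path": "a.py"}),
    ("read_file", {"path": "b.py"}),
    ("grep", {"pattern": "TODO"}),
]))
```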

6. Thinking & Reasoning

Adaptive thinking replaces budget_tokens

Claude 4.6+ uses adaptive thinking (thinking: {type: "adaptive"}) instead of manual extended thinking with budget_tokens. The model decides when and how much to think based on effort level and query complexity. Anthropic reports adaptive thinking outperforms extended thinking in internal evals.

Migration: extended → adaptive thinking (Python)
# Before (deprecated):
client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=64000,
    thinking={"type": "enabled", "budget_tokens": 32000},
    messages=[{"role": "user", "content": "..."}],
)

# After (current):
client.messages.create(
    model="claude-opus-4-7",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # or xhigh, medium, low
    messages=[{"role": "user", "content": "..."}],
)

Prevent overthinking

At higher effort levels, Opus 4.6/4.7 may think excessively on straightforward tasks. Steer with:

Thinking trigger control
Extended thinking adds latency and should only be used when it will meaningfully improve answer quality — typically for problems that require multi-step reasoning. When in doubt, respond directly.
Prevent decision paralysis
When deciding how to approach a problem, choose an approach and commit to it. Avoid revisiting decisions unless you encounter new information that directly contradicts your reasoning. If weighing two approaches, pick one and see it through.

7. Agentic Systems

State management across context windows

Claude 4.6+ can work across multiple context windows. For long-running tasks, use structured state files:

Long-running task context prompt
Your context window will be automatically compacted as it approaches its limit, allowing you to continue working indefinitely. Do not stop tasks early due to token budget concerns. As you approach your token budget limit, save your current progress and state before the context window refreshes. Be as persistent and autonomous as possible and complete tasks fully, even if budget is approaching.
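A structured state file can be as simple as JSON the agent rewrites before each compaction. A sketch (the file name and schema here are illustrative, not from Anthropic's guide):

```python
import json
from pathlib import Path

# Sketch: a structured state file the agent rewrites before each
# compaction. File name and schema are illustrative.
STATE = Path("agent_state.json")

def save_progress(completed: list[str], remaining: list[str], notes: str) -> None:
    STATE.write_text(json.dumps(
        {"completed": completed, "remaining": remaining, "notes": notes},
        indent=2,
    ))

def load_progress() -> dict:
    return json.loads(STATE.read_text()) if STATE.exists() else {}

save_progress(["audited auth module"], ["write regression tests"],
              "token refresh path is the risky area")
```

After a context refresh, the agent's first action is to read this file back and resume from "remaining".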

Autonomy vs. safety boundaries

Without guidance, Claude may take irreversible actions (deleting files, force-pushing, posting to external services). Define the boundary explicitly:

Reversibility safety prompt
Consider the reversibility and impact of your actions. Take local, reversible actions freely (editing files, running tests). For actions that are hard to reverse, affect shared systems, or could be destructive, ask the user before proceeding.

Actions requiring confirmation:
- Destructive: deleting files/branches, dropping tables, rm -rf
- Hard to reverse: git push --force, git reset --hard, amending commits
- Visible to others: pushing code, commenting on PRs, sending messages
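The same boundary can also be enforced in the harness itself, as a backstop to the prompt. A minimal client-side gate mirroring the prompt's categories (the pattern list is illustrative and deliberately incomplete):

```python
# Sketch: a client-side confirmation gate mirroring the prompt's
# categories. Pattern list is illustrative, not exhaustive.
CONFIRM_PATTERNS = (
    "rm -rf", "git push --force", "git reset --hard",
    "drop table", "git commit --amend",
)

def needs_confirmation(command: str) -> bool:
    lowered = command.lower()
    return any(pattern in lowered for pattern in CONFIRM_PATTERNS)
```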

Prevent overengineering

Claude 4.5 and 4.6 have a tendency to over-build — adding abstractions, extra files, unnecessary configurability. Constrain with:

Anti-overengineering prompt
<keep_it_minimal> Only make changes that are directly requested or clearly necessary. Don't add features, refactor code, or make improvements beyond what was asked. Don't add docstrings or comments to code you didn't change. Don't add error handling for scenarios that can't happen. Don't create helpers for one-time operations. The right amount of complexity is the minimum needed for the current task. </keep_it_minimal>

Prevent test-hacking

Prevent hardcoding / Goodhart's law
Write a general-purpose solution that works correctly for all valid inputs, not just the test cases. Do not hard-code values or create solutions that only work for specific test inputs. Tests verify correctness — they don't define the solution. If tests are incorrect or the task is infeasible, tell me rather than working around them.

Prevent hallucinations in code work

Grounded code answers
<investigate_before_answering> Never speculate about code you have not opened. If the user references a specific file, read the file before answering. Investigate relevant files BEFORE answering questions about the codebase. Never make claims about code before investigating — give grounded, hallucination-free answers. </investigate_before_answering>

8. Quick-Reference Snippet Library

All the most useful prompt snippets from Anthropic's guide, ready to copy:

Verbosity control

Reduce verbosity
Provide concise, focused responses. Skip non-essential context, and keep examples minimal.

Skip preambles

No preamble
Respond directly without preamble. Do not start with phrases like "Here is...", "Based on...", "Certainly!", or "Great question!".

After tool calls — show reasoning

Post-tool summary
After completing a task that involves tool use, provide a quick summary of the work you've done.

Research — structured approach

Structured research prompt
Search for this information in a structured way. As you gather data, develop several competing hypotheses. Track your confidence levels in your progress notes. Regularly self-critique your approach and update a hypothesis tree or research notes file. Break down this complex research task systematically.

Context window management

Context awareness
This is a long task. Plan your work clearly. Work through components completely before moving on. Continue working systematically until the task is complete — do not stop early due to context concerns.

Save every snippet you use

PromptChief is built for exactly this — store your best system prompt snippets, tag them by model, combine them into full system prompts, and track what works. No more losing snippets in Notion docs.

Try PromptChief Free →