Write a Prompt for Claude
Purpose
A self-contained, opinionated checklist for writing high-quality prompts for Claude's latest models (Opus 4.8 / 4.7 / 4.6, Sonnet 4.6, Haiku 4.5). It distills Anthropic's canonical prompt-engineering guidance — the overview and prompting best practices pages — into a single reference you can pull in whenever you draft or refine a prompt. Read-only knowledge artifact: it tells you (or an agent acting on your behalf) how to construct a prompt; it does not call any API. The LaTeX/math-formatting guidance from the source is intentionally excluded.
When to Use
- You're drafting a new system prompt or user prompt and want it right the first time.
- An existing prompt is underperforming and you need a structured way to diagnose and fix it.
- You're migrating prompts to a newer Claude model and behavior has shifted (verbosity, tool triggering, thinking, prefill).
- You're building an agentic harness (coding agent, research agent, long-horizon task runner) and need prompt patterns for autonomy, state tracking, and safety.
- You want to refresh this guide from source — the canonical pages are fetchable as clean Markdown (see Workflow step 0 and Gotchas).
Workflow
Optimal retrieval path (how this guide stays current). The canonical docs serve clean Markdown at the same URL with a .md suffix — no browser, login, or scraping needed. recommended_method: fetch. To refresh:
GET https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/overview.md
GET https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices.md
Both return text/markdown; 200. Prefer this over rendering the HTML page. Everything below is the distilled product of those two pages.
0. Prerequisites — before you prompt-engineer
- Define success criteria for your use case (what does a good output look like, measurably).
- Build a way to test against those criteria (even a handful of eval cases).
- Have a first-draft prompt to improve. (No draft? Use the prompt generator in the Claude Console.)
- Confirm prompting is the right lever. Latency and cost are often better solved by model choice or the
effortparameter than by prompt wording.
1. Core principles (apply to every prompt)
- Be clear and direct. Treat Claude as a brilliant new employee with zero context on your norms. State exactly what you want, including output format and constraints. If you want "above and beyond" effort, ask for it explicitly — don't rely on inference. Golden rule: show your prompt to a colleague with minimal context; if they'd be confused, so will Claude.
- Use sequential steps (numbered lists / bullets) when order or completeness matters.
- Add context and motivation. Explain why an instruction matters; Claude generalizes well from the reason. (E.g. "never use ellipses — this is read aloud by TTS, which can't pronounce them" beats a bare "NEVER use ellipses".)
- Use examples (few-shot / multishot). The most reliable way to steer format, tone, and structure. Make them relevant (mirror your real case), diverse (cover edge cases, vary enough to avoid unintended pattern-matching), and structured (wrap each in
<example>tags, the set in<examples>). Aim for 3–5. You can ask Claude to critique or extend your example set. - Structure prompts with XML tags. When a prompt mixes instructions, context, examples, and input, wrap each kind in its own descriptive tag (
<instructions>,<context>,<input>). Use consistent tag names; nest when there's natural hierarchy. - Give Claude a role via the system prompt. Even one sentence ("You are a helpful coding assistant specializing in Python.") focuses tone and behavior.
2. Long-context prompts (20k+ tokens)
- Put longform data at the top, above your query, instructions, and examples. Queries placed at the end can improve response quality by up to ~30% on complex multi-document inputs.
- Wrap each document in
<document>tags with<document_content>and<source>(plus other metadata) subtags; index multiple documents (<document index="1">). - Ground responses in quotes. Ask Claude to first extract relevant quotes into
<quotes>tags, then reason from them. Cuts through document noise.
3. Output and formatting control
- Tell Claude what to do, not what not to do. "Write in smoothly flowing prose paragraphs" beats "Do not use markdown."
- Use XML format indicators. "Write the prose sections in
<smoothly_flowing_prose_paragraphs>tags." - Match your prompt's style to the desired output. Removing markdown from your prompt tends to reduce markdown in the output.
- Be explicit for strict formatting. For heavy markdown control, give a detailed block (e.g. a
<avoid_excessive_markdown_and_bullet_points>instruction). Prefer positive examples of the desired concision over lists of "don'ts." - Verbosity is calibrated, not fixed. Latest models are more concise and may skip post-tool-call summaries. If you want visibility, ask: "After a task involving tool use, provide a quick summary of the work done."
- Structured output: prefer the Structured Outputs feature or tool/enum fields over older tricks; modern models match complex schemas reliably when told to.
4. Migrate away from prefilled responses
Prefilling the last assistant turn is no longer supported on Claude 4.6+ (returns HTTP 400). Replacements:
- Force a format: use Structured Outputs, tools with an enum field, or just ask for the schema.
- Kill preambles: instruct "Respond directly without preamble; don't start with 'Here is…', 'Based on…'." Or emit inside XML tags / strip in post.
- Avoid bad refusals: no longer needed — modern refusal behavior is good; clear user-message prompting suffices.
- Continuations: move the partial text into the user turn ("Your previous response was interrupted and ended with
…. Continue from where you left off."). - Context hydration: inject former prefilled reminders into the user turn, or hydrate via tools / during compaction.
5. Tool use
- Be explicit when you want action. "Can you suggest changes?" often yields only suggestions; "Change this function to improve performance" yields edits.
- Steer the default posture with a system-prompt block:
<default_to_action>(implement rather than suggest, infer intent, use tools to discover missing details) or<do_not_act_before_instructions>(research/recommend, only act when explicitly told). - Dial back forceful language. On 4.5/4.6+, prompts tuned to fix under-triggering now over-trigger. Replace "CRITICAL: You MUST use this tool when…" with plain "Use this tool when…".
- Parallel tool calls: latest models parallelize well natively; a
<use_parallel_tool_calls>block pushes success toward ~100% (make all independent calls together; never parallelize dependent calls or guess missing params). To slow down, instruct sequential execution.
6. Thinking and reasoning
- Adaptive thinking is preferred. 4.6+ uses
thinking: {type: "adaptive"}, where Claude decides when/how much to think based on theeffortparameter and query complexity. It beat extended thinking in internal evals. Migrate off manualbudget_tokens(deprecated) and control depth viaeffort. - Effort guide (Opus 4.8):
xhighfor coding/agentic; minimumhighfor intelligence-sensitive work;mediumfor cost-sensitive;lowfor short, scoped, latency-sensitive tasks;maxfor the hardest tasks (watch for overthinking/diminishing returns). If reasoning looks shallow, raise effort rather than prompting around it. Set a largemax_tokens(start 64k) at high/xhigh so the model has room to think and act. - Don't over-prompt thoroughness. "Default to using [tool]" / "if in doubt, use [tool]" cause overtriggering now. Use targeted "use [tool] when it would enhance understanding," and use lower
effortas a fallback brake. - Prefer general thinking instructions over prescriptive step lists. "Think thoroughly" often beats a hand-written plan. Use
<thinking>tags inside few-shot examples to demonstrate a reasoning style. Ask Claude to self-check before finishing ("verify your answer against [criteria]"). - Commit to an approach when the model thrashes: "choose an approach and commit; avoid revisiting decisions unless new contradicting information appears."
- Word sensitivity: with thinking disabled, Claude Opus 4.5 is especially sensitive to the word "think" and variants — prefer "consider," "evaluate," "reason through."
7. Agentic systems
- Long-horizon / multi-window: lean on state tracking. Use a different first-context-window prompt to set up scaffolding (tests, setup scripts), then iterate on a todo list. Have the model write tests in a structured format (
tests.json) and treat tests as sacrosanct ("It is unacceptable to remove or edit tests"). Create QoL scripts (init.sh). Prefer starting fresh + filesystem discovery over compaction; be prescriptive about startup ("call pwd," "review progress.txt, tests.json, git logs"). - Context awareness / memory: 4.5/4.6 track remaining token budget. If your harness compacts or persists to files, tell the model so it won't wrap up early; pair with the memory tool. Encourage full use of context ("don't stop early due to token budget; save state before refresh").
- State management: JSON for structured state, freeform text for progress notes, git for checkpoints; emphasize incremental progress.
- Balance autonomy and safety: ask for confirmation before hard-to-reverse / destructive / externally-visible actions (deletes,
git push --force, posting, dropping tables); forbid destructive shortcuts and bypassing safety checks (--no-verify). - Research: give clear success criteria, ask for cross-source verification; for complex tasks use a structured prompt (competing hypotheses, confidence tracking, self-critique, hypothesis-tree notes file).
- Subagents: modern models orchestrate natively — let them; watch for overuse (spawning subagents where a direct grep is faster). Add guidance on when subagents are/aren't warranted (parallel, isolated context, independent workstreams vs. simple/sequential/single-file work).
- Prompt chaining is still useful when you need to inspect intermediate output or enforce a pipeline. The common pattern is self-correction: draft → review against criteria → refine, each as a separate call.
- Reduce file sprawl: instruct cleanup of temporary iteration files at task end.
- Avoid overengineering (Opus 4.5/4.6 tend to overbuild): a minimization block — don't add unrequested features/abstractions/docs/defensive code; only validate at system boundaries.
- Avoid teaching-to-the-test / hard-coding: require a general-purpose solution for all valid inputs, not just test cases; no helper-script workarounds; surface infeasible tasks or wrong tests instead of hacking around them.
- Minimize hallucinations in coding:
<investigate_before_answering>— never speculate about code you haven't opened; read referenced files before answering.
8. Model-specific tuning (Claude Opus 4.8 highlights)
- More literal instruction following. It won't silently generalize an instruction or infer unrequested work — state scope explicitly ("apply this to every section, not just the first").
- Verbosity is task-calibrated; if you need a fixed style/length, tune for it ("Provide concise, focused responses. Skip non-essential context, keep examples minimal.").
- Thinking is off unless you set
thinking: {type: "adaptive"}; its triggering is steerable via prompt if it thinks too often. - Tool-use triggering favors reasoning over tool calls; raise
effort(high/xhigh) or instruct explicitly to increase tool usage. - Subagent spawning is conservative by default; give explicit guidance when fan-out is desirable.
- Frontend design has a persistent "house style" (cream backgrounds, serif display, terracotta accent). Generic "don't use cream" just shifts to another fixed palette — instead specify a concrete alternative or have the model propose 3–4 directions first, then build the chosen one. Use a
<frontend_aesthetics>block to avoid "AI slop." - Code review: with strict "only high-severity / be conservative" instructions, 4.8 reports fewer findings (higher precision, lower recall). For coverage, instruct it to report everything (with confidence + severity) and filter downstream.
Site-Specific Gotchas
.mdsuffix = canonical Markdown. Append.mdto anyplatform.claude.com/docs/...page URL to get clean Markdown (text/markdown; charset=UTF-8, HTTP 200), served via an internal rewrite to/docs/.generated-markdown/.... This is the fetch shortcut — don't scrape the rendered HTML.- Fetch via residential proxy is reliable; no auth/anti-bot wall observed.
browse cloud fetch <url>.md --proxiesreturned 200 cleanly. ThebrowseCLI prints an "Update available: x.y.z" banner to stdout before the JSON envelope — slice from the first{beforeJSON.parse, or you'll get a parse error. (jqis not installed in this sandbox.) - Guidance is model-version-specific and dated. The best-practices page is explicitly written for Opus 4.8 / 4.7 / 4.6, Sonnet 4.6, Haiku 4.5 (page last-modified June 2026). Advice that fixes under-triggering on older models causes over-triggering on these — always re-validate against your target model, don't blindly carry old prompts forward.
- Prefill on the last assistant turn is a hard 400 on 4.6+ (and Mythos Preview). Earlier models still support it; assistant messages elsewhere in the conversation are unaffected.
budget_tokensextended thinking is deprecated (still functional on Opus/Sonnet 4.6). Migrate to adaptive thinking +effort; a ~16k budget is safe headroom only as a temporary bridge.- The word "think" is a trigger for Opus 4.5 when thinking is disabled — swap for "consider/evaluate/reason through."
- Don't over-prompt. The single most common migration mistake: leaving "CRITICAL/MUST/if in doubt use X/default to X" language that now backfires. Default to plain phrasing and add force only if you measure undertriggering.
- LaTeX guidance deliberately excluded from this distillation per scope; if you need plain-text math output, see the source page's "LaTeX output" section.
- Always measure. Every prompt change should be validated against your evals/test cases — the source repeatedly stresses empirical testing over intuition (e.g. for recall/F1 on review harnesses, verbosity, thinking-trigger tuning).
Expected Output
This skill produces guidance, not a data record. Two useful shapes:
(a) A structured prompt assembled from the principles above:
System:
You are a <role>. <one-line behavioral framing>.
<instructions>
1. <clear, sequential step>
2. <step — state scope explicitly>
</instructions>
<context>
<why this matters / motivation>
</context>
<examples>
<example>
Input: <...>
Output: <desired format/tone/structure>
</example>
<!-- 3–5 relevant, diverse, structured examples -->
</examples>
<formatting>
<positive description of desired output format>
</formatting>
User:
<longform documents at top, wrapped in <document> tags, if any>
<the query / task, placed AFTER the data>
(b) A diagnostic when refining an existing prompt:
{
"target_model": "claude-opus-4-8",
"symptom": "model only suggests changes instead of making them",
"diagnosis": "instruction phrased as a question; no action directive",
"fix": "replace 'can you suggest...' with imperative 'Change X to...' and add a <default_to_action> block",
"lever_used": "prompt",
"alt_levers_considered": ["raise effort", "model choice"],
"validate_against": "eval suite / test cases"
}
Canonical sources (fetch as .md):
{
"overview": "https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/overview.md",
"best_practices": "https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices.md",
"content_type": "text/markdown; charset=UTF-8",
"status": 200,
"excluded_sections": ["LaTeX output"]
}