For years, the "more instructions = better results" approach to prompting dominated. Long system prompts, all-caps constraints, exhaustive rule lists — if something wasn't working, you'd add more words. That approach is now explicitly called out by OpenAI as a problem for their newer models.
The guidance is split by model: GPT-5.5 has its own section, GPT-5.4 has another, and Codex (the agentic coding model) gets a detailed breakdown for software engineering workflows. There's also a short set of universal patterns that apply across all three. Let's go through each.
GPT-5.5: Less Is More
GPT-5.5 is where the guidance is most surprising. The model is described as performing better with shorter, outcome-focused prompts — not the lengthy instruction stacks that became standard with GPT-4. If your system prompt is over 300 words, it's probably hurting more than it's helping on this model.
1. Define the outcome, not the process
Instead of describing every step a model should take, describe what a successful output looks like. GPT-5.5 is capable enough to infer the process — it doesn't need to be hand-held through each step.
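As a sketch of the difference (wording is illustrative, not taken from OpenAI's guidance):

```
# Process-focused (avoid)
First read the document. Then identify the key points. Then group
them by theme. Then write one paragraph per theme. Then add a summary.

# Outcome-focused (prefer)
Produce a briefing of 3-5 thematic paragraphs plus a two-sentence
summary. A reader should be able to act on it without the source.
```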
2. Separate personality from task instructions
OpenAI specifically recommends separating personality/tone definitions from task-specific instructions. Mixing them creates confusion about which constraint takes priority.
3. Use decision rules instead of absolute constraints
The guidance explicitly says to avoid "ALWAYS" and "NEVER" except for true invariants. These create rigid behavior that breaks on edge cases. Instead, write decision rules that give the model judgment.
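A hedged example of the conversion (the specific rule is mine, for illustration):

```
# Absolute constraint (brittle)
NEVER include code in responses.

# Decision rule (preferred)
Include code only when the user is debugging or explicitly asks for
an implementation; otherwise explain in prose.
```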
4. Set retrieval budgets with explicit stopping conditions
When building RAG systems or agents that search for information, GPT-5.5 needs a clear stopping rule. Without one, it loops on searches trying to find marginally better results. The guidance recommends: "Search up to 3 times. If the result is sufficient after 2, stop."
5. Mark assumptions in creative work
When generating creative content that blends facts with invented content, the guidance recommends explicitly instructing the model to mark assumptions. This prevents confident hallucinations dressed as research.
GPT-5.4: Output Contracts & Verification Loops
GPT-5.4 guidance focuses heavily on structured outputs and verification. The model is designed to follow explicit output contracts and to check its own work — but only if you build that into the prompt.
The Output Contract Pattern
Instead of hoping the model formats its response correctly, define an explicit contract at the start of the system prompt:
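A sketch of what such a contract block could look like (the section names, limits, and forbidden items here are illustrative, not from OpenAI's guidance):

```
OUTPUT CONTRACT
- Format: Markdown with exactly three sections: Summary, Findings, Next Steps
- Length: Summary is at most 2 sentences; total under 300 words
- Tone: neutral, no marketing language
- Forbidden: first-person voice, unverified claims
```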
This pattern is different from telling the model what to do — it tells the model what the finished output looks like. GPT-5.4 follows these contracts far more reliably than format instructions buried in paragraphs.
The Verification Loop
For high-stakes outputs, OpenAI recommends building a lightweight self-check into the prompt. The model checks four things before finalizing:
- Is the content factually grounded in the provided sources?
- Does it match the requested format?
- Is the length within specified limits?
- Does it avoid the forbidden patterns listed in the prompt?
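Folded into a system prompt, the loop might read like this (phrasing is mine; the four checks are from the guidance above):

```
Before finalizing, verify:
1. Every claim is grounded in the provided sources.
2. The response matches the requested format.
3. The length is within the stated limit.
4. None of the forbidden patterns appear.
If any check fails, revise once, then output.
```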
The Follow-Through Default
One of the most practical additions for agentic apps: define a follow-through rule to prevent the model from stopping to ask permission at every step.
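One plausible wording for such a rule (illustrative, not a quoted default):

```
Default to action: if a step follows logically from the user's request
and is reversible, do it without asking. Pause for confirmation only
before destructive or externally visible actions.
```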
Mini vs Nano: What changes
| Model | Requires | Best for | Avoid |
|---|---|---|---|
| GPT-5.4 | Minimal explicit structure | Complex, multi-step tasks | Overly rigid format constraints |
| GPT-5.4 mini | More explicit structure | Bulk processing, classification | Ambiguous instructions |
| GPT-5.4 nano | Very explicit, narrow scope | Single-task, well-defined jobs | Multi-step reasoning chains |
Reasoning effort as a tuning knob
GPT-5.4's API exposes a reasoning_effort parameter. OpenAI's guidance: start at "none" or "low" for execution tasks; only increase to "medium" or above when the task requires genuine reasoning (multi-step analysis, math, complex debugging). Higher effort costs more and takes longer — don't default to maximum.
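A minimal sketch of that policy in code. The parameter name follows OpenAI's Chat Completions style; the model name is the one discussed in this article, and the task taxonomy is an assumption for illustration:

```python
def build_request(task: str, prompt: str) -> dict:
    """Choose a reasoning_effort level based on task type.

    Sketch only: "hard" task categories are illustrative, and the
    model name comes from this article rather than a live API.
    """
    hard = {"analysis", "math", "debugging"}  # tasks that merit real reasoning
    effort = "medium" if task in hard else "low"
    return {
        "model": "gpt-5.4",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

# Execution tasks stay cheap; genuine reasoning gets a higher budget.
cheap = build_request("formatting", "Reformat this table")
hard_call = build_request("debugging", "Why does this segfault?")
```

The point of centralizing the choice in one helper is that "don't default to maximum" becomes a code-review check rather than a per-call judgment.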
GPT-5.3 Codex: Agentic Code Engineering
Codex gets its own section because agentic coding workflows are categorically different from chat. The guidance here is the most detailed — and the most actionable if you're building or using AI coding tools.
Tool hierarchy: always prefer specialized tools
Codex has a preferred order for operations. The guidance says to configure it with this hierarchy: specialized tools first (apply_patch, git, search tools), shell commands last. This prevents fragile string-manipulation workarounds when a proper tool exists.
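One way to encode that hierarchy in a system prompt (the ordering comes from the guidance; the wording is mine):

```
Tool preference order:
1. apply_patch for all file edits
2. git for history, diffs, and branches
3. dedicated search tools for locating code
4. shell commands only when no specialized tool applies
```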
Parallel tool calling by default
One of the biggest performance improvements: instruct Codex to batch independent operations rather than read files sequentially. Reading 5 files one by one is slow; reading all 5 in parallel is fast.
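A short prompt snippet that captures this (illustrative phrasing):

```
When operations do not depend on each other's results (reading
multiple files, running independent searches), issue them as a single
batch of parallel tool calls rather than one at a time.
```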
Autonomy bias: proceed, don't pause
Codex is designed for agentic work. The guidance explicitly recommends an autonomy bias: gather context, plan, implement, test, and refine without asking for additional prompts. Pausing to confirm every decision defeats the purpose of an autonomous coding agent.
Git safety rules are non-negotiable
The guidance is unambiguous: never use destructive git commands (reset --hard, force push, branch -D) without explicit user approval. This should be hardcoded in every Codex system prompt.
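A sketch of the hardcoded block. The first rule restates the article's list; the last two are common git safety conventions added here as assumptions:

```
Git safety (hard rules):
- Never run git reset --hard, git push --force, or git branch -D
  without explicit user approval in this conversation.
- Never rewrite published history.
- Prefer git revert over history rewrites when undoing commits.
```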
Frontend work: no generic layouts
A surprisingly specific instruction for frontend tasks: use intentional design. Avoid placeholder gradients, generic card layouts, and default color schemes. The model should make real visual choices — typography, spacing, color — rather than producing a "looks like every Bootstrap site" output. If you're using Codex for UI work, explicitly say: "Use an intentional, professional design. No generic layouts."
DRY enforcement at model level
Before adding any new function or helper, Codex should search for prior implementations in the codebase. The guidance recommends: "Search for existing implementations before writing new ones. Extract and reuse shared code rather than duplicating."
Universal Patterns: Across All Models
Beyond the model-specific guidance, OpenAI identifies a handful of patterns that improve results regardless of which model you're using; most of them resurface in the audit checklist in the next section.
What This Means for Your Existing Prompts
If you're using prompts that were written for GPT-4 or GPT-4o, most of them will still work — but you're leaving performance on the table. Here's a quick audit checklist:
- Is your system prompt over 300 words? Cut it in half. GPT-5.5 specifically underperforms with verbose instruction stacks.
- Do you use ALWAYS or NEVER for judgment calls? Convert them to decision rules with context and conditions.
- Do you have a format contract? If not, add one. Define sections, length, and tone in a dedicated block.
- Are you using reasoning_effort on every call? Default to low/none for most tasks; reserve high for hard reasoning problems.
- Does your Codex prompt include git safety rules? If not, add them immediately.
The Bigger Picture
The core insight running through all of OpenAI's guidance is a shift from prescriptive to outcome-oriented prompting. Older models needed to be walked through every step. GPT-5.x models are capable of figuring out the steps — what they need is a clear picture of the destination.
The irony is that better models require shorter, more confident prompts. Trust the model more, control it less, and define "done" very clearly. That's the pattern OpenAI is pointing toward with every generation.
For teams managing multiple prompts across different models and use cases, this also highlights why prompt versioning and organization matters. The right GPT-5.5 prompt looks different from the right GPT-5.4 prompt — which looks different from the right Codex prompt. If all your prompts are hardcoded, updating them as guidance evolves is painful. If they're organized and versioned, it's a 5-minute update.
Save, organize, and version your prompts
PromptChief is built for exactly this — store prompt variants per model, track what works, and reuse your best system prompts across every project. Free to start.
Open PromptChief Free →