The Wrong Audience
There is a quiet assumption embedded in how most people interact with AI: we write prompts the way we write emails. Concise, polished, optimized for human readability. We trim the context. We summarize the background. We get to the point quickly.
The problem is that the audience is not human, and it does not read what we send it the way a human reader would.
Large language models do not process context the way people do. They do not interpret brevity as efficiency. They experience it as missing information. And when information is missing, they do what any system does when operating with incomplete data. They guess. We call those guesses hallucinations. A more useful diagnosis is often simpler: the model was starved of context because the prompt was written for the wrong reader.
Brevity Bias
Brevity bias is the tendency to compress prompts to their shortest useful form. It comes from good instincts: respect for the reader's time, clarity of communication, professional norms around conciseness. These instincts are calibrated for human cognition, where the reader fills in gaps from shared experience, cultural context, common sense about the domain, and inferences from the surrounding conversation.
An LLM has none of that available by default. Every piece of context you omit because "it is obvious" is a piece of context the model genuinely does not have. The resulting loss of accuracy does not announce itself. Instead, the outputs read as confident while being subtly wrong, because the model rarely signals "I am guessing here because you did not give me enough to work with." The confidence is structural: it comes from how the model generates fluent text, not from the model's actual epistemic state about the specific answer.
This is the mechanism behind a large fraction of the hallucinations production teams blame on the model. The model is doing what it is asked to do, with the input it was given. The input was insufficient, so the flawed output was predictable.
What Brevity Costs
The cost shows up in three specific failure modes that recur in the production deployments I have watched.
Responses that address the letter of the prompt but miss the spirit. The user asks for a summary of a document and gets a literal compression that strips out the load-bearing distinctions, because the prompt did not specify which distinctions are load-bearing for the use case the summary will serve.
Code that compiles but misses edge cases that were "obvious" from context never provided. The user asks for a function to validate an email address. The model produces a function that handles the common case. The user later discovers the function fails on internationalized domain names, because nothing in the prompt established that the application has international users.
Analysis that uses the right framework but applies it to the wrong aspect of the problem. The user asks for a SWOT analysis of a strategic option. The model produces a competent SWOT, focused on the wrong unit of analysis (the option in isolation rather than the option in the context of the existing portfolio), because the prompt did not establish what "this option" actually means in the user's strategic context.
In all three cases, the model is not failing. The context engineering is failing. The prompt was written for a colleague who would have asked clarifying questions or filled in the gaps from shared knowledge. By default, the model does neither: it lacks the shared knowledge, and it tends to answer rather than ask.
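The email example can be made concrete. What follows is a minimal sketch, not a complete validator: the function names and simplified patterns are hypothetical, and it assumes Python's built-in idna codec (which follows the older IDNA 2003 rules) is good enough for illustration. The first function is the kind of answer a terse prompt tends to get. The second is what the same request might produce once the prompt states that the application has international users.

```python
import re

# Deliberately simplified patterns, for illustration only.
NAIVE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LOCAL_PART = re.compile(r"[A-Za-z0-9._%+-]+")
ASCII_DOMAIN = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_valid_email_ascii(address: str) -> bool:
    """Common-case check: ASCII local part and ASCII domain only."""
    return NAIVE_EMAIL.fullmatch(address) is not None


def is_valid_email_idn(address: str) -> bool:
    """Same idea, but the domain is converted to its ASCII (Punycode) form first."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    try:
        # Assumption: the built-in "idna" codec is acceptable here.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return False
    return (
        LOCAL_PART.fullmatch(local) is not None
        and ASCII_DOMAIN.fullmatch(ascii_domain) is not None
    )


print(is_valid_email_ascii("ana@example.com"))  # True
print(is_valid_email_ascii("ana@bücher.de"))    # False: the domain is not ASCII
print(is_valid_email_idn("ana@bücher.de"))      # True: the domain is encoded before checking
```

Neither function is wrong in isolation. Which one is correct depends on a fact about the application's users that only the prompt can supply.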
From Prompt Craft to Context Engineering
The shift is from thinking about prompts as communication (human to machine) to thinking about them as context construction (engineering the information environment in which the model reasons). The distinction matters because the two framings produce different prompts even for the same task.
Communication framing produces: "Summarize this document in three bullet points."
Context construction framing produces: "I am going to use this summary in a quarterly board update to investors who are familiar with our product but not with the specifics of this engagement. The audience cares most about commercial impact and risk exposure, less about technical detail. Summarize this document in three bullet points that prioritize commercial implications and flag any risks that the investors would want to discuss."
The second prompt is longer. It feels redundant to anyone trained in concise professional communication. The output it produces is structurally different in a way that matters for the task, because the model now has the context it needs to make the right editorial choices about what to keep and what to compress.
Four moves consistently improve outputs under the context-construction framing.
Include background that seems obvious to a human reader: the user's role, the domain, the eventual audience for the output, the level of technical depth that is appropriate.
Provide examples even when the task seems clear. One concrete example of the desired output style is worth several paragraphs of instruction about format and tone.
Specify constraints that a human colleague would infer: what "good" looks like, what failure modes to avoid, what assumptions the user is or is not making.
Be explicit about uncertainty. If part of the task involves genuine ambiguity in the input, name the ambiguity rather than leaving the model to discover and silently resolve it.
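One way to make the four moves habitual is to treat them as fields to fill in rather than prose to remember. The sketch below is one possible shape, not an established pattern: the PromptContext class, its field names, and the section titles are all hypothetical, and it assumes a plain-text prompt is being assembled before it is sent to whatever model API is in use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PromptContext:
    """Hypothetical container with one field per move described above."""

    task: str
    background: list[str] = field(default_factory=list)      # role, domain, audience, depth
    examples: list[str] = field(default_factory=list)        # samples of the desired output
    constraints: list[str] = field(default_factory=list)     # what "good" means, what to avoid
    open_questions: list[str] = field(default_factory=list)  # ambiguities named up front

    def missing(self) -> list[str]:
        """Return the names of the context fields that were left empty."""
        names = ["background", "examples", "constraints", "open_questions"]
        return [name for name in names if not getattr(self, name)]

    def render(self) -> str:
        """Assemble the prompt text, skipping any empty section."""
        sections = [
            ("Background", self.background),
            ("Examples of the desired output", self.examples),
            ("Constraints", self.constraints),
            ("Known ambiguities", self.open_questions),
        ]
        parts = []
        for title, items in sections:
            if items:
                bullets = "\n".join(f"- {item}" for item in items)
                parts.append(f"{title}:\n{bullets}")
        parts.append(f"Task:\n{self.task}")
        return "\n\n".join(parts)


context = PromptContext(
    task="Summarize this document in three bullet points.",
    background=[
        "The summary goes into a quarterly board update.",
        "Readers are investors who know the product but not this engagement.",
    ],
    constraints=[
        "Prioritize commercial implications over technical detail.",
        "Flag any risks the investors would want to discuss.",
    ],
)
print(context.render())
print("Left empty:", context.missing())
```

The missing() check is the useful part. It turns "did I give the model enough?" into a question with a concrete answer before the prompt is sent.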
Writing prompts this way feels unnatural, and faintly wasteful, when your habits were formed writing for people. The model is not your colleague. It is a reasoning engine that operates on exactly the context you provide. The question worth asking is not "how short can I make this prompt?" but "what does this model need to reason well about this specific problem?"