System prompts that survive long sessions

Every team writes a careful system prompt and forgets it. The model follows it for the first few turns and then starts ignoring it — not because the prompt is bad, but because conversation history outweighs system instructions in the model’s attention. By turn fifteen, you’re a different system than the one your tests covered.

What drift actually looks like

The format slips first: a model instructed to always respond in JSON starts adding apologetic preambles. Tone slips next: a “professional, concise” persona becomes chatty. Refusal policies erode: the model’s stance on edge cases gets softer the longer the user pushes. None of this is dramatic in any single turn — it’s the cumulative drift that breaks production behavior.
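You can catch the format slippage mechanically before it ships. Below is a minimal sketch, assuming an assistant that is contractually JSON-only; the `detect_format_drift` helper and the sample history are hypothetical, but the check itself is just "does each turn still parse?".

```python
import json

def detect_format_drift(responses: list[str]) -> list[int]:
    """Return the turn indices where a JSON-only assistant stopped
    returning parseable JSON (hypothetical helper for drift audits)."""
    drifted = []
    for turn, text in enumerate(responses):
        try:
            json.loads(text)
        except json.JSONDecodeError:
            drifted.append(turn)
    return drifted

# Example: the model starts prepending apologies around turn 12.
history = ['{"status": "ok"}'] * 11 + ['Sorry about that! {"status": "ok"}']
print(detect_format_drift(history))  # -> [11]
```

Tone and refusal drift are harder to assert on automatically, but the same idea applies: score every turn of a long conversation, not just the first response.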

Anchoring patterns that work

Re-inject critical constraints every N turns, either by refreshing them in the system prompt or by prepending them as a short reminder to the next user message (a sketch follows below). Keep the system prompt short: long prompts get compressed in the model's internal representation, and compressed constraints lose their teeth. Test with conversations that are 20-plus turns long, not 3-turn happy paths. The drift you don't measure is the drift that ships.
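A minimal sketch of the re-injection pattern in Python: the cadence, the constraint text, and the `with_anchoring` helper are illustrative, and the message shape assumes the usual role/content dictionaries rather than any particular SDK.

```python
REINJECT_EVERY = 5  # illustrative cadence; tune against your own drift measurements
CRITICAL_CONSTRAINTS = (
    "Reminder: respond only in JSON. Keep a professional, concise tone. "
    "The refusal policy for restricted topics still applies."
)

def with_anchoring(history: list[dict], user_turn_count: int) -> list[dict]:
    """Return the message list to send, re-injecting constraints every N user turns."""
    messages = list(history)
    if user_turn_count > 0 and user_turn_count % REINJECT_EVERY == 0:
        # Place the reminder just before the latest user message so it sits
        # near the end of the context, where recent turns carry the most weight.
        messages.insert(len(messages) - 1,
                        {"role": "system", "content": CRITICAL_CONSTRAINTS})
    return messages
```

The point is not the exact cadence but that the reminder stays close to the end of the context, where the article's own observation applies: recent history is what the model actually attends to.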

The system prompt is not where you put everything you want the model to remember. It’s where you put what must survive the conversation.
