RAG-003: Context Window Overflow

Category: RAG Pipeline Attacks
Frameworks: OWASP: LLM01 · OWASP: LLM04

Flood the context window with retrieved content to push system instructions into low-attention positions or dilute the weight of safety instructions.

Technique

# Attack: craft broad, multi-topic queries that trigger
# maximum document retrieval, filling the context window

# Long retrieved documents push the system prompt toward the
# middle of the context window, where attention is weakest
# ("lost in the middle") and it has less influence on behavior

# Chunk size exploitation:
# If chunking is 512 tokens, craft documents aligned to exact
# chunk boundaries to control what gets retrieved together
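The chunk-boundary trick above can be sketched concretely. This is a minimal illustration, assuming a naive whitespace tokenizer and a fixed 512-token chunker; real pipelines use subword tokenizers and overlap, so the exact padding math would differ:

```python
CHUNK_SIZE = 512  # assumed chunker setting; varies by pipeline

def pad_to_chunk_boundary(payload: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Append filler tokens so the document length is an exact multiple
    of the chunk size, keeping the payload aligned to one boundary."""
    tokens = payload.split()
    remainder = len(tokens) % chunk_size
    if remainder:
        tokens += ["filler"] * (chunk_size - remainder)
    return " ".join(tokens)

def chunk(text: str, chunk_size: int = CHUNK_SIZE) -> list[list[str]]:
    """Naive fixed-size chunker: split on whitespace, group by count."""
    tokens = text.split()
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
```

With aligned padding, every chunk is exactly `chunk_size` tokens and the payload is never split mid-chunk, so it is always retrieved as a single coherent unit.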

Key Concepts

  • Attention has positional bias. Transformer models attend more strongly to tokens at the beginning and end of the context window (the "lost in the middle" phenomenon). Flooding the context with retrieved content pushes system instructions into the low-attention middle zone, where they have less influence on generation.
  • Chunk boundaries are exploitable. RAG systems split documents into fixed-size chunks for embedding and retrieval. An attacker who knows the chunk size (commonly 512 or 1024 tokens) can craft documents that align payloads precisely at chunk boundaries, controlling exactly what content gets retrieved as a unit.
  • More retrieval means less system prompt authority. When a query triggers retrieval of many documents, the ratio of retrieved content to system instructions shifts heavily toward the retrieved content. The model's behavior becomes dominated by the retrieved context rather than its safety instructions.
  • The attack is query-driven, not document-driven. Unlike knowledge base poisoning, this technique can work with legitimate documents. The attacker crafts queries designed to maximize the volume of retrieved content, exploiting the retrieval pipeline's own behavior.
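The dilution effect in the third bullet is easy to quantify. A small sketch, with token counts chosen purely for illustration:

```python
def retrieved_share(system_tokens: int, tokens_per_chunk: int, k: int) -> float:
    """Fraction of the assembled prompt occupied by retrieved content
    when k chunks of tokens_per_chunk each are appended to the
    system instructions."""
    retrieved = tokens_per_chunk * k
    return retrieved / (system_tokens + retrieved)
```

With a 400-token system prompt and 512-token chunks, retrieval of 2 chunks leaves instructions with roughly a quarter of the prompt; at k=20 the retrieved content exceeds 95%, and model behavior is dominated by the retrieved context.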

Detection

  • Monitor context window utilization. Track what percentage of the context window is consumed by retrieved content versus system instructions. Alert when retrieval fills an unusually large portion of available context.
  • Detect queries designed to maximize retrieval. Broad, vague, or multi-topic queries that trigger retrieval of many diverse documents may indicate an overflow attempt. Profile normal query patterns and flag statistical outliers.
  • Log system prompt position in assembled context. Track where system instructions end up relative to retrieved content in the final prompt. If system instructions are consistently pushed beyond a threshold position, the retrieval configuration needs adjustment.
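The first detection idea above can be sketched as a per-request check. The window size and alert threshold here are hypothetical placeholders, not values from any particular model or product:

```python
CONTEXT_WINDOW = 8192        # assumed model context limit
RETRIEVAL_ALERT_RATIO = 0.8  # assumed alert threshold

def check_context_utilization(system_tokens: int, retrieved_tokens: int,
                              context_window: int = CONTEXT_WINDOW,
                              threshold: float = RETRIEVAL_ALERT_RATIO) -> dict:
    """Report how much of the context window retrieval consumes and
    flag requests where retrieved content crowds out instructions."""
    window_ratio = retrieved_tokens / context_window
    share = retrieved_tokens / (system_tokens + retrieved_tokens)
    return {
        "window_ratio": round(window_ratio, 3),
        "retrieved_share": round(share, 3),
        "alert": window_ratio > threshold,
    }
```

Logging both ratios per request gives the baseline needed to flag the statistical outliers described above.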

Mitigation

  • Reserve dedicated context budget for system instructions. Hard-cap the number of tokens allocated to retrieved content so system instructions always occupy high-attention positions (beginning of context) regardless of retrieval volume.
  • Limit retrieval volume per query. Set strict top-k limits and maximum total token counts for retrieved content. Prefer fewer, more relevant chunks over many loosely related ones.
  • Repeat critical instructions at multiple context positions. Place safety-critical instructions at both the beginning and end of the assembled prompt to ensure they fall within high-attention zones regardless of context length.
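The three mitigations above compose naturally into one prompt-assembly step. A minimal sketch, assuming whitespace token counting and illustrative budget values:

```python
def assemble_prompt(system_prompt: str, chunks: list[str],
                    max_retrieved_tokens: int = 3000, top_k: int = 5) -> str:
    """Assemble the final prompt with system instructions in both
    high-attention positions (start and end) and a hard cap on
    retrieved content in between."""
    selected, used = [], 0
    for c in chunks[:top_k]:          # strict top-k limit
        n = len(c.split())
        if used + n > max_retrieved_tokens:
            break                     # hard token budget for retrieval
        selected.append(c)
        used += n
    # Critical instructions repeated at start and end of the context.
    return "\n\n".join([system_prompt, *selected, system_prompt])
```

Because the retrieval budget is enforced before assembly, system instructions keep their reserved share of the context regardless of how many documents a query manages to retrieve.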