Context Engineering Architecture
Author: Danial Hasan, CTO @ Squad

The Convergence

Three independent teams just validated the same architectural insight:
  • Google ADK (3 days ago): “Context is a compiled view over a richer stateful system.”
  • Stanford/SambaNova ACE paper (October 2025): “Treat contexts as evolving playbooks that accumulate, refine, and organize strategies.”
  • Squad (months ago, in production): “We learned this a few months ago when building our active context management systems.”
When Google’s agent framework team, Stanford researchers, and a startup building multi-agent systems all independently arrive at the same architecture, that architecture is probably correct.

The Problem We Hit

Month 1 of Squad:
Agent A gathers context (5,000 tokens)
Agent A passes everything to Agent B
Agent B receives 5,000 tokens of "history"
Agent B starts saying "As I mentioned earlier..."
Agent B never mentioned anything. Agent A did.
Identity confusion. Agent B hallucinated that it had Agent A’s conversation.

This wasn’t a prompt problem. Our prompts were clear: “You are Agent B, the Engineer.” It was a context problem: we flooded Agent B with Agent A’s history, and the model couldn’t distinguish “context I’m receiving” from “conversation I’m having.” The failure rate: 39% of multi-agent handoffs showed identity confusion artifacts.

The Wrong Mental Model

Most agent frameworks handle context like this:
context = ""
context += system_prompt
context += user_message
context += tool_result_1
context += tool_result_2
context += agent_response
# ... keep appending forever
This is prompt engineering thinking applied to context. It treats context as a string to optimize, not a system to architect.

The Compiler Mental Model

Source code (what you store):
  • Sessions
  • Memory
  • Artifacts (files)
  • Full structured state
Compiler pipeline (how you transform):
  • Named processors
  • Sequence of passes
  • Observable transformations
Compiled output (what the model sees):
  • Working context
  • Minimal, relevant, scoped to this call
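The compiler model above can be sketched in code. This is an illustrative sketch, not Squad’s actual implementation: the `ContextProcessor` shape, the `compileContext` function, and the example passes are all assumptions chosen to show the idea of named, observable passes over stored state.

```typescript
// Hypothetical types: "source code" (full stored state) vs "compiled output"
// (what the model sees). Names are illustrative, not a real framework API.
type State = {
  session: string[];              // full conversation history
  memory: Record<string, string>; // long-lived facts
  artifacts: string[];            // files and other stored objects
};

type WorkingContext = { parts: string[] };

// A named processor: one observable pass in the compilation pipeline.
type ContextProcessor = {
  name: string;
  run: (state: State, ctx: WorkingContext) => WorkingContext;
};

// Compile: run each pass in sequence; the model only ever sees the output.
function compileContext(state: State, passes: ContextProcessor[]): WorkingContext {
  return passes.reduce((ctx, p) => p.run(state, ctx), { parts: [] as string[] });
}

// Example passes (assumptions, for illustration only).
const recentTurns: ContextProcessor = {
  name: "recent-turns",
  run: (s, ctx) => ({ parts: [...ctx.parts, ...s.session.slice(-3)] }),
};
const pinnedMemory: ContextProcessor = {
  name: "pinned-memory",
  run: (s, ctx) => ({ parts: [...ctx.parts, ...Object.values(s.memory)] }),
};
```

Because each pass is named, you can log, test, and reorder transformations individually instead of debugging one opaque string concatenation.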

Squad’s Three-Tier Architecture

┌─────────────────────────────────────────────┐
│  TIER 3: IMMUTABLE (Audit Log)              │
│  - All receipts ever generated              │
│  - Storage: S3 / long-term                  │
├─────────────────────────────────────────────┤
│  TIER 2: PERSISTENT (Shared Database)       │
│  - Current task context                     │
│  - Frozen contracts                         │
│  - Storage: Vector DB + Relational DB       │
├─────────────────────────────────────────────┤
│  TIER 1: EPHEMERAL (Working Context)        │
│  - What THIS agent sees for THIS call       │
│  - Compiled from Tier 2 + Tier 3            │
│  - Storage: LLM context window              │
└─────────────────────────────────────────────┘
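The three tiers map naturally onto code. The sketch below is a minimal illustration under assumed names (`AuditLog`, `SharedState`, `compileWorkingContext`); the real storage backends are S3, a vector DB, and a relational DB as shown in the diagram.

```typescript
// Tier 3: immutable audit log. Append-only; entries are never edited.
class AuditLog {
  private receipts: string[] = [];
  append(receipt: string): void { this.receipts.push(receipt); }
  all(): readonly string[] { return this.receipts; }
}

// Tier 2: persistent shared state: current task context and frozen contracts.
class SharedState {
  constructor(
    public taskContext: string,
    public frozenContracts: string[],
  ) {}
}

// Tier 1: ephemeral working context, compiled fresh for each call.
// `budget` is a hypothetical cap on how many parts this call may include.
function compileWorkingContext(
  shared: SharedState,
  audit: AuditLog,
  budget: number,
): string[] {
  const parts = [shared.taskContext, ...shared.frozenContracts];
  // Pull only the most recent receipts that fit within the budget.
  const recent = audit.all().slice(-Math.max(0, budget - parts.length));
  return [...parts, ...recent];
}
```

The key property: Tier 1 is derived, never stored. Every call recompiles from Tiers 2 and 3, so stale context can’t accumulate in the window.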

The Multi-Agent Identity Fix

Wrong: Pass Agent A’s conversation to Agent B as history. Right: Transform Agent A’s outputs into context FOR Agent B.
// Wrong: Copy conversation
const engineerContext = scoutConversation

// Right: Transform to third-person context
const engineerContext = {
  role: "system",
  content: `
    Context from Scout Agent (separate agent):

    - Files identified: ${scout.outputs.files}
    - Patterns detected: ${scout.outputs.patterns}

    You are the Engineer Agent. Use this context to implement.
  `
}
The difference:
  • Scout’s outputs become Engineer’s context, not history
  • Clear attribution: “Scout found…” not “I found…”
  • No identity confusion
Results:
  • Before: 39% identity confusion
  • After: 2% identity confusion
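The handoff transform fits the compiler model as one more named pass. This sketch assumes a `ScoutOutputs` shape and attribution wording for illustration; it is not Squad’s production code.

```typescript
// Assumed shape of the Scout agent's structured outputs.
type ScoutOutputs = { files: string[]; patterns: string[] };

// Transform Scout's outputs into third-person context for the Engineer.
// Note: Scout's raw conversation turns are never passed along.
function compileHandoff(scout: ScoutOutputs): { role: "system"; content: string } {
  return {
    role: "system",
    content: [
      "Context from Scout Agent (separate agent):",
      "",
      `- Files identified: ${scout.files.join(", ")}`,
      `- Patterns detected: ${scout.patterns.join(", ")}`,
      "",
      "You are the Engineer Agent. Use this context to implement.",
    ].join("\n"),
  };
}
```

Keeping the transform in a single function makes the attribution rule (“Scout found…”, never “I found…”) testable rather than hoped-for.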

Evidence: Before vs After

Before Context Engineering (Month 1-2)

Metric                | Value
----------------------|------------
Average context size  | 180K tokens
Context relevance     | 34%
Identity confusion    | 39%
Task success rate     | 61%
Cost per task         | $2.40

After Context Engineering (Month 4+)

Metric                | Value       | Change
----------------------|-------------|-------
Average context size  | 48K tokens  | -73%
Context relevance     | 91%         | +168%
Identity confusion    | 2%          | -95%
Task success rate     | 94%         | +54%
Cost per task         | $0.77       | -68%

The Reach, Don’t Flood Principle

Google’s ADK: “Agents should reach for information via tools, not get flooded with everything upfront.”
// Bad: Flood agent with all possible context
const context = {
  allFiles: await readAllFiles(),           // 50,000 tokens
  allTests: await getAllTestResults(),       // 10,000 tokens
  allDocs: await getAllDocumentation(),      // 30,000 tokens
  // Total: 90,000+ tokens (most irrelevant)
}

// Good: Minimal default + tools to reach for more
const context = {
  task: currentTask,                          // 500 tokens
  contracts: frozenContracts,                 // 300 tokens
  recentContext: last3Turns,                  // 800 tokens
  // Total: 1,600 tokens
}

const tools = [
  readFile,      // Agent reaches for specific files
  runTests,      // Agent reaches for test results
  searchDocs,    // Agent reaches for relevant docs
]
Results:
  • 75% token reduction
  • +3.2 tool calls per task (agents reaching for what they need)
  • 94% task success rate
  • 68% cost reduction
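The loop that makes “reaching” work can be sketched as follows. Everything here is a hedged illustration: `callModel` is a stub standing in for an LLM deciding which tool to call, and the `Tool`/`ToolCall` shapes are assumptions, not ADK’s API.

```typescript
type Tool = { name: string; run: (arg: string) => string };

// The model either reaches for a tool or declares itself done.
type ToolCall = { tool: string; arg: string } | { done: string };

// Stub model: in production this is an LLM choosing what to reach for.
function callModel(context: string[], step: number): ToolCall {
  return step === 0 ? { tool: "readFile", arg: "src/index.ts" } : { done: "ok" };
}

function runAgent(
  minimalContext: string[],
  tools: Tool[],
): { answer: string; toolCalls: number } {
  const context = [...minimalContext];
  let toolCalls = 0;
  for (let step = 0; step < 10; step++) {
    const decision = callModel(context, step);
    if ("done" in decision) return { answer: decision.done, toolCalls };
    const tool = tools.find((t) => t.name === decision.tool);
    if (!tool) break;
    context.push(tool.run(decision.arg)); // only reached-for context is added
    toolCalls++;
  }
  return { answer: "", toolCalls };
}
```

The design choice: context grows only when the agent asks for something, which is exactly why average tool calls went up (+3.2 per task) while total tokens went down.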

This is a scaffold post. Full content will include:
  • Complete compilation pipeline code
  • Processor architecture details
  • Google ADK comparison table
  • Stanford ACE paper insights
  • Meta MSL validation
  • Practical implementation guide

The Meta-Point

Four independent teams arrived at the same architecture:
Principle                | Google ADK                  | ACE Paper                        | Meta MSL                     | Squad
-------------------------|-----------------------------|----------------------------------|------------------------------|-----------------------
Storage ≠ Presentation   | Sessions vs Working Context | Playbooks vs Delta Updates       | Environments vs Evaluations  | Tiers vs Compiled View
Explicit Transformations | LLM Flows + Processors      | Generator → Reflector → Curator  | Verifier Pipeline            | Named Processor Chain
Scope by Default         | Tools reach for more        | Incremental updates              | Constraint-scoped            | Protocol-based access
This isn’t coincidence. This is convergent evolution toward correct architecture.