swarm logs

List recent structured log files and get tips for exploring them.

Usage

swarm logs

Output:

Recent log files:

  swarm-2026-03-19T12-00-00-000Z.jsonl  (12.3 KB)
  swarm-2026-03-18T15-30-00-000Z.jsonl  (8.1 KB)

Latest: /tmp/copilot-swarm/swarm-2026-03-19T12-00-00-000Z.jsonl

Tip: Use 'jq' to explore structured logs:
  cat "/tmp/copilot-swarm/swarm-2026-03-19T12-00-00-000Z.jsonl" | jq .
  cat "/tmp/copilot-swarm/swarm-2026-03-19T12-00-00-000Z.jsonl" | jq 'select(.level == "error")'

Log Format

Every run writes a JSON Lines (.jsonl) log file. Each line is a self-contained JSON object:

{
  "ts": "2026-03-19T12:00:00.000Z",
  "level": "error",
  "msg": "Error calling engineer",
  "ctx": {
    "agent": "engineer",
    "model": "gpt-4.1",
    "attempt": 1,
    "maxAttempts": 2,
    "sessionId": "abc-123"
  },
  "error": {
    "name": "Error",
    "message": "rate limit exceeded",
    "stack": "Error: rate limit...\n  at ...",
    "category": "transient",
    "retryable": true
  }
}
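
The object above is pretty-printed for readability; in the file, each record occupies a single line. As a minimal sketch of reading the format without jq, the Python below parses records line by line and keeps the errors from one agent. The helper names (`parse_log_lines`, `errors_for_agent`) and the sample records are illustrative assumptions, not part of Copilot Swarm.

```python
import json
from typing import Iterable, Iterator


def parse_log_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def errors_for_agent(entries: Iterable[dict], agent: str) -> list[dict]:
    """Keep error-level entries whose ctx.agent matches the given agent."""
    return [
        e for e in entries
        if e.get("level") == "error" and e.get("ctx", {}).get("agent") == agent
    ]


# Sample records shaped like the example above (values are illustrative).
sample = [
    '{"ts": "2026-03-19T12:00:00.000Z", "level": "info", "msg": "Starting", '
    '"ctx": {"agent": "engineer"}}',
    '{"ts": "2026-03-19T12:00:01.000Z", "level": "error", '
    '"msg": "Error calling engineer", "ctx": {"agent": "engineer"}, '
    '"error": {"category": "transient", "retryable": true}}',
]

hits = errors_for_agent(parse_log_lines(sample), "engineer")
print(len(hits), hits[0]["error"]["category"])  # prints: 1 transient
```

To read a real file, pass an open file handle instead of the sample list, for example `parse_log_lines(open(path, encoding="utf-8"))`.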

Useful Queries

# All errors
cat <logfile> | jq 'select(.level == "error")'

# Errors from a specific agent
cat <logfile> | jq 'select(.level == "error" and .ctx.agent == "engineer")'

# Only transient (retryable) errors
cat <logfile> | jq 'select(.error.retryable == true)'

# All entries for a specific phase
cat <logfile> | jq 'select(.ctx.phase == "implement")'

Log Level

Control what gets logged via --log-level or the LOG_LEVEL environment variable:

Level	What’s logged
`error`	Errors only
`warn`	Errors + warnings
`info`	Errors + warnings + informational messages (default)
`debug`	Everything including SDK events, tool calls, intents
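
The table describes a threshold: each level includes everything more severe than itself. A minimal sketch of that rule follows, assuming the four levels are ordered exactly as listed; the function name `should_log` is hypothetical and does not reflect how Copilot Swarm implements the check.

```python
# Severity order taken from the table above: lower number = more severe.
LEVELS = {"error": 0, "warn": 1, "info": 2, "debug": 3}


def should_log(entry_level: str, configured_level: str = "info") -> bool:
    """Return True if an entry at entry_level passes the configured threshold."""
    return LEVELS[entry_level] <= LEVELS[configured_level]


print(should_log("warn"))            # True: info (the default) includes warnings
print(should_log("debug"))           # False: debug entries need the debug level
print(should_log("error", "error"))  # True: errors are always logged
```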

Log Rotation

Log files are automatically pruned on each run:

Files older than 7 days are removed
If more than 20 files exist, the oldest are removed
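
The two pruning rules can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: it assumes age is measured from a per-file timestamp, that the age rule runs before the count rule, and that the count rule keeps the 20 newest files. The name `files_to_prune` is hypothetical.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=7)  # from the rule above
MAX_FILES = 20               # from the rule above


def files_to_prune(files: dict[str, datetime], now: datetime) -> list[str]:
    """Return names to delete: files older than MAX_AGE, then the oldest beyond MAX_FILES."""
    expired = [name for name, mtime in files.items() if now - mtime > MAX_AGE]
    kept = sorted(
        (name for name in files if name not in expired),
        key=lambda name: files[name],
        reverse=True,  # newest first
    )
    overflow = kept[MAX_FILES:]  # assumption: the 20 newest survive
    return sorted(expired + overflow)


now = datetime(2026, 3, 19, 12, 0, 0)
files = {f"new-{i:02d}": now - timedelta(hours=i) for i in range(25)}
files["old-a"] = now - timedelta(days=8)
files["old-b"] = now - timedelta(days=30)
print(files_to_prune(files, now))
```

With 25 recent files and 2 expired ones, the sketch selects both expired files plus the 5 oldest recent files, leaving 20.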

Error Recovery

When errors occur, Copilot Swarm applies a layered recovery strategy:

Layer 1 — Smart Retries

Errors are classified and handled based on type:

Error Type	Category	Behavior
Rate limit, timeout, network	Transient	Retry with exponential backoff (1s → 2s → 4s)
Auth (401/403)	Permanent	Fail immediately with guidance
Context length exceeded	Permanent	Route to context reduction (Layer 2)
Bad request	Permanent	Fail immediately
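
As a rough sketch of the behavior in the table, the code below retries transient failures with doubling delays and lets permanent failures propagate immediately. The exception classes and function names are hypothetical stand-ins, and the retry count of 3 is an assumption inferred from the 1s → 2s → 4s sequence.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for rate-limit, timeout, and network failures."""


class PermanentError(Exception):
    """Hypothetical stand-in for auth and bad-request failures."""


def backoff_delays(retries: int, base: float = 1.0) -> list[float]:
    """Delays that double each time: 1s, 2s, 4s for retries=3."""
    return [base * 2**i for i in range(retries)]


def call_with_retries(
    fn: Callable[[], T],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures with exponential backoff; permanent ones propagate."""
    for delay in backoff_delays(retries):
        try:
            return fn()
        except TransientError:
            sleep(delay)  # wait before the next attempt
    return fn()  # final attempt; any error reaches the caller


print(backoff_delays(3))  # [1.0, 2.0, 4.0]
```

The `sleep` parameter is injectable so the retry loop can be exercised in tests without waiting.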

Layer 2 — Pre-flight Context Reduction

Before each AI call, the system estimates total tokens. When a prompt exceeds the budget:

Small overages (under 5%): Smart truncation keeps 60% from the start (task context, role) and 30% from the end (specific instructions), removing only the middle bulk.

Larger overages: The fast model summarizes the excess content while preserving critical details such as file paths, code snippets, requirements, decisions, and constraints. If summarization fails, the system falls back to smart truncation. Extremely large prompts are summarized in chunks.
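
The choice between the two strategies and the truncation itself might look like the sketch below. Several details are assumptions: the overage is measured relative to the budget, the 60% and 30% shares are taken from the budget rather than from the prompt, tokens are modeled as a plain list of strings, and a marker replaces the removed middle. The function names are hypothetical.

```python
def choose_strategy(estimated_tokens: int, budget: int) -> str:
    """Pick a reduction strategy from the size of the overage."""
    if estimated_tokens <= budget:
        return "none"
    overage = (estimated_tokens - budget) / budget
    return "truncate" if overage < 0.05 else "summarize"


def smart_truncate(
    tokens: list[str],
    budget: int,
    head_pct: int = 60,
    tail_pct: int = 30,
    marker: str = "[...]",
) -> list[str]:
    """Keep the start and end of an over-budget prompt and drop the middle."""
    if len(tokens) <= budget:
        return tokens
    head = budget * head_pct // 100  # task context, role
    tail = budget * tail_pct // 100  # specific instructions
    return tokens[:head] + [marker] + tokens[len(tokens) - tail:]


tokens = [f"t{i}" for i in range(104)]
print(choose_strategy(len(tokens), 100))  # truncate
print(len(smart_truncate(tokens, 100)))   # 91
```

Under these assumptions the truncated prompt lands below the budget (60 + 30 tokens plus the marker), which leaves headroom for estimation error.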

Layer 3 — AI Recovery Agent

If deterministic reduction isn’t enough, a fast-model AI agent analyzes the prompt breakdown and returns structured recovery instructions — deciding what to trim, drop, or summarize.

Layer 4 — Actionable Messages

Error messages include specific guidance for common failures:

Auth errors → “Try gh auth login or check your GITHUB_TOKEN”
Rate limits → “Wait a few minutes and retry, or reduce parallel streams”
Context length → Shows exact token overage and suggestions