Agent Tracing: A Practical Schema for Tool-Using AI

What would you need to know to replay the agent’s mistake?

The tempting answer is to log the final answer and perhaps the tool response that came before it. That record is not useless, but it is too thin to debug from: it shows what the agent produced, not what it saw or why it acted.

[Illustration: agent session state, turn logs, checkpoints, and approval paths.]

Direct answer

Agent tracing records the ordered events inside an agent run: input, selected context, model calls, tool calls, policy decisions, costs, evaluations, approvals, artifacts, and final state. A good trace makes the failure replayable without reading an entire chat transcript.

When this matters

  • A single user request can produce many model and tool events.
  • Engineering, security, and support need the same evidence trail.
  • You want monitoring alerts to point to a debuggable trace, not a log pile.

Failure modes to catch

  • Trace ids are not stable across retries.
  • Tool outputs are logged without the permission mode that allowed them.
  • Costs are aggregated daily instead of attached to each turn.
  • Approvals happen in Slack or email and never rejoin the trace.
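
The first two failure modes can be closed off at the point where events are created. The following is a minimal sketch, not a prescribed implementation: it assumes each user request carries a caller-supplied request_id, and the names trace_id_for, tool_output_event, attempt, and permission_mode are hypothetical. The trace id is derived from the request id, so a retry lands in the same trace, and a tool output cannot be recorded without the permission mode that allowed it.

```python
import uuid

# Hypothetical fixed namespace for this sketch; any constant UUID would do.
TRACE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/agent-traces")


def trace_id_for(request_id: str) -> str:
    """Derive the trace id from the request id so every retry maps to the same trace."""
    return "trace_" + uuid.uuid5(TRACE_NAMESPACE, request_id).hex


def tool_output_event(request_id: str, attempt: int, tool: str,
                      permission_mode: str, output_ref: str) -> dict:
    """Build a tool-output event; the permission mode is required, not optional."""
    if not permission_mode:
        raise ValueError("permission_mode must be recorded with every tool output")
    return {
        "trace_id": trace_id_for(request_id),
        "attempt": attempt,  # retries differ here, not in trace_id
        "event_type": "tool_output",
        "tool": tool,
        "permission_mode": permission_mode,
        "output_ref": output_ref,
    }
```

Keeping the attempt number as its own field means a reviewer can still tell retries apart without splitting one request across several traces.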

Agent trace event schema

Gate     | Signal                                    | Action
---------+-------------------------------------------+----------------------------
Identity | trace_id, turn_id, user_id, agent_id      | Create before model call
Context  | system hash, memory keys, source ids      | Store enough to reconstruct
Step     | model, prompt hash, tool, result          | Append in order
Control  | policy, eval, approval, stop condition    | Record decision reason
Outcome  | artifact, external mutation, verification | End with named state

Example event:

{
  "trace_id": "trace_01",
  "turn_id": "turn_01",
  "event_type": "tool_call",
  "sequence": 4,
  "agent": "research_agent",
  "tool": {"name": "fetch_url", "risk": "read_only"},
  "policy": {"decision": "allow", "rule": "source_fetch"},
  "input_hash": "sha256:...",
  "output_ref": "artifact://trace_01/source_04",
  "cost": {"input_tokens": 1200, "output_tokens": 220},
  "timestamp": "2026-06-16T18:37:35Z"
}
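
One way to produce events shaped like the one above is a small append-only log that assigns the sequence number, input hash, and timestamp itself, so callers cannot get them wrong. This is a sketch under assumptions: TraceLog and input_hash are hypothetical names, sequence numbers are assumed to start at 1 within a turn, and the hash is taken over a canonical JSON form of the input.

```python
import hashlib
import json
from datetime import datetime, timezone


def input_hash(payload: dict) -> str:
    """Hash a canonical JSON form of the input so equal inputs give equal hashes."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TraceLog:
    """Append-only list of events for one turn; sequence numbers start at 1."""

    def __init__(self, trace_id: str, turn_id: str, agent: str):
        self.trace_id = trace_id
        self.turn_id = turn_id
        self.agent = agent
        self.events = []

    def append(self, event_type: str, payload: dict, **fields) -> dict:
        event = {
            **fields,  # caller fields first, so the fixed keys below always win
            "trace_id": self.trace_id,
            "turn_id": self.turn_id,
            "event_type": event_type,
            "sequence": len(self.events) + 1,
            "agent": self.agent,
            "input_hash": input_hash(payload),
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
        self.events.append(event)
        return event


log = TraceLog("trace_01", "turn_01", "research_agent")
log.append(
    "tool_call",
    {"url": "https://example.com"},
    tool={"name": "fetch_url", "risk": "read_only"},
    policy={"decision": "allow", "rule": "source_fetch"},
)
```

Storing a hash rather than the raw input keeps the event small; the raw input would live wherever output_ref-style artifact references point.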

Running example

A deployment agent publishes the wrong branch. The trace should show the requested branch, resolved branch, shell command, approval packet, deploy result, and post-deploy verification. If one field is missing, the incident review becomes guesswork.
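
The review of that incident can be partly mechanized. The sketch below assumes each event carries a "sequence" number and a "fields" dictionary, and the six field names are hypothetical labels for the items listed above. It reports which items the trace never recorded and whether the resolved branch differs from the requested one.

```python
REQUIRED_FIELDS = (
    "requested_branch",
    "resolved_branch",
    "shell_command",
    "approval_packet",
    "deploy_result",
    "post_deploy_verification",
)


def review_deploy_trace(events: list) -> dict:
    """Summarize a deployment trace for incident review."""
    merged = {}
    # Replay events in recorded order; later events override earlier values.
    for event in sorted(events, key=lambda e: e["sequence"]):
        merged.update(event.get("fields", {}))

    missing = [name for name in REQUIRED_FIELDS if name not in merged]
    requested = merged.get("requested_branch")
    resolved = merged.get("resolved_branch")
    branch_mismatch = (
        requested is not None and resolved is not None and requested != resolved
    )
    return {"missing": missing, "branch_mismatch": branch_mismatch}
```

A non-empty "missing" list is the guesswork case: the review cannot proceed from the trace alone.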

Put it to work

Use the agent trace event schema above as the first version of your production gate. Replace the placeholders with your own agent names, tools, risk classes, thresholds, and approval rules. Then wire it into traces, monitoring, security review, evaluation, and human approval so it changes runtime behavior instead of sitting in a doc.

Frequently Asked Questions

What is agent tracing?

Agent tracing is the structured event log for an agent run. It records the sequence of context, model, tool, policy, cost, eval, approval, and final-action events.

How is LLM tracing different?

LLM tracing usually centers on model spans. Agent tracing includes model spans but also tracks tools, permissions, workflow state, artifacts, and human decisions.

What is the minimum useful trace?

A minimum useful trace has stable ids, ordered events, selected context, tool inputs and outputs, policy decisions, token cost, eval result, approval state, and final outcome.
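
That checklist can be turned into an automated check. The following is a sketch, assuming events shaped like the example event earlier in this article; the event type names ("context", "eval", "approval", "final_outcome") and the function name check_minimum_trace are hypothetical, and sequence numbers are assumed to run contiguously from 1.

```python
REQUIRED_EVENT_TYPES = {"context", "tool_call", "eval", "approval", "final_outcome"}
TOOL_CALL_KEYS = {"input_hash", "output_ref", "policy", "cost"}


def check_minimum_trace(events: list) -> list:
    """Return a list of problems; an empty list means the minimum is met."""
    problems = []

    # Stable ids: every event must carry the same trace_id.
    if len({e.get("trace_id") for e in events}) != 1:
        problems.append("trace_id is missing or not stable across events")

    # Ordered events: sequence numbers must be 1, 2, 3, ... with no gaps.
    if [e.get("sequence") for e in events] != list(range(1, len(events) + 1)):
        problems.append("sequence numbers are not contiguous from 1")

    # Coverage: each required kind of event appears at least once.
    seen = {e.get("event_type") for e in events}
    for name in sorted(REQUIRED_EVENT_TYPES - seen):
        problems.append(f"no {name} event")

    # Tool calls must carry input, output, policy decision, and cost.
    for e in events:
        if e.get("event_type") == "tool_call":
            for key in sorted(TOOL_CALL_KEYS - set(e)):
                problems.append(f"tool_call {e.get('sequence')} lacks {key}")

    return problems
```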

The Takeaway

A trace is useful when it can answer the incident-review question: what did the agent see, decide, do, and verify?
