tracing: OpenTelemetry spans for the agent loop (Eve borrow) #358

Description

@erain

Motivation

Research into Vercel's Eve agent framework (introduced June 2026) flagged first-class OpenTelemetry tracing as the cleanest borrow: Eve emits a span per model call and per tool invocation, exportable to Jaeger/Honeycomb/Datadog. glue has zero observability today (grep -ri otel → 0 files), yet it already has the perfect substrate: the loop emits a structured Event stream (EventLoopStart/End, EventTurnStart/End, EventMessageStart/End, EventToolStart/End, EventError), and Session.Subscribe(func(Event)) is a public, additive attach point.

Scope

  • New dependency-light tracing/ package:
    • Recorder that translates the loop event stream into OTel spans with correct nesting: glue.loop (root) → agent.turn → llm.generate / tool.<name>. One trace per prompt (reset on each EventLoopStart).
    • GenAI-style span attributes (model, provider, stop_reason, token usage from Message.Usage, tool is_error).
    • TracerProvider setup honoring standard OTEL_* env vars (OTLP endpoint, service name); no-op when unconfigured so there is zero overhead by default.
  • Wire into cmd/glue via Session.Subscribe(recorder.Handle); init + graceful flush in main.
  • Keep the loop and glue core packages free of OTel imports — all otel deps live in tracing/ and cmd/glue.
  • Tests (event→span translation with a recording exporter), docs, and an ADR.
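
The nesting rule in the Recorder bullet can be sketched with a plain stack of open spans. This is a minimal, dependency-free illustration and not the proposed implementation: Event, EventKind, and the ToolName field are hypothetical stand-ins for glue's real event types, Span is a toy record standing in for an OTel span, and mapping EventMessageStart/End to llm.generate is an assumption. It also assumes tool calls within a turn run sequentially; parallel tool calls would need spans keyed by a call ID instead of a stack.

```go
package main

import (
	"fmt"
	"strings"
)

// EventKind and Event are hypothetical stand-ins for glue's loop events;
// the real types and field names may differ.
type EventKind int

const (
	EventLoopStart EventKind = iota
	EventLoopEnd
	EventTurnStart
	EventTurnEnd
	EventMessageStart
	EventMessageEnd
	EventToolStart
	EventToolEnd
	EventError
)

type Event struct {
	Kind     EventKind
	ToolName string // assumed to be set on tool events
}

// Span is a toy record standing in for an OTel span.
type Span struct {
	Name   string
	Parent *Span
	Ended  bool
	Err    bool
}

// Recorder turns the event stream into a tree of spans.
type Recorder struct {
	stack []*Span // currently open spans, innermost last
	Spans []*Span // every span of the current trace, in start order
}

// start opens a span as a child of the innermost open span, if any.
func (r *Recorder) start(name string) {
	s := &Span{Name: name}
	if n := len(r.stack); n > 0 {
		s.Parent = r.stack[n-1]
	}
	r.stack = append(r.stack, s)
	r.Spans = append(r.Spans, s)
}

// end closes the innermost open span; it is a no-op on an empty stack.
func (r *Recorder) end() {
	if n := len(r.stack); n > 0 {
		r.stack[n-1].Ended = true
		r.stack = r.stack[:n-1]
	}
}

// Handle has the func(Event) shape that Session.Subscribe accepts.
func (r *Recorder) Handle(ev Event) {
	switch ev.Kind {
	case EventLoopStart:
		// One trace per prompt: drop any state left from the previous loop.
		r.stack, r.Spans = nil, nil
		r.start("glue.loop")
	case EventTurnStart:
		r.start("agent.turn")
	case EventMessageStart:
		r.start("llm.generate") // assumed mapping
	case EventToolStart:
		r.start("tool." + ev.ToolName)
	case EventLoopEnd, EventTurnEnd, EventMessageEnd, EventToolEnd:
		r.end()
	case EventError:
		// Mark the innermost open span as failed.
		if n := len(r.stack); n > 0 {
			r.stack[n-1].Err = true
		}
	}
}

// Path renders a span's ancestry from the root down, for inspection.
func Path(s *Span) string {
	var parts []string
	for ; s != nil; s = s.Parent {
		parts = append([]string{s.Name}, parts...)
	}
	return strings.Join(parts, " > ")
}

func main() {
	rec := &Recorder{}
	events := []Event{
		{Kind: EventLoopStart},
		{Kind: EventTurnStart},
		{Kind: EventMessageStart},
		{Kind: EventMessageEnd},
		{Kind: EventToolStart, ToolName: "read_file"},
		{Kind: EventToolEnd},
		{Kind: EventTurnEnd},
		{Kind: EventLoopEnd},
	}
	for _, ev := range events {
		rec.Handle(ev)
	}
	for _, s := range rec.Spans {
		fmt.Println(Path(s))
	}
}
```

In a real tracing/ package the start and end helpers would presumably wrap an OTel tracer's span start and span end calls, carrying the parent through a context rather than a Parent pointer, and the wiring would stay a single Session.Subscribe(recorder.Handle) call as the scope describes.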

Non-goals: provider HTTP-level span propagation (future), sandbox, durable checkpointing (separate item).

Part of the Eve-borrow research; sibling issue covers evals-as-gates. Tracker #110.
