feat(otel): end-to-end tracing — Go CLI propagation + platform log correlation (DSPX-3635) #3722

Draft
dmihalcik-virtru wants to merge 2 commits into main from DSPX-3635-otelo

@dmihalcik-virtru dmihalcik-virtru commented Jul 3, 2026

DSPX-3635 — End-to-end OpenTelemetry tracing (platform + Go CLI)

Part of the DSPX-3635 effort to thread a single distributed trace from the xtest
pytest harness → SDK CLI → platform/KAS, and to correlate logs with traces.

Companion PR: opentdf/tests#549 (pytest plugin + otdf-local Jaeger wiring)

What's here

  • otdfctl (Go CLI):
    • New pkg/tracing — OTLP/gRPC tracer init that is a strict no-op unless OTEL_EXPORTER_OTLP_ENDPOINT is set (no startup cost/failure for normal users).
    • Root span per command in cmd/root.go, continuing an inbound TRACEPARENT from the environment.
    • otelconnect client interceptor via sdk.WithExtraClientOptions, so platform/KAS calls carry the trace.
    • Context threaded so the KAS Rewrap actually joins the trace: encrypt uses CreateTDFContext, decrypt calls Reader.Init(ctx) before io.Copy (the SDK otherwise unwraps with context.Background()).
    • cli.OnExit hook flushes spans on os.Exit paths (defers don't run there — and failing decrypts are exactly when the trace is wanted).
  • service/logger: ContextHandler now stamps trace_id/span_id from the active span onto every log record → log↔trace correlation.
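The root-span bullet above relies on an inbound TRACEPARENT value. As a rough illustration of what that value carries, here is a stdlib-only sketch that parses the W3C Trace Context `traceparent` format (`version-traceid-parentid-flags`, version `00` only). The PR itself presumably uses the OpenTelemetry propagator rather than hand-rolled parsing; `traceParent`, `parseTraceparent`, and `isLowerHex` are hypothetical names used only for this example.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"strings"
)

// traceParent holds the fields of a W3C traceparent value.
// Hypothetical type for illustration; not part of otdfctl.
type traceParent struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters (the parent span)
	Sampled bool   // lowest bit of the flags byte
}

var errBadTraceparent = errors.New("invalid traceparent")

// parseTraceparent accepts only version-00 values of the form
// "00-<trace-id>-<parent-id>-<flags>".
func parseTraceparent(v string) (traceParent, error) {
	parts := strings.Split(strings.TrimSpace(v), "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, errBadTraceparent
	}
	traceID, spanID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(spanID, 16) || !isLowerHex(flags, 2) {
		return traceParent{}, errBadTraceparent
	}
	// All-zero trace and span IDs are invalid per the specification.
	if strings.Trim(traceID, "0") == "" || strings.Trim(spanID, "0") == "" {
		return traceParent{}, errBadTraceparent
	}
	b, err := hex.DecodeString(flags)
	if err != nil {
		return traceParent{}, errBadTraceparent
	}
	return traceParent{TraceID: traceID, SpanID: spanID, Sampled: b[0]&0x01 == 1}, nil
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

func main() {
	tp, err := parseTraceparent(os.Getenv("TRACEPARENT"))
	if err != nil {
		fmt.Println("no valid TRACEPARENT; a new trace would start here")
		return
	}
	fmt.Printf("continuing trace %s under parent span %s (sampled=%t)\n",
		tp.TraceID, tp.SpanID, tp.Sampled)
}
```

Running it with `TRACEPARENT=00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01` set in the environment exercises the "continue" path; without the variable it falls back to the "new trace" path.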

Testing

  • go build ./... clean (otdfctl + service); go test passing for logger, otdfctl/pkg/{tracing,handlers,cli}; new unit tests added.
  • golangci-lint clean on changed packages.
  • Full live e2e (Jaeger + platform + 6 KAS + pytest) is exercised via the companion tests PR.

Out of scope (follow-ups)

  • Java SDK (java-sdk/cmdline) and JS Node CLI (web-sdk/cli) propagation.
  • Richer KAS rewrap spans, metrics, browser tracing.

Draft: opening for review of the pattern before extending to Java/JS.

Thread end-to-end OpenTelemetry tracing through the Go CLI and add
trace/log correlation on the platform, per DSPX-3635.

- otdfctl: new pkg/tracing (no-op unless OTEL_EXPORTER_OTLP_ENDPOINT is set),
  a root span per command that continues TRACEPARENT from the environment, and
  an otelconnect client interceptor via sdk.WithExtraClientOptions so
  platform/KAS calls carry the trace. Context is threaded into CreateTDFContext
  (encrypt) and Reader.Init (decrypt) so the KAS Rewrap joins the trace; a
  cli.OnExit hook flushes spans on os.Exit paths.
- service/logger: ContextHandler stamps trace_id/span_id from the active span
  onto every log record.

Signed-off-by: Dave Mihalcik <[email protected]>
@github-actions github-actions Bot added the size/m label Jul 3, 2026

coderabbitai Bot commented Jul 3, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

github-actions Bot commented Jul 3, 2026

⚠️ Govulncheck found vulnerabilities ⚠️

The following modules have known vulnerabilities:

  • examples
  • otdfctl
  • sdk
  • service
  • lib/fixtures
  • tests-bdd

See the workflow run for details.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements end-to-end OpenTelemetry tracing for the Go CLI, allowing distributed traces to be threaded from the test harness through the CLI and into the platform and KAS services. By injecting trace and span IDs into log records and ensuring proper context propagation, this change significantly enhances observability and simplifies the debugging of distributed requests across the platform.
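To make the log-to-trace correlation concrete, the following is a minimal sketch of a `log/slog` handler wrapper that stamps `trace_id` and `span_id` onto each record. It is not the PR's ContextHandler: the real handler reads the active span through the OpenTelemetry API, whereas this stdlib-only version reads a hypothetical `spanIDs` value from the context. `spanIDs`, `withSpan`, `correlatingHandler`, and `render` are all names invented for the example.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
)

// spanIDs stands in for the span context the real handler would obtain
// from OpenTelemetry. Hypothetical type for illustration only.
type spanIDs struct{ TraceID, SpanID string }

type spanKey struct{}

// withSpan attaches the IDs to the context, mimicking an active span.
func withSpan(ctx context.Context, ids spanIDs) context.Context {
	return context.WithValue(ctx, spanKey{}, ids)
}

// correlatingHandler wraps another slog.Handler and adds trace_id/span_id
// whenever the context carries span IDs.
type correlatingHandler struct{ slog.Handler }

func (h correlatingHandler) Handle(ctx context.Context, r slog.Record) error {
	if ids, ok := ctx.Value(spanKey{}).(spanIDs); ok {
		r = r.Clone() // avoid sharing attribute storage with the caller's record
		r.AddAttrs(slog.String("trace_id", ids.TraceID), slog.String("span_id", ids.SpanID))
	}
	return h.Handler.Handle(ctx, r)
}

// WithAttrs and WithGroup re-wrap so derived loggers keep the stamping.
func (h correlatingHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return correlatingHandler{Handler: h.Handler.WithAttrs(attrs)}
}

func (h correlatingHandler) WithGroup(name string) slog.Handler {
	return correlatingHandler{Handler: h.Handler.WithGroup(name)}
}

// render logs one message and returns the formatted line. The timestamp is
// dropped so the output is stable; pass nil to log without span IDs.
func render(ids *spanIDs, msg string) string {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if a.Key == slog.TimeKey && len(groups) == 0 {
				return slog.Attr{}
			}
			return a
		},
	}
	logger := slog.New(correlatingHandler{Handler: slog.NewTextHandler(&buf, opts)})
	ctx := context.Background()
	if ids != nil {
		ctx = withSpan(ctx, *ids)
	}
	logger.InfoContext(ctx, msg)
	return buf.String()
}

func main() {
	ids := &spanIDs{TraceID: "4bf92f3577b34da6a3ce929d0e0e4736", SpanID: "00f067aa0ba902b7"}
	fmt.Print(render(ids, "rewrap"))
	fmt.Print(render(nil, "rewrap"))
}
```

The first line printed carries both IDs and can be matched against a trace in Jaeger; the second shows that records logged outside a span are left unchanged.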

Highlights

  • OpenTelemetry Integration: Added OTel tracing to the Go CLI (otdfctl) with OTLP/gRPC support, ensuring it remains a no-op unless explicitly configured.
  • Context Propagation: Implemented context propagation across CLI commands and SDK client calls to link CLI actions with platform and KAS traces.
  • Log Correlation: Updated the logger to automatically stamp log records with active trace and span IDs, enabling seamless log-to-trace correlation.
  • Graceful Shutdown: Introduced an OnExit hook to ensure that OpenTelemetry spans are correctly flushed even when the CLI exits via os.Exit.
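The exit-hook idea in the last highlight can be sketched with the standard library alone. Because `os.Exit` terminates the process without running deferred functions, any flush has to be invoked explicitly before the exit call. The code below is only one plausible shape for such a hook, not the PR's `cli.OnExit` implementation; `OnExit`, `Exit`, `exitHooks`, and `exitFunc` are stand-in names, and the hook body simply prints instead of flushing a real tracer provider.

```go
package main

import (
	"fmt"
	"os"
)

// exitFunc is a variable so the sketch can be exercised without actually
// terminating the process.
var exitFunc = os.Exit

// exitHooks holds callbacks to run before the process exits via Exit.
var exitHooks []func()

// OnExit registers fn to run before Exit terminates the process.
func OnExit(fn func()) {
	exitHooks = append(exitHooks, fn)
}

// Exit runs the registered hooks in reverse registration order (mirroring
// defer semantics), clears them, and then exits with the given code.
func Exit(code int) {
	for i := len(exitHooks) - 1; i >= 0; i-- {
		exitHooks[i]()
	}
	exitHooks = nil
	exitFunc(code)
}

func main() {
	// A real hook would call the tracer provider's shutdown or flush here.
	OnExit(func() { fmt.Println("flushing spans before exit") })

	fmt.Fprintln(os.Stderr, "simulated command failure")
	Exit(0) // a real failure path would pass a non-zero code
}
```

Routing every early-exit path through a single `Exit` helper keeps the flush in one place, which matters most on failure paths where the trace is the main debugging aid.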

A trace in the dark, Spans connect the system's heart, Logs now find their way.

github-actions Bot commented Jul 3, 2026

Benchmark authorization.GetDecisions Results:

| Metric | Value |
| --- | --- |
| Approved Decision Requests | 1000 |
| Denied Decision Requests | 0 |
| Total Time | 162.33405ms |

Benchmark authorization.v2.GetMultiResourceDecision Results:

| Metric | Value |
| --- | --- |
| Approved Decision Requests | 1000 |
| Denied Decision Requests | 0 |
| Total Time | 91.874701ms |

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

| Metric | Value |
| --- | --- |
| Total Decrypts | 100 |
| Successful Decrypts | 100 |
| Failed Decrypts | 0 |
| Total Time | 353.38144ms |
| Throughput | 282.98 requests/second |

TDF3 Benchmark Results:

| Metric | Value |
| --- | --- |
| Total Requests | 5000 |
| Successful Requests | 5000 |
| Failed Requests | 0 |
| Concurrent Requests | 50 |
| Total Time | 39.733359772s |
| Average Latency | 396.440269ms |
| Throughput | 125.84 requests/second |

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces end-to-end OpenTelemetry tracing support across the Go CLI (otdfctl), platform, and KAS, allowing trace propagation from client commands to downstream services and correlating logs with active traces. Feedback on the changes suggests filling or removing empty placeholder sections in the design specification document (spec/DSPX-3635.md) and logging a warning or debug message if OpenTelemetry initialization fails instead of silently ignoring the error.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread spec/DSPX-3635.md
Comment on lines +46 to +63
## Problem / Motivation
_Why does this work need to happen? What is the user/business pain?_

## Proposed Solution
_What will you build, at a functional level? Sketch the approach._

## Inputs / Outputs / Contracts
_Function signatures, data shapes, API contracts, CLI flags._

## Edge Cases & Constraints
_Boundary conditions, error states, performance limits, security considerations._

## Out of Scope
_What this work item explicitly does not cover._

## Acceptance Criteria
- [ ] _Clear, testable condition_
- [ ] _…_

medium

The specification file contains several empty placeholder sections (e.g., Problem / Motivation, Proposed Solution, Inputs / Outputs / Contracts, etc.) with template instructions. Since the summary section at the top already contains a lot of details, please either fill in these sections with the relevant details or remove them to keep the document clean and professional.

Comment on lines +41 to +46
exporter, err := otlptracegrpc.New(ctx, exporterOptions(endpoint)...)
if err != nil {
	// Tracing is best-effort: never block the CLI because a collector is
	// unreachable or misconfigured.
	return func() {}, false
}

medium

When OTEL_EXPORTER_OTLP_ENDPOINT is set but otlptracegrpc.New fails to initialize (e.g., due to an invalid endpoint format), the error is silently ignored. Since slog is already initialized before tracing.Init is called, it would be highly beneficial to log a debug or warning message here to aid in troubleshooting misconfigured tracing environments.
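One way the suggestion could look is sketched below, under stated assumptions. The exporter construction is abstracted behind a hypothetical `exporterFactory` so the example runs without a collector or the OpenTelemetry modules, and `initTracing` only mirrors the `(shutdown, enabled)` return shape visible in the snippet. The names `exporterFactory`, `initTracing`, and `errExporter` do not come from the PR, and whether the message belongs at debug or warning level is left to the author.

```go
package main

import (
	"errors"
	"log/slog"
	"os"
)

// exporterFactory stands in for the otlptracegrpc.New call in the diff.
// Hypothetical seam so the sketch needs no collector.
type exporterFactory func(endpoint string) (shutdown func(), err error)

// errExporter simulates an exporter that cannot be constructed.
var errExporter = errors.New("invalid endpoint")

// initTracing returns a shutdown function and whether tracing is enabled.
// It stays silent when no endpoint is configured, and logs through warn
// when an endpoint is set but the exporter cannot be created.
func initTracing(endpoint string, newExporter exporterFactory, warn func(msg string, args ...any)) (func(), bool) {
	if endpoint == "" {
		return func() {}, false // tracing not requested: strict no-op
	}
	shutdown, err := newExporter(endpoint)
	if err != nil {
		// Still best-effort: the CLI continues, but the failure is visible.
		warn("tracing disabled: exporter init failed", "endpoint", endpoint, "error", err)
		return func() {}, false
	}
	return shutdown, true
}

func main() {
	failing := func(string) (func(), error) { return nil, errExporter }

	shutdown, enabled := initTracing(os.Getenv("OTEL_EXPORTER_OTLP_ENDPOINT"), failing, slog.Warn)
	defer shutdown()

	slog.Info("tracing initialized", "enabled", enabled)
}
```

Passing the log function in as a parameter keeps the "no endpoint" path free of any logging, so users who never set the variable see no extra output.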
