fix(shared-runtime): guard shutdown() against Tokio TLS destruction #2169

Open: rachelyangdog wants to merge 1 commit into main from rachel.yang/fix-shared-runtime-tls-shutdown-panic

@rachelyangdog (Contributor)

During CPython interpreter finalization, thread-local storage is destroyed before atexit handlers fire. SharedRuntime::shutdown() calls runtime.block_on() which internally calls context::enter() to set up Tokio's CONTEXT thread-local. If that TLS slot is already destroyed, context::enter() panics with "The Tokio context thread-local variable has been destroyed", which PyO3 converts to a pyo3_runtime.PanicException. This causes a crash on every uWSGI worker shutdown when using ddtrace >=4.9.x.

Fix: before calling block_on(), check whether Handle::try_current() fails with an error for which is_thread_local_destroyed() returns true. If TLS is gone, return Ok(()) early; the OS will clean up remaining Tokio threads on process exit. This eliminates both the panic and the subsequent 60s hang/SIGKILL.
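The guard can be illustrated without Tokio. The following is a minimal, std-only sketch of the same pattern, not the code in this PR: `Context`, `CONTEXT`, `guarded_shutdown`, and `demo` are hypothetical names, and `LocalKey::try_with` stands in for the `Handle::try_current()` check. It relies on the fact that a thread-local is reported as destroyed while its own destructor is running, which imitates a shutdown call that arrives after TLS teardown.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Set when the guard takes the early-return path inside a TLS destructor.
static SKIPPED_IN_DESTRUCTOR: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in for the runtime's thread-local context.
struct Context;

impl Drop for Context {
    fn drop(&mut self) {
        // While this destructor runs, std already reports the slot as
        // destroyed, imitating a shutdown() call made after TLS teardown.
        if guarded_shutdown() == Outcome::Skipped {
            SKIPPED_IN_DESTRUCTOR.store(true, Ordering::SeqCst);
        }
    }
}

thread_local! {
    static CONTEXT: Context = Context;
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Outcome {
    Ran,
    Skipped,
}

/// Returns true if the thread-local slot can no longer be accessed.
fn tls_destroyed() -> bool {
    CONTEXT.try_with(|_| ()).is_err()
}

/// Assumed shape of the guarded shutdown: bail out early instead of panicking.
fn guarded_shutdown() -> Outcome {
    if tls_destroyed() {
        return Outcome::Skipped;
    }
    // Stand-in for runtime.block_on(...), which needs the thread-local.
    CONTEXT.with(|_| ());
    Outcome::Ran
}

/// Runs the guard once on a live thread and once from a TLS destructor.
fn demo() -> (Outcome, bool) {
    SKIPPED_IN_DESTRUCTOR.store(false, Ordering::SeqCst);
    let live = thread::spawn(guarded_shutdown)
        .join()
        .expect("worker thread panicked");
    (live, SKIPPED_IN_DESTRUCTOR.load(Ordering::SeqCst))
}
```

Calling `demo()` should yield `Outcome::Ran` for the call made while the thread is alive and `true` for the call made from the destructor, where an unguarded `CONTEXT.with(...)` would panic instead.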

Reproducer: uWSGI app with lazy-apps=true, ddtrace imported via uwsgi import=, 4 workers. SIGTERM triggers the panic on every worker.

…uring CPython finalization

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@rachelyangdog rachelyangdog requested a review from a team as a code owner June 26, 2026 16:14

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5b9a9050b

Comment on lines +242 to +246
        tokio::runtime::Handle::try_current(),
        Err(ref e) if e.is_thread_local_destroyed()
    ) {
        debug!("Tokio TLS destroyed during interpreter finalization, skipping shutdown");
        return Ok(());

P2: Take the runtime before skipping TLS-destroyed shutdown

In this branch, shutdown() returns success without taking self.runtime or clearing the registered workers. The condition only proves that the calling thread's Tokio TLS has been destroyed. That can also happen in thread-local destructors during ordinary thread teardown, or during embedded interpreter finalization while the process continues. In that case the runtime remains available after a "successful" shutdown, and its background workers can keep running until a later drop aborts them without calling Worker::shutdown(). Please mark the runtime as shut down or perform a non-TLS background shutdown before returning.
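One way to read this suggestion is sketched below. It is a simplified, std-only illustration under stated assumptions, not the crate's actual types: `FakeRuntime`, the `Shutdown` enum, the `tls_destroyed` parameter, and the `Mutex<Option<...>>` layout are all hypothetical. The point is only the ordering: the runtime is taken out of the shared slot first, so even the early-return path leaves the object marked as shut down.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the Tokio runtime held by SharedRuntime.
struct FakeRuntime;

#[derive(Debug, PartialEq)]
enum Shutdown {
    /// Workers were stopped through the normal blocking path.
    Graceful,
    /// TLS was gone, so the runtime was released without blocking.
    Background,
    /// A previous call already took the runtime.
    AlreadyShutDown,
}

struct SharedRuntime {
    runtime: Mutex<Option<FakeRuntime>>,
}

impl SharedRuntime {
    fn new() -> Self {
        SharedRuntime {
            runtime: Mutex::new(Some(FakeRuntime)),
        }
    }

    fn is_running(&self) -> bool {
        self.runtime.lock().unwrap().is_some()
    }

    /// `tls_destroyed` is passed in here for illustration; real code would
    /// derive it from a check on the calling thread's state.
    fn shutdown(&self, tls_destroyed: bool) -> Shutdown {
        // Take the runtime first, so every path below leaves the slot empty.
        let taken = self.runtime.lock().unwrap().take();
        let runtime = match taken {
            Some(rt) => rt,
            None => return Shutdown::AlreadyShutDown,
        };

        if tls_destroyed {
            // Stand-in for a shutdown that does not touch thread-locals.
            drop(runtime);
            return Shutdown::Background;
        }

        // Stand-in for block_on(worker shutdown) followed by releasing it.
        drop(runtime);
        Shutdown::Graceful
    }
}
```

With this ordering, a second `shutdown()` call after the TLS-destroyed path reports `AlreadyShutDown` instead of finding a live runtime, which is the state the review comment asks for.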

@dd-octo-sts Bot (Contributor) commented Jun 26, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.a | 84.02 MB | 84.02 MB | +0% (+5.22 KB) 👌 |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.so | 7.76 MB | 7.76 MB | +0% (+24 B) 👌 |

aarch64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.36 MB | 10.36 MB | +0% (+632 B) 👌 |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a | 95.13 MB | 95.13 MB | +0% (+4.92 KB) 👌 |

libdatadog-x64-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll | 24.93 MB | 24.93 MB | +0% (+1.50 KB) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib | 87.33 KB | 87.33 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb | 181.51 MB | 181.50 MB | -0% (-16.00 KB) 👌 |
| /libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib | 928.21 MB | 928.23 MB | +0% (+13.14 KB) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll | 8.12 MB | 8.12 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib | 87.33 KB | 87.33 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb | 24.03 MB | 24.03 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib | 47.96 MB | 47.97 MB | +.02% (+10.45 KB) 🔍 |

libdatadog-x86-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll | 21.62 MB | 21.62 MB | +0% (+1.50 KB) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib | 88.71 KB | 88.71 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb | 185.58 MB | 185.58 MB | +0% (+8.00 KB) 👌 |
| /libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib | 921.15 MB | 921.17 MB | +0% (+12.74 KB) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll | 6.27 MB | 6.27 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib | 88.71 KB | 88.71 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb | 25.76 MB | 25.76 MB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib | 45.59 MB | 45.60 MB | +.02% (+10.25 KB) 🔍 |

x86_64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.a | 74.91 MB | 74.92 MB | +0% (+4.61 KB) 👌 |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.so | 8.61 MB | 8.61 MB | +0% (+32 B) 👌 |

x86_64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a | 90.33 MB | 90.34 MB | +0% (+4.61 KB) 👌 |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.48 MB | 10.48 MB | +0% (+504 B) 👌 |
