fix(deploy): back geckodriver stdout log with a regular file, not a FIFO (no SIGABRT on teardown)#1209
Draft
vringar wants to merge 1 commit into
Draft
fix(deploy): back geckodriver stdout log with a regular file, not a FIFO (no SIGABRT on teardown)#1209vringar wants to merge 1 commit into
vringar wants to merge 1 commit into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1209 +/- ##
==========================================
+ Coverage 62.12% 62.34% +0.21%
==========================================
Files 40 40
Lines 3918 3909 -9
==========================================
+ Hits 2434 2437 +3
+ Misses 1484 1472 -12 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
geckodriver was killed with SIGABRT (exit 134 + coredump) on every browser teardown/recycle. Its stdout was routed through a named FIFO whose reader (FirefoxLogInterceptor, a daemon thread) could close before geckodriver stopped writing. The next write then returned EPIPE, which geckodriver's Rust runtime escalates to a panic; a destructor run during unwinding writes again, double-panics, and aborts. Back the interceptor with a regular file instead. Writes to a regular file never raise EPIPE, so geckodriver can no longer be aborted by reader/teardown ordering. To keep cleanup leak-proof, the interceptor opens the read end in start(), and deploy_firefox unlinks the temp file's path immediately after opening the write end. Both fds keep the now-anonymous inode alive until the process exits, so no owpm_driver_*.log can leak into $TMPDIR -- even under the multiprocess os._exit() and SIGKILL teardown paths, where no finally/atexit hook would run. The tail thread stays a daemon and is reaped at process exit; it buffers partial reads until a newline so a driver line is never split across two log entries. Add a pyonly unit test exercising the interceptor in isolation: lines reach the logger, the temp file is unlinked-but-readable-via-fd (no leak), and split writes are not fragmented. Closes #1208
7bb0cc7 to
3bfbf78
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The bug
geckodriver was killed with SIGABRT (exit 134) and dumped core on every browser teardown/recycle. OpenWPM routed geckodriver's stdout through a named FIFO (
FirefoxLogInterceptor), whose reader — a daemon thread — could close the read end before geckodriver stopped writing. The nextwrite()to stdout then returnedEPIPE, which geckodriver's Rust runtime escalates to a panic; aDrophandler run during unwinding writes to stdout again, double-panics (panic in a destructor during cleanup), and callsabort()→ SIGABRT + coredump. With N parallel browsers, each recycle produced N coredumps.Evidence (per #1208): nine byte-for-byte identical coredumps over one crawl, arriving in bursts of 3 (= 3 parallel browsers), each containing:
The fix
Back
FirefoxLogInterceptorwith a regular file instead of a FIFO. Writes to a regular file never raiseEPIPE, so geckodriver can no longer be aborted by interceptor/teardown ordering — regardless of thread/process exit order. This removes the failure mode entirely (rather than just narrowing the race).selenium_firefox.py: the interceptor creates a regular tempfile (tempfile.mkstemp), tails it (reads new lines as they appear), and stops on request via an explicitthreading.Event(stop()), draining any remaining lines before unlinking the temp file. Thedaemonflag still guarantees the thread never blocks process exit. The now-unusedmktempfifohelper is removed.deploy_firefox.py:log_outputpoints at the regular file; the interceptor is returned to the caller so teardown can stop it.browser_manager.py:run_impl'sfinallycallswebdriver_interceptor.stop()(after the driver is already quit/killed, so no write can race), draining final lines and removing the temp file promptly.Logging is preserved: geckodriver stdout still reaches
self.logger.debugasBROWSER i: driver: ....Verification
EPIPE— confirming both the cause and that the fix removes it.pytest -m pyonlygreen;mypyclean on both deploy_browsers files;black/isortclean; pre-commit green.Closes #1208