RDKB-64847: Observed spike in load and CPU due to cpu_telemetry2_0 #384
Open
tabbas651 wants to merge 3 commits into
Reason for change: When DCA top/process marker sampling (processTopPattern) runs concurrently with telemetry HTTP uploads (curl), both CPU-intensive paths compete for resources, causing WAN timeouts, FD exhaustion, and failed report uploads on resource-constrained devices. Introduced a CPU contention avoidance window that defers new curl handle acquisitions while DCA top pattern collection is active, preventing two CPU-intensive telemetry paths from running in parallel.
Test Procedure: performance testing
Risks: Medium
Priority: P0
Signed-off-by: Thamim Razith Abbas Ali <[email protected]>
Pull request overview
This PR introduces a “CPU contention avoidance window” to prevent DCA top/process sampling from overlapping with telemetry HTTP upload work (curl), reducing CPU spikes and associated upload failures/timeouts on resource-constrained devices.
Changes:
- Added sampling-window APIs (`http_pool_begin_sampling_window` / `http_pool_end_sampling_window`) and refcount signaling to defer new curl handle acquisitions during DCA sampling.
- Added `active_requests` underflow protection and sampling-start drain wait (best-effort) before sampling begins.
- Wrapped DCA `processTopPattern()` marker processing with sampling-window begin/end calls.
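The refcount signaling described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `http_pool_begin_sampling_window`, `http_pool_end_sampling_window`, `sampling_window_refcount`, and `pool_mutex` come from the PR, but their signatures are assumed, and `sampling_window_active`, `sample_with_window`, and `fake_top_sample` are hypothetical helpers added only for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Assumed shared state; the names mirror those visible in the reviewed diff. */
static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static int sampling_window_refcount = 0;

/* Mark the start of a DCA sampling pass (void signature is an assumption). */
void http_pool_begin_sampling_window(void)
{
    pthread_mutex_lock(&pool_mutex);
    sampling_window_refcount++;
    pthread_mutex_unlock(&pool_mutex);
}

/* Mark the end of a sampling pass; an unmatched end never drives the count negative. */
void http_pool_end_sampling_window(void)
{
    pthread_mutex_lock(&pool_mutex);
    if (sampling_window_refcount > 0)
        sampling_window_refcount--;
    pthread_mutex_unlock(&pool_mutex);
}

/* Hypothetical helper: returns 1 while at least one sampling pass is open. */
int sampling_window_active(void)
{
    pthread_mutex_lock(&pool_mutex);
    int active = sampling_window_refcount > 0;
    pthread_mutex_unlock(&pool_mutex);
    return active;
}

/* Hypothetical stand-in for the DCA call site that wraps the sampling work. */
int sample_with_window(int (*sample_fn)(void))
{
    assert(sample_fn != NULL);
    http_pool_begin_sampling_window();
    int rc = sample_fn();
    http_pool_end_sampling_window();
    return rc;
}

/* Hypothetical sampler used only to show that the window is open during sampling. */
int fake_top_sample(void)
{
    return sampling_window_active();
}
```

A refcount rather than a boolean lets overlapping sampling passes nest: the window stays open until the last pass ends.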
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| source/protocol/http/multicurlinterface.h | Exposes new sampling-window APIs for coordinating with DCA sampling. |
| source/protocol/http/multicurlinterface.c | Implements sampling-window refcount + deferral in handle acquisition and improves active_requests guards. |
| source/dcautil/Makefile.am | Adds include path for HTTP protocol headers (but currently missing required link to libhttp.la). |
| source/dcautil/dca.c | Enters/exits sampling window around top/process marker collection to defer concurrent curl work. |
Comment on lines +511 to +517

```c
if (sampling_window_refcount > 0)
{
    pthread_mutex_unlock(&pool_mutex);

    usleep(POOL_ACQUIRE_RETRY_MS * 1000);
    continue;
}
```
Comment on lines 35 to 39

```makefile
-I${top_srcdir}/source/utils \
-I${top_srcdir}/source/bulkdata \
-I${top_srcdir}/source/protocol/http \
-I${PKG_CONFIG_SYSROOT_DIR}$(includedir)/ccsp
```
Reason for change: When DCA top/process marker sampling (processTopPattern) runs concurrently with telemetry HTTP uploads (curl), both CPU-intensive paths compete for resources causing WAN timeouts, FD exhaustion, and failed report uploads on resource-constrained devices. Introduced a CPU contention avoidance window that defers new curl handle acquisitions while DCA top pattern collection is active, preventing two CPU-intensive telemetry paths from running in parallel.
The upload-side deferral has no hard timeout because DCA always performs finite work and therefore always releases the sampling window. The pool_shutting_down flag serves as the escape hatch for process restart scenarios.
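The deferral and its escape hatch might look roughly like the sketch below. It assumes a fixed-size pool tracked by a boolean array; `acquire_pool_slot`, `release_pool_slot`, `http_pool_request_shutdown`, `POOL_SIZE`, and `sleep_ms` are hypothetical names, and `nanosleep` is used here in place of the `usleep` call shown in the diff. Only `sampling_window_refcount`, `pool_mutex`, `pool_shutting_down`, and `POOL_ACQUIRE_RETRY_MS` are taken from the PR, with the 100 ms value inferred from the description.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define POOL_SIZE 2               /* assumed pool size, for illustration only */
#define POOL_ACQUIRE_RETRY_MS 100 /* retry interval; 100 ms per the PR description */

static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static int sampling_window_refcount = 0;
static bool pool_shutting_down = false;
static bool handle_in_use[POOL_SIZE];

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Hypothetical shutdown hook: wakes deferred callers on their next retry. */
void http_pool_request_shutdown(void)
{
    pthread_mutex_lock(&pool_mutex);
    pool_shutting_down = true;
    pthread_mutex_unlock(&pool_mutex);
}

/* Returns a free slot index, or -1 if the pool is shutting down.
 * While a sampling window is open the caller retries with no timeout. */
int acquire_pool_slot(void)
{
    for (;;) {
        pthread_mutex_lock(&pool_mutex);
        if (pool_shutting_down) {
            pthread_mutex_unlock(&pool_mutex);
            return -1; /* escape hatch: never block a restart */
        }
        if (sampling_window_refcount == 0) {
            for (int i = 0; i < POOL_SIZE; i++) {
                if (!handle_in_use[i]) {
                    handle_in_use[i] = true;
                    pthread_mutex_unlock(&pool_mutex);
                    return i;
                }
            }
        }
        /* Sampling is active or every slot is busy: back off and retry. */
        pthread_mutex_unlock(&pool_mutex);
        sleep_ms(POOL_ACQUIRE_RETRY_MS);
    }
}

/* Return a slot to the pool. */
void release_pool_slot(int slot)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    pthread_mutex_lock(&pool_mutex);
    handle_in_use[slot] = false;
    pthread_mutex_unlock(&pool_mutex);
}
```

Checking the shutdown flag before the refcount means a deferred upload exits promptly on restart even if a sampling window is still open.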
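Changes:
- Sampling-window APIs (`http_pool_begin_sampling_window` / `http_pool_end_sampling_window`) with refcount-based signaling in multicurlinterface.c
- Sampling-window start with a best-effort 2000ms drain wait for in-flight uploads
- New curl handle acquisitions deferred, retrying every 100ms until DCA completes: no timeout, no upload failure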
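The best-effort drain wait and the `active_requests` underflow guard could be sketched as below. The counter name `active_requests` and the 2000 ms budget come from the PR; `request_started`, `request_finished`, `wait_for_upload_drain`, `DRAIN_POLL_MS`, and `sleep_ms` are hypothetical, and the polling approach itself is an assumption about how the wait is implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define SAMPLING_DRAIN_WAIT_MS 2000 /* drain budget stated in the PR description */
#define DRAIN_POLL_MS 10            /* assumed polling step */

static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static int active_requests = 0;

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Record that an upload has started. */
void request_started(void)
{
    pthread_mutex_lock(&pool_mutex);
    active_requests++;
    pthread_mutex_unlock(&pool_mutex);
}

/* Record that an upload has finished; an unmatched call cannot underflow. */
void request_finished(void)
{
    pthread_mutex_lock(&pool_mutex);
    if (active_requests > 0)
        active_requests--;
    pthread_mutex_unlock(&pool_mutex);
}

/* Best-effort wait: returns true once no uploads are in flight, or false
 * when timeout_ms elapses first. Sampling proceeds in either case. */
bool wait_for_upload_drain(int timeout_ms)
{
    assert(timeout_ms >= 0);
    for (int waited = 0; ; waited += DRAIN_POLL_MS) {
        pthread_mutex_lock(&pool_mutex);
        bool drained = (active_requests == 0);
        pthread_mutex_unlock(&pool_mutex);
        if (drained)
            return true;
        if (waited >= timeout_ms)
            return false;
        sleep_ms(DRAIN_POLL_MS);
    }
}
```

A sampling pass would call `wait_for_upload_drain(SAMPLING_DRAIN_WAIT_MS)` after opening the window and continue regardless of the result, which is what makes the wait best-effort.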
Test Procedure: Performance testing. Triggered simultaneous DCA + upload with kill -12 && kill -10 && kill -29; verified no CPU spike (cpu_telemetry2_0=0.0%), no handle acquisition failure, and all reports uploaded successfully (HTTP 200) across 4 test scenarios.
Risks: Medium
Priority: P0
Signed-off-by: Thamim Razith Abbas Ali [[email protected]]