scripts: ⚡ Bolt: Optimize getBuildPaths in dl_cleanup.py #181
💡 What: Replaced the O(N*M) nested loop calling `os.path.exists()` with an O(M) pre-scan using `os.scandir()` and a dictionary cache.

🎯 Why: `getBuildPaths` was making excessive file system stat calls, checking for every package's existence across every subdirectory in `build_dir/`. For many packages, this I/O bottleneck significantly slows down the cleanup script.

📊 Impact: Reduces the time complexity from O(N*M) I/O checks to O(M) directory scans and O(1) dictionary lookups per package, drastically speeding up execution on large build directories.

🔬 Measurement: Run `scripts/dl_cleanup.py` on a large populated `build_dir` and measure the total execution time before and after the change.

Signed-off-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: manupawickramasinghe <[email protected]>
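The pre-scan idea described above can be sketched roughly as follows. This is not the code from the PR: the function names `scan_build_dir` and `lookup_build_paths` are hypothetical, and the sketch assumes a layout of `build_dir/<subdirectory>/<package>`, which may differ from what `getBuildPaths` actually handles.

```python
import os


def scan_build_dir(build_dir):
    """Scan each subdirectory of build_dir once and index its entries by name.

    Hypothetical helper: returns a dict mapping an entry name to the list of
    paths where that name was found, one level below build_dir.
    """
    index = {}
    with os.scandir(build_dir) as subdirs:
        for subdir in subdirs:
            # Skip stray files sitting directly in build_dir.
            if not subdir.is_dir():
                continue
            with os.scandir(subdir.path) as entries:
                for entry in entries:
                    index.setdefault(entry.name, []).append(entry.path)
    return index


def lookup_build_paths(index, package_names):
    """Resolve each package name with a dictionary lookup instead of a stat call."""
    return {name: index.get(name, []) for name in package_names}
```

Building the index costs one `os.scandir()` call per subdirectory, after which each package is resolved from the dictionary without touching the file system again. The order of paths in each list follows `os.scandir()`, which is arbitrary, so sort the result if a stable order matters.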
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me.

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.
PR created automatically by Jules for task 11918502149959167457 started by @manupawickramasinghe