CI: data flywheel (fork verification) #1
Open
smowtion wants to merge 10 commits into
Add opt-in active-learning collector to capture uncertain/failed samples:

- New collection/ module with DataCollector (writes tiles/failures to collect_dir)
- SolverConfig.collect_data flag (default False, zero I/O overhead)
- Integration in RecaptchaSolver and AsyncRecaptchaSolver solve() pipelines
- Collector records uncertain tiles (confidence between thresholds) and failures
- tests/ scaffold: test_collector_scaffold.py, test_data_collector.py
- CI build job asserts the wheel ships collection/ but not training/ tooling
- Document the 4-phase data flywheel implementation + the cv2.error/OSError lesson
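The "confidence between thresholds" rule for uncertain tiles can be sketched as below. This is a minimal illustration, not the collector's actual code: the `UncertaintyBand` name, the 0.3/0.7 defaults, and the half-open interval are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class UncertaintyBand:
    """Hypothetical confidence band; the real thresholds live in the solver config."""

    low: float = 0.3   # assumed lower threshold
    high: float = 0.7  # assumed upper threshold

    def is_uncertain(self, confidence: float) -> bool:
        # A tile is worth collecting when the model is neither clearly
        # rejecting it (below low) nor clearly accepting it (at or above high).
        return self.low <= confidence < self.high


def select_uncertain(confidences: Sequence[float], band: UncertaintyBand) -> List[int]:
    """Return the indices of tiles whose confidence falls inside the band."""
    return [i for i, c in enumerate(confidences) if band.is_uncertain(c)]
```

With collect_data left at its default of False, a check like this would simply never run, which is how the zero I/O overhead claim would hold.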
…ll 4x4 classification fallback

- Add YOLODetector.is_supported() to classify challenges into supported (COCO) vs. unsupported (fallback-capable) groups; skip unsupported challenges with minimal latency using short _reload_challenge delay
- Refactor solve loop (sync+async symmetric) with separate attempts/skips budgets and solved flag; on classification failure, drop to short token-wait instead of hanging full timeout
- Remove _get_target_class; move logic inline to clarify control flow
- Implement per-cell 4x4 classification fallback in SquareCaptchaHandler: when COCO model lacks target (e.g., stairs, bridges), classify each cell independently using the 57k classification model as a last resort
- Add test_is_supported.py (device/model coverage); test_square_handler_fallback.py (cell classification, missing-class recovery)
- Add integration test retry-until-solvable (bounded N=3, wall-clock 180s, still asserts token); now PASSES in 279s (vs. previous fail after 578s)
- Update codebase-summary.md with solve robustness + square handler fallback notes
- Fix test_class_mapping.py SIM300 ruff violation
- Add executed plan + brainstorm report to plans/

Verified: 117 unit tests pass; integration test now stable. Sync+async symmetric; public API unchanged.
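The separate attempts/skips budgets and the solved flag could look roughly like this. It is a sketch under assumptions: `solve_with_budgets`, its callbacks, and the default budgets are hypothetical stand-ins, and the real loop also handles reloads and token waits.

```python
from typing import Callable, Iterable, Tuple


def solve_with_budgets(
    challenges: Iterable[str],
    is_supported: Callable[[str], bool],
    try_solve: Callable[[str], bool],
    max_attempts: int = 3,  # assumed budget for real solve attempts
    max_skips: int = 5,     # assumed budget for cheap skips
) -> Tuple[bool, int, int]:
    """Return (solved, attempts_used, skips_used)."""
    attempts = 0
    skips = 0
    solved = False
    for challenge in challenges:
        if not is_supported(challenge):
            # Unsupported challenges cost a skip, not an attempt.
            skips += 1
            if skips > max_skips:
                break
            continue
        attempts += 1
        if try_solve(challenge):
            solved = True
            break
        if attempts >= max_attempts:
            break
    return solved, attempts, skips
```

Keeping the two counters apart means a run of unsupported challenges cannot exhaust the attempt budget before a solvable one appears.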
Implement full-image collection, cell-bbox annotation, dataset builder, and 3-tier 4x4 detection priority (COCO → custom detection → per-cell fallback). Phase 3 (GPU training) deferred. Custom model gated behind optional config; default off preserves existing behavior.

- DataCollector.record_challenge_image: save 4x4 full images to collected/full/
- SquareCaptchaHandler: hook full-image collection (detector.collector injection)
- YOLODetector: load/verify custom detection model (SHA256), detection+mapping
- SolverConfig: custom_detection_model_path parameter
- annotate_detection_cli.py: interactive cell→bbox annotation CLI
- prepare_detection_dataset.py: YOLO detection dataset builder (images/labels/data.yaml)
- CUSTOM_DETECTION_CLASSES/MAPPINGS: runtime detection class sync
- DETECTION_CLASSES: training/class_mapping dataset builder class registry
- 3-tier priority solve handler with per-cell fallback
- 13 new tests; ruff/mypy clean; 138 tests passing
- docs/training-and-flywheel.md: Tier B documentation

Public API unchanged. No confidential data or credentials.
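The 3-tier priority reduces to a small dispatch. A hedged sketch: `pick_tier` and the tier labels are hypothetical, and passing `None` for the custom classes stands in for custom_detection_model_path being unset.

```python
from typing import Collection, Optional


def pick_tier(
    target: str,
    coco_classes: Collection[str],
    custom_classes: Optional[Collection[str]] = None,
) -> str:
    """Choose which detector handles a 4x4 challenge for the given target."""
    if target in coco_classes:
        return "coco"
    # The custom model is optional; None means it was never configured,
    # so targets outside COCO fall straight through to tier 3.
    if custom_classes is not None and target in custom_classes:
        return "custom_detection"
    return "per_cell_fallback"
```

With no custom model configured, only the first and last tiers are reachable, which matches the "default off preserves existing behavior" note.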
- training/device_utils.resolve_device: CUDA > MPS > CPU auto-detect
- train.py: --device defaults to auto, add --amp/--no-amp (use --no-amp on flaky MPS)
- Apple Silicon (M-series) trains via Metal/MPS; no cloud GPU required for small datasets
- docs + Tier B plan notes: Mac MPS option vs Colab; train_detection.py to reuse device_utils
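The CUDA > MPS > CPU order might be implemented along these lines. This is a sketch, not the contents of device_utils: `pick_device` and `probe_torch` are hypothetical names, and the availability flags are passed in so the priority logic can be exercised without a GPU or even without PyTorch installed.

```python
from typing import Tuple


def pick_device(requested: str = "auto", cuda_available: bool = False,
                mps_available: bool = False) -> str:
    """Resolve a --device value; anything other than "auto" is returned as given."""
    if requested != "auto":
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def probe_torch() -> Tuple[bool, bool]:
    """Report (cuda_available, mps_available); both False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    # Older torch builds have no torch.backends.mps attribute.
    mps = getattr(torch.backends, "mps", None)
    return torch.cuda.is_available(), bool(mps is not None and mps.is_available())
```

A trainer would then call `pick_device(args.device, *probe_torch())`, where `args.device` is the parsed --device value.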
- train_detection.py: YOLO detect trainer (auto CUDA/MPS/CPU, --amp/--no-amp, --resume)
- write_model_card.py: model_card.json sidecar (date/task/classes/sha256)
- collect.py: loop driver to accumulate per-cell + full-4x4 data with progress counts
- class_mapping: enforce DETECTION_CLASSES == types.CUSTOM_DETECTION_CLASSES contract
- pyproject: add onnx/onnxslim to dev extras (export-time only; runtime keeps onnxruntime)
- smoke-verified train_detection on MPS (synthetic data -> best.pt, task=detect)

Real model still needs human-annotated 4x4 data (no auto pseudo-label).
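A model_card.json sidecar carrying the four listed fields could be produced as follows. This is an illustrative sketch: the function name, the key names, and the choice to write the file next to the weights are assumptions, not the behavior of write_model_card.py.

```python
import hashlib
import json
from datetime import date
from pathlib import Path
from typing import Sequence


def write_model_card(weights: Path, task: str, classes: Sequence[str]) -> Path:
    """Write model_card.json beside the weights file and return its path."""
    card = {
        "date": date.today().isoformat(),
        "task": task,
        "classes": list(classes),
        # The digest lets a loader verify the weights before trusting them.
        "sha256": hashlib.sha256(weights.read_bytes()).hexdigest(),
    }
    card_path = weights.with_name("model_card.json")
    card_path.write_text(json.dumps(card, indent=2), encoding="utf-8")
    return card_path
```

For very large weight files, hashing in chunks instead of calling `read_bytes()` would avoid loading the whole file into memory.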
- auto_annotate_capmonster.py: label collected 4x4 images via CapMonster image mode (ComplexImageTask/recaptcha, ~$0.04/1000) -> annotations.jsonl, 0->1-indexed cells
- only the 7 detection classes labeled; resilient per-image (network errors skipped)
- API key via --api-key or env CAPMONSTER_API_KEY (no secret in repo)
- docs: manual vs auto annotation; cap.guru can't do image mode (token only)
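Reading the key from --api-key or CAPMONSTER_API_KEY could be handled like this. A sketch only: the `resolve_api_key` helper is hypothetical, and letting the flag take precedence over the environment is an assumption the commit message does not state.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def resolve_api_key(argv: Sequence[str],
                    environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from --api-key, else from CAPMONSTER_API_KEY."""
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--api-key", default=None)
    # parse_known_args ignores the script's other flags in this sketch.
    args, _unknown = parser.parse_known_args(list(argv))
    key = args.api_key or env.get("CAPMONSTER_API_KEY")
    if not key:
        raise SystemExit("Provide --api-key or set CAPMONSTER_API_KEY")
    return key
```

Either route keeps the secret out of the repository; the environment variable also keeps it out of shell history.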
solution.answer is a 16-element bool mask (True = cell has target), not a cell-index list. The previous code coerced the bools to ints, producing wrong cells ([1,2] for all images). Now True flags map to 0-indexed cells. Verified end-to-end: collected images -> auto-labels -> YOLO detection dataset. Caught by spot-checking identical outputs.
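The fix amounts to reading positions rather than values from the mask. A minimal sketch, with `mask_to_cells` as a hypothetical helper name:

```python
from typing import List, Sequence


def mask_to_cells(answer: Sequence[bool]) -> List[int]:
    """Convert a 16-element bool mask into 0-indexed cell numbers."""
    if len(answer) != 16:
        raise ValueError(f"expected 16 flags for a 4x4 grid, got {len(answer)}")
    # The index of each True flag is the cell; the flag's value is not.
    return [index for index, has_target in enumerate(answer) if has_target]
```

Casting the flags with `int()` can only ever yield 0 and 1, so every image collapses to the same one or two cells, consistent with the identical outputs that exposed the bug.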
Intra-fork PR to run CI (lint / test 3.10-3.12 / typecheck / build + wheel-exclusion assertion) on smowtion's fork. Mirror of upstream PR DannyLuna17#8.
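The wheel-exclusion assertion in the build job can be approximated with the standard library. A hedged sketch: `inspect_wheel` is a hypothetical helper, and matching on directory names anywhere in the archive path is an assumption about the package layout.

```python
import zipfile
from typing import IO, List, Tuple, Union


def inspect_wheel(wheel: Union[str, IO[bytes]]) -> Tuple[bool, List[str]]:
    """Return (ships_collection, leaked_training_files) for a built wheel."""
    with zipfile.ZipFile(wheel) as archive:
        names = archive.namelist()

    def in_directory(name: str, directory: str) -> bool:
        # Compare whole path components so "training_utils.py" is not a match.
        return directory in name.split("/")[:-1]

    ships_collection = any(in_directory(n, "collection") for n in names)
    leaked = [n for n in names if in_directory(n, "training")]
    return ships_collection, leaked
```

A CI step would fail the build when `ships_collection` is False or `leaked` is non-empty.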