
New upgrades #20

Merged: jpdeleon merged 10 commits into main from upgrade on Apr 9, 2026
jpdeleon (Owner) commented on Apr 8, 2026
  • Serialize pipeline jobs: Replace per-job threads with a single-worker queue to fix "I/O operation on closed file" crashes caused by thread-unsafe lightkurve/astropy/matplotlib global state
  • Replace sys.exit() with typed exceptions: Introduce NoDataError, InvalidInputError, PipelineError so the GUI can catch and display errors instead of crashing the worker thread
  • Remove compat.py: Drop the importlib_resources backport shim (stdlib importlib.resources.files() is stable on Python 3.10+)
  • Add each-sector mode: New "Run each available sector" checkbox in the GUI, with /available-sectors and /each-sector-submit endpoints
  • Thread-safe I/O: Add _ThreadLocalStream for per-thread stdout/stderr routing; protect shared state with locks
  • Harden error handling: Narrow bare except Exception to specific types throughout tql.py and utils.py
  • Fix GLS phase-fold plot: Sort by phase before plotting so the model renders as a smooth curve
  • Fix packaging: Include quicklook/app in wheel (fixes ql-gui for pip installs), exclude runtime output files, add templates as package data
  • Update CI: Upgrade GitHub Actions to v4/v5, test on Python 3.10 + 3.12, run pytest directly, add build verification step
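
The per-thread stdout/stderr routing named in the list above could look roughly like the sketch below. The class and method names (`ThreadLocalStream`, `set_target`, `clear_target`) are illustrative assumptions, not the actual `_ThreadLocalStream` code in app.py. In the app such a wrapper would presumably be installed as `sys.stdout`/`sys.stderr`; here it is passed to `print` explicitly so the example has no global side effects.

```python
import io
import threading


class ThreadLocalStream:
    """File-like wrapper that routes writes to a per-thread target stream."""

    def __init__(self, default):
        self._default = default          # used by threads that set no target
        self._local = threading.local()  # holds one `target` per thread

    def set_target(self, stream):
        self._local.target = stream

    def clear_target(self):
        if hasattr(self._local, "target"):
            del self._local.target

    def _current(self):
        return getattr(self._local, "target", self._default)

    def write(self, text):
        return self._current().write(text)

    def flush(self):
        self._current().flush()


def capture_in_thread(router, label, results):
    """Redirect this thread's output into its own buffer while it logs."""
    buf = io.StringIO()
    router.set_target(buf)
    try:
        print(f"log line from {label}", file=router)
    finally:
        router.clear_target()
    results[label] = buf.getvalue()


router = ThreadLocalStream(default=io.StringIO())
results = {}
threads = [
    threading.Thread(target=capture_in_thread, args=(router, name, results))
    for name in ("job-a", "job-b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread sees only its own lines in `results`, and nothing leaks into the default stream.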

jpdeleon and others added 10 commits April 6, 2026 10:59
- Add get_available_sectors() helper function in tql.py to query available
  TESS sectors for a given target and pipeline combination
- Add --each-sector flag to run ql on each available sector individually
- Add -j/--jobs argument to control parallel execution (default=1)
- Use ProcessPoolExecutor for parallel sector processing
- Each sector runs in a separate process with progress output
- Continue on failure (fail-safe mode)

Example usage:
  ql --name TOI-1234 --each-sector -j 2 -save -verbose
  ql --name TIC89071445 --each-sector --pipeline qlp -j 4
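
A minimal sketch of the fail-safe per-sector loop described above, assuming a picklable `run_one(sector)` callable. The function name `run_each_sector` and the `executor_cls` parameter are hypothetical and only serve the illustration; the real CLI code may be organized differently.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def run_each_sector(sectors, run_one, jobs=1, executor_cls=ProcessPoolExecutor):
    """Run `run_one(sector)` for every sector; record failures instead of aborting."""
    succeeded, failed = [], {}
    with executor_cls(max_workers=jobs) as pool:
        futures = {pool.submit(run_one, s): s for s in sectors}
        for done, future in enumerate(as_completed(futures), start=1):
            sector = futures[future]
            try:
                future.result()
                succeeded.append(sector)
            except Exception as exc:  # fail-safe: keep going on errors
                failed[sector] = repr(exc)
            print(f"[{done}/{len(futures)}] sector {sector} finished")
    return sorted(succeeded), failed
```

With the default `ProcessPoolExecutor`, `run_one` has to be defined at module level so it can be pickled, and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard.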
…t loops

Dependencies (pyproject.toml):
- numpy==1.23 -> numpy>=2.0
- lightkurve==2.5 -> lightkurve>=2.5
- transitleastsquares==1.32 -> transitleastsquares>=1.32
- wotan==1.10 -> wotan>=1.10
- reproject==0.12.0 -> reproject>=0.13
- astroplan==0.9 -> astroplan>=0.10
- Replace flammkuchen==1.0.3 (incompatible with numpy 2) with h5py>=3.10
- Bump requires-python from >=3.8 to >=3.10
- Move ruff select to [tool.ruff.lint] and ignore pre-existing B008/B905

HDF5 I/O (new quicklook/h5io.py):
- Simple save/load replacement for flammkuchen using h5py directly
- Stores numpy arrays as HDF5 datasets, scalars/strings/dicts as JSON attrs
- Handles numpy scalar types via custom JSON encoder
- Updated tql.py and cli/read_tls.py to use h5io instead of flammkuchen
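
The custom JSON encoder for numpy scalar types could be sketched as follows. The names `NumpyScalarEncoder`, `encode_attr` and `decode_json_attr` are assumptions made for this example and are not taken from h5io.py; the h5py side (writing the resulting string as an HDF5 attribute) is left out.

```python
import json

import numpy as np


class NumpyScalarEncoder(json.JSONEncoder):
    """Convert numpy scalars (np.int64, np.float32, np.bool_, ...) to Python types."""

    def default(self, obj):
        if isinstance(obj, np.generic):
            return obj.item()
        return super().default(obj)


def encode_attr(value):
    """Serialize a scalar, string, or dict to a JSON string fit for an attribute."""
    return json.dumps(value, cls=NumpyScalarEncoder)


def decode_json_attr(text):
    """Inverse of encode_attr."""
    return json.loads(text)
```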

Performance optimizations (no logic changes):
- gls.py: Vectorize _calcPeriodogram — replace per-frequency Python loop
  with chunked matrix operations (np.cos/sin on (nf, N) array + matmul).
  Auto-tuned chunk size balances speed vs memory. ~10-50x speedup on GLS.
- measure.py: Vectorize _get_contour_segments — prefilter trivial cases,
  vectorize fraction computation with np.where, use lookup table for
  segment pairs instead of per-pixel if/elif chain.
- tglc.py: Vectorize star loops in Source.__init__ and Source_cut.select_sector
  — batch wcs.all_world2pix calls, vectorize magnitude/in_frame computation,
  use dict for TIC ID lookups instead of repeated np.where searches.
- tql.py: Eliminate redundant MAST query — filter the first
  lk.search_lightcurve result locally instead of issuing a second call.

Bug fix:
- utils.py: get_available_sectors now resolves target name to TIC ID via
  ExoFOP before querying lightkurve, matching TessQuickLook behavior.
  Fixes --each-sector failing with "Could not resolve" for names that
  lightkurve cannot resolve directly (e.g. Gaia DR2 IDs). Also accepts
  optional tic_id parameter to skip redundant ExoFOP lookups.

Other:
- compat.py: Simplify by removing Python <3.9 fallback paths
- README.md: Update example conda env to python=3.12, add rank_tls docs
- tglc.py: Remove unused variables (interval, num_gaia)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add get_tglc_lc() in tglc.py: runs the full TGLC ePSF pipeline
  (Source_cut → get_psf → fit_psf → fit_lc → bg_mod) and returns a
  lightkurve.TessLightCurve with calibrated PSF flux
- Add _get_tglc_lc_fallback() in tql.py: called from get_lc() at three
  failure points (empty search, author not on MAST, empty filtered
  results) so --pipeline tglc works even without MAST HLSP products
- Add TGLC sector fallback in utils.py: get_available_sectors() returns
  all FFI sectors when pipeline=TGLC and no TGLC products exist on MAST,
  fixing --each-sector --pipeline tglc returning "No sectors found"
- Replace all print()/warnings.warn() in tglc.py with loguru logger
  calls (info/warning/debug) activated by verbose flag via loguru
  enable/disable
- Remove unused `import warnings` and `background` variable from tglc.py
- Update tglc.py top docstring to reflect all adaptations from original

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Replace Gaia.cone_search_async with Gaia.cone_search in both
  Source.search_gaia and Source_cut.__init__ for consistency
- Add retry loop (5 attempts, 10s backoff) to Source_cut Gaia query,
  matching the existing retry pattern in Source.search_gaia
- Re-raise after exhausting retries instead of silently continuing

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Web GUI (app.py, index.html, gallery.html):
- Replace SSE polling with WebSocket (flask-sock) for live log streaming
  and bidirectional cancel commands
- Add job queue supporting multiple concurrent targets with status
  badges (running/done/error/cancelled) and a "Watch" button to switch
  between active jobs
- Add cancel button that sets a threading.Event flag and transitions
  the job to "cancelled" status
- Catch SystemExit in background job so sys.exit() calls in
  TessQuickLook properly mark the job as "error" instead of hanging
- Add POST /delete-output endpoint to remove PNG and associated H5
  files; delete buttons on both index recent-results cards and gallery
  cards (inline + modal)
- Add GET /jobs-json endpoint for queue status polling
- Show recent results grid on index page load (scans outputs/)
- Progress bar with step tracker parsing log output for pipeline stages
- Gallery now passes dicts (path + name) instead of plain strings for
  proper delete support

Config and docs:
- Rename optional dependency group "web" -> "gui" and script
  "ql-web" -> "ql-gui" in pyproject.toml
- Add flask-sock to gui dependencies
- Update README: document --each-sector and -j/--jobs arguments with
  examples; fix install command to use [gui]

Bug fix:
- Replace np.lib.pad with np.pad in plot.py (np.lib.pad was removed in numpy 2.0)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…imizations

Web GUI (app.py, index.html, gallery.html, compare.html):
- Add SQLite job persistence with _init_db, _save_job_history, _get_avg_step_times
- Replace traditional POST with AJAX /submit endpoint and add /batch-submit
- Add bulk delete, image comparison page, gallery pagination and sorting
- Add WebSocket step timing, ETA display, and running-jobs banners on all pages
- Add dark mode via CSS custom properties with localStorage persistence
- Add toast notifications, browser Notification API, keyboard shortcuts (Ctrl+Enter)
- Add batch job queue with auto-advance, Watch button with scrollIntoView
- Add sessionStorage for cross-page active target state
- Reorganize form into 4 collapsible sections: Target, Detrending, Period Search, Output
- Add form state localStorage persistence, inline validation, and help tooltips

Pipeline fixes (tglc.py, tql.py, plot.py):
- Fix Gaia TAP query failures caused by space in TIC names (e.g. "TIC 89071445")
- Add _gaia_async_query() with 3 retries and make convert_gaia_id() fail gracefully
- Add retry logic with automatic cutout size reduction on TESScut timeout
- Fix pipeline hang in plot_odd_even_transit: clip folded light curve to ±2×t14
  transit window before binning to avoid ~1,666 empty bins for long-period planets
- Apply same binning fix to plot_secondary_eclipse
- Add empty-array guards on even_lc/odd_lc for sparse transit data
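
A rough sketch of why clipping to the transit window before binning helps. The helper names and the numbers (period, t14, bin width) are illustrative assumptions and do not come from plot.py.

```python
import numpy as np


def clip_to_transit_window(time_from_mid, flux, t14, n_durations=2.0):
    """Keep only points within +/- n_durations * t14 of mid-transit (time 0)."""
    mask = np.abs(time_from_mid) <= n_durations * t14
    return time_from_mid[mask], flux[mask]


def count_bins(time, bin_width):
    """Number of fixed-width bins needed to span `time`; 0 for empty input."""
    if time.size == 0:  # empty-array guard for sparse transit data
        return 0
    return int(np.ceil((time.max() - time.min()) / bin_width))


rng = np.random.default_rng(1)
period, t14, bin_width = 100.0, 0.1, 0.02  # days; illustrative values only
time_from_mid = rng.uniform(-period / 2, period / 2, 20000)
flux = np.ones_like(time_from_mid)
t_clip, f_clip = clip_to_transit_window(time_from_mid, flux, t14)
```

Binning the full fold spans the whole period, so most bins hold no in-transit data; after clipping, the bin count depends only on t14 and the bin width.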

CLI and I/O fixes (read_tls.py, rank_tls.py, h5io.py):
- Fix read_tls CSV output: add legacy deepdish h5 format support in h5io.load()
  with _decode_attr() and _load_legacy_group() for tuple Groups
- Cast per_transit_count from float64 to int in read_tls output
- Fix rank_tls TypeError: parse per_transit_count string tuples with ast.literal_eval
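
The `ast.literal_eval` fix for string-encoded tuples might look like the following. The exact string format of `per_transit_count` is an assumption here, and `parse_transit_counts` is a hypothetical helper name.

```python
import ast


def parse_transit_counts(value):
    """Accept a sequence or its string form, e.g. "(12.0, 11.0)"; return ints."""
    if isinstance(value, str):
        # literal_eval only evaluates Python literals, so it is safe on file input
        value = ast.literal_eval(value)
    return [int(v) for v in value]
```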

Performance (__init__.py):
- Replace eager star imports with lazy __getattr__-based imports
- CLI startup reduced from ~5s to ~0.1s by deferring heavy dependency loading
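
Lazy attribute loading via a module-level `__getattr__` (PEP 562) can be sketched as below. In a real `__init__.py` the function would be defined directly at module level and use `globals()`; this example attaches it to a throwaway module so it can run standalone. The mapping of names to stdlib modules is purely illustrative.

```python
import importlib
import types


def make_lazy_getattr(namespace, lazy_names):
    """Build a module-level __getattr__ that imports attributes on first access."""

    def __getattr__(name):
        try:
            module_name = lazy_names[name]
        except KeyError:
            raise AttributeError(f"module has no attribute {name!r}") from None
        value = getattr(importlib.import_module(module_name), name)
        namespace[name] = value  # cache so later lookups skip __getattr__
        return value

    return __getattr__


# Stand-in for a package whose heavy submodules should load on demand.
demo = types.ModuleType("demo_pkg")
demo.__getattr__ = make_lazy_getattr(demo.__dict__, {"dumps": "json", "sqrt": "math"})
```

Nothing is imported until an attribute such as `demo.dumps` is first accessed, which is what keeps startup cheap.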

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…tropy/matplotlib

lightkurve, astropy, and matplotlib share global state (file handles,
figure managers, FITS caches) that is not thread-safe. Running multiple
pipeline jobs concurrently via batch or each-sector submission caused
"I/O operation on closed file" errors and other race conditions.

Backend changes (app.py):
- Add serial job queue with a single daemon worker thread (_job_worker)
  so only one pipeline runs at a time; jobs wait in FIFO order
- Replace per-job Thread spawning in _submit_job() with queue.put()
- Introduce "queued" status: jobs start as queued, transition to
  "running" when the worker picks them up
- Add early cancellation check: jobs cancelled while queued are skipped
- Add _ThreadLocalStream wrapper for sys.stdout/stderr so concurrent
  log capture doesn't stomp across threads
- Protect shared state (jobs dict, avg_step_cache) with threading.Lock
- Add /available-sectors and /each-sector-submit endpoints
- Handle QuickLookError separately from unexpected exceptions
- Fix results-json hyphen handling (TOI-4152 → TOI4152 for file lookup)
- Narrow bare except clauses to specific exception types
- Manage log file handle explicitly (open/close in finally) to prevent
  FD leaks
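
A minimal sketch of the serial job queue described above: one daemon worker, FIFO order, a "queued" status, and skipping jobs cancelled before they start. All names (`submit_job`, `job_worker`, the `jobs` dict layout) are assumptions for illustration and do not mirror app.py exactly.

```python
import queue
import threading

jobs = {}                      # job_id -> {"status": ..., "cancel": Event}
jobs_lock = threading.Lock()   # protects `jobs`
job_queue = queue.Queue()      # FIFO queue consumed by a single worker


def _set_status(job_id, status):
    with jobs_lock:
        jobs[job_id]["status"] = status


def submit_job(job_id, func):
    """Register a job as queued and hand it to the worker; return its cancel flag."""
    cancel = threading.Event()
    with jobs_lock:
        jobs[job_id] = {"status": "queued", "cancel": cancel}
    job_queue.put((job_id, func))
    return cancel


def job_worker():
    """Run jobs one at a time so thread-unsafe libraries never overlap."""
    while True:
        job_id, func = job_queue.get()
        try:
            with jobs_lock:
                cancelled = jobs[job_id]["cancel"].is_set()
            if cancelled:  # cancelled while still queued: skip it
                _set_status(job_id, "cancelled")
                continue
            _set_status(job_id, "running")
            try:
                func()
                _set_status(job_id, "done")
            except Exception:
                _set_status(job_id, "error")
        finally:
            job_queue.task_done()


threading.Thread(target=job_worker, daemon=True).start()
```

Because only one worker consumes the queue, two pipeline runs can never touch the shared lightkurve/astropy/matplotlib state at the same time.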

Frontend changes (index.html):
- Add "Run each available sector" checkbox with form persistence
- Each-sector mode queries /available-sectors then submits via
  /each-sector-submit
- Add "queued" to badge class mapping and auto-watch logic so queued
  jobs display correctly in the queue panel
- Show "Queued" step label in progress bar while job awaits execution
- Auto-reconnect WebSocket to queued (not just running) jobs on page load

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…error handling

importlib.resources.files() has been in the standard library since Python 3.9,
so with requires-python >=3.10 the compat shim (which also leaked a context
manager by calling as_file().__enter__() without a matching __exit__()) is no longer needed.

compat.py removal:
- Delete quicklook/compat.py
- Replace get_data_path("quicklook") with files("quicklook").joinpath("data")
  in tql.py and utils.py

Exception handling overhaul (tql.py):
- Replace all logger.error() + sys.exit() patterns with raising typed
  exceptions: NoDataError, InvalidInputError, PipelineError
- Narrow bare `except Exception` clauses to specific types
  (KeyError, TypeError, ValueError, IndexError, OSError, etc.)
- Downgrade non-fatal errors from logger.error to logger.warning/debug
- Handle missing planet_params gracefully (return None tuple instead
  of crashing)
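
The move from `sys.exit()` to typed exceptions could be sketched like this. The three exception names come from this PR; the shared `QuickLookError` base class relationship and the helpers `check_sector` and `run_job` are assumptions made for the example.

```python
class QuickLookError(Exception):
    """Assumed common base so callers can catch all pipeline errors at once."""


class NoDataError(QuickLookError):
    """No light curve or target data could be found."""


class InvalidInputError(QuickLookError):
    """User-supplied arguments are inconsistent or out of range."""


class PipelineError(QuickLookError):
    """A processing step failed after data was obtained."""


def check_sector(sector, available):
    """Raise a typed error instead of logging and calling sys.exit()."""
    if not isinstance(sector, int) or sector < 1:
        raise InvalidInputError(f"sector must be a positive integer, got {sector!r}")
    if sector not in available:
        raise NoDataError(f"no light curve for sector {sector}")
    return sector


def run_job(sector, available):
    """What a GUI worker might do: map failures to a status and a message."""
    try:
        check_sector(sector, available)
    except QuickLookError as exc:
        return "error", f"{type(exc).__name__}: {exc}"
    return "done", ""
```

Unlike `SystemExit`, these exceptions can be caught narrowly by the caller, so the worker thread survives and the GUI can show the message.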

Pipeline fixes (tql.py):
- Fix GLS model phase-folded plot rendering: sort by phase before
  plotting so the model draws as a smooth curve instead of zigzag lines
- Cast masked_lc arrays to float via np.asarray() in init_gls() to
  avoid dtype issues with numpy 2.0
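
The phase-sort fix can be shown in isolation: a line plot connects points in array order, so an unsorted phase-folded model zigzags across the plot. `fold_and_sort` is a hypothetical helper and the sinusoidal model is a stand-in for the GLS model.

```python
import numpy as np


def fold_and_sort(time, model, period, t0=0.0):
    """Phase-fold `time` and return phase and model sorted by phase."""
    phase = ((time - t0) / period) % 1.0
    order = np.argsort(phase)
    return phase[order], model[order]


time = np.linspace(0.0, 10.0, 200)
period = 2.7
model = np.sin(2.0 * np.pi * time / period)
phase, model_sorted = fold_and_sort(time, model, period)
```

Passing `phase` and `model_sorted` to a line plot then draws a single smooth curve.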

Utils hardening (utils.py):
- Add timeout to ExoFOP urlopen calls
- Replace assert with proper ConnectionError for HTTP failures
- Handle empty/unparseable dates in get_params_from_exofop gracefully
- Return None instead of crashing when ExoFOP params are missing
- Narrow except clauses throughout

Co-Authored-By: Claude Opus 4.6 <[email protected]>
CI workflow (build-and-test.yml):
- Upgrade actions to v4/v5 (checkout, setup-python, cache) to avoid
  Node 16 deprecation warnings
- Test on Python 3.10 and 3.12 (was only 3.10)
- Run pytest directly instead of through tox (simpler, fewer layers)
- Add fail-fast: false so all matrix jobs run even if one fails
- Add build verification step (python -m build) to catch packaging
  issues in CI
- Install gui extras so app imports are tested
- Remove unused tox virtualenv cache

Release workflow (release.yml):
- Upgrade actions to v4/v5
- Build with Python 3.12 instead of 3.10

Packaging (pyproject.toml):
- Include quicklook.app in the wheel (was excluded, breaking ql-gui
  entry point for non-editable installs)
- Include templates/*.html and *.mplstyle as package data
- Exclude static/outputs/ from wheel (runtime artifacts, not source)
- Remove importlib_resources dependency (stdlib on Python 3.10+)

Other:
- Add quicklook/exceptions.py (NoDataError, InvalidInputError,
  PipelineError) — used by tql.py and app.py but was untracked
- Gitignore quicklook/app/static/outputs/ and untrack previously
  committed output files (PNGs, H5s)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
batman-package (transitive dep via transitleastsquares) imports
distutils.ccompiler, which was removed in Python 3.12. Adding
setuptools as a runtime dependency provides the distutils shim.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
jpdeleon merged commit 5daf755 into main on Apr 9, 2026
2 of 5 checks passed