fix: audit-remediation — address all 15 code review findings#25
Merged
Findings 1–15 from the June 2026 independent code review.

Thread safety (F1)
- FullCovarianceGaussianDistribution, DiagonalGaussianDistribution: add mutable std::mutex cache_mutex_ with double-checked locking in updateCache() to prevent concurrent writers on chol_L_/log_det_/log_var_/inv_var_. Explicit copy/move operations defined since std::mutex is non-copyable.

Null safety (F2)
- BasicHmm<Obs>::getDistribution() now throws std::runtime_error when the emission slot is null, replacing a silent null dereference.

JSON correctness (F3, F12, F13)
- from_json / from_json_mv: validate each key name before consuming its value; reordered JSON now throws instead of silently scrambling parameters.
- from_json_mv: cross-validate per-distribution dim against the manifest dimensions field.
- FullCovarianceGaussianDistribution::from_json, DiagonalGaussianDistribution::from_json: bound dim to [1, 1024].

Statistical correctness (F4, F5, F6)
- VonMisesDistribution::wrap_angle: change x <= -pi to x < -pi.
- FullCovarianceGaussianDistribution: store unregularised cov_; apply reg*I only to a scratch copy for factorisation.
- VonMisesDistribution::fit(weighted): accumulate sumW inside the finite-data filter loop.

Input validation (F7, F8, F9, F10)
- MapBaumWelchTrainer: use !(c >= 0.0) guard to reject NaN.
- kmeans_init / seed_kmeanspp: throw when pts.size() < K.
- MV distribution fit(): throw on ragged observations.
- BasicViterbiTrainer: throw when convergenceWindow < 2.

API consistency (F11, F14, F15)
- StudentTDistribution::getCumulativeProbability(NaN): return 0.0.
- MapBaumWelchTrainer::apply_discrete_smoothing: re-normalise.
- kmeans_init::lloyd_update: retain previous centroid for empty clusters.

Tests: update two tests asserting old broken behaviour; add 24 new regression tests covering all previously-untested fix paths.

Co-Authored-By: Oz <[email protected]>
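The thread-safety fix (F1) describes a lazily built cache guarded by a mutex with double-checked locking. The following is only a minimal sketch of that pattern, not the library's code: the class name `DiagonalCacheSketch`, the `cache_valid_` atomic flag, and the `rebuildCount()` counter are assumptions made for illustration. Only `cache_mutex_`, `log_var_`, and `updateCache()` are names taken from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a diagonal Gaussian; not the library's class.
class DiagonalCacheSketch {
public:
    explicit DiagonalCacheSketch(std::vector<double> var) : var_(std::move(var)) {}

    // std::mutex is non-copyable, so copies take the data only and start
    // with a cold cache and a fresh mutex.
    DiagonalCacheSketch(const DiagonalCacheSketch& other) : var_(other.var_) {}

    // Assignment is a write; callers must not run it concurrently with readers.
    DiagonalCacheSketch& operator=(const DiagonalCacheSketch& other) {
        if (this != &other) {
            std::lock_guard<std::mutex> lock(cache_mutex_);
            var_ = other.var_;
            cache_valid_.store(false, std::memory_order_release);
        }
        return *this;
    }

    double logVar(std::size_t i) const {
        updateCache();
        assert(i < log_var_.size());
        return log_var_[i];
    }

    // Assumed helper: how many times the cache was actually rebuilt.
    int rebuildCount() const { return rebuilds_.load(); }

private:
    void updateCache() const {
        // First check, no lock: the common case once the cache is warm.
        if (cache_valid_.load(std::memory_order_acquire)) return;
        std::lock_guard<std::mutex> lock(cache_mutex_);
        // Second check, under the lock: another thread may have filled it.
        if (cache_valid_.load(std::memory_order_relaxed)) return;
        log_var_.resize(var_.size());
        for (std::size_t i = 0; i < var_.size(); ++i) {
            log_var_[i] = std::log(var_[i]);
        }
        ++rebuilds_;
        cache_valid_.store(true, std::memory_order_release);
    }

    std::vector<double> var_;
    mutable std::vector<double> log_var_;
    mutable std::mutex cache_mutex_;
    mutable std::atomic<bool> cache_valid_{false};
    mutable std::atomic<int> rebuilds_{0};
};
```

In this sketch the release store on `cache_valid_` publishes the writes to `log_var_`, so a reader that sees the flag set through the acquire load can read the cache without taking the lock. Declaring the copy operations suppresses the implicit move operations, so moves fall back to copies here; the actual change defines both explicitly.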
EXPECT_THROW/EXPECT_NO_THROW discard the return value of getDistribution(), which is [[nodiscard]]. Cast to (void) inside each macro call to suppress -Wunused-result on CI.

Co-Authored-By: Oz <[email protected]>
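To illustrate why the `(void)` cast is needed, here is a small sketch that does not depend on a test framework. `SlotSketch` and `throwsRuntimeError` are hypothetical names invented for this example. The accessor mirrors the described behaviour of throwing `std::runtime_error` on a null emission slot.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an HMM state holding one emission slot.
struct SlotSketch {
    const int* emission = nullptr;

    // [[nodiscard]] makes the compiler warn when a caller ignores the result.
    [[nodiscard]] const int& getDistribution() const {
        if (emission == nullptr) {
            throw std::runtime_error("emission slot is null");
        }
        return *emission;
    }
};

// Hypothetical helper playing the role of a throw-checking test macro:
// it only cares whether body() throws, not what it returns.
template <typename F>
bool throwsRuntimeError(F&& body) {
    try {
        body();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Usage: the cast tells the compiler the result is dropped on purpose.
//   SlotSketch empty;
//   assert(throwsRuntimeError([&] { (void)empty.getDistribution(); }));
```

Without the cast, a call such as `empty.getDistribution();` used only for its side effect is exactly what `[[nodiscard]]` is designed to flag.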
test_hmm_core.cpp: cast getDistribution() return to (void) inside EXPECT_THROW/EXPECT_NO_THROW to suppress -Wunused-result on the [[nodiscard]] attribute (GCC and Clang, both Linux runners).

test_von_mises_distribution.cpp: add braces around EXPECT_NEAR inside bare if (p > 0.0) to suppress -Wdangling-else. EXPECT_NEAR expands to an if/else chain, making the else ambiguous without braces.

Co-Authored-By: Oz <[email protected]>
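The dangling-else warning can be reproduced with any macro that expands to an if/else. The sketch below uses a hypothetical `SKETCH_EXPECT_NEAR` macro and a `g_failures` counter, both invented here; it is not the real EXPECT_NEAR expansion, only a simplified macro of the same shape.

```cpp
#include <cassert>
#include <cmath>

// Assumed failure counter standing in for a test framework's bookkeeping.
static int g_failures = 0;

// Hypothetical macro shaped like an if/else chain.
#define SKETCH_EXPECT_NEAR(a, b, tol) \
    if (std::fabs((a) - (b)) <= (tol)) {} else ++g_failures

// With braces, the macro's else can only pair with the macro's own if.
// Written as `if (p > 0.0) SKETCH_EXPECT_NEAR(...);` the behaviour is the
// same, but compilers may warn because a reader cannot tell which `if`
// the `else` belongs to.
void checkIfPositive(double p, double expected) {
    if (p > 0.0) {
        SKETCH_EXPECT_NEAR(p, expected, 1e-9);
    }
}
```

The braces do not change what runs; they only make the pairing of `else` explicit.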
Summary
Implements all 15 findings from the June 2026 independent code review of the v3.8.0 → v4.0.0 diff.
Findings addressed
Critical
- updateCache(): Added mutable std::mutex cache_mutex_ with double-checked locking to FullCovarianceGaussianDistribution and DiagonalGaussianDistribution. Explicit copy/move operations defined (std::mutex is non-copyable/non-movable).
- BasicHmm<ObservationVectorView>::getDistribution(): Both overloads now throw std::runtime_error with a descriptive message instead of unconditionally dereferencing null emission slots.
High
- from_json and from_json_mv now validate each key name before consuming its value. Reordered JSON is rejected with a clear error.
- wrap_angle: Changed x <= −π to x < −π so getCumulativeProbability(−π) returns ~0.0 rather than ~1.0.
- setCovariance: cov_ now stores the unregularised matrix; reg_*I is applied only to a scratch copy for Cholesky factorisation. getCovariance() round-trips are idempotent.
- fit() sumW inflation: sumW is now accumulated inside the finite-data filter loop, not over all weights.
- pseudo_count guard: Changed if (c < 0.0) to if (!(c >= 0.0)) in both the constructor and setPseudoCount, so NaN is rejected.
Medium
- M < K produces duplicate centroids: seed_kmeanspp now throws std::invalid_argument when pts.size() < K.
- fit() on ragged observations: Both fit() overloads in FullCovarianceGaussianDistribution and DiagonalGaussianDistribution now check per-observation dimensionality.
- convergenceWindow <= 1 premature convergence: BasicViterbiTrainer constructors and setConfig throw when convergenceWindow < 2.
Low
- StudentTDistribution::getCumulativeProbability(NaN): Returns 0.0, consistent with getProbability(NaN).
- dimensions not cross-validated in from_json_mv: The manifest dimensions field is now cross-checked against each distribution's getDimension().
- dim in distribution from_json: FullCovarianceGaussianDistribution::from_json and DiagonalGaussianDistribution::from_json now bound dim to [1, 1024].
- apply_discrete_smoothing re-normalises after the smoothing pass.
- lloyd_update: Empty clusters now retain their previous centroid instead of collapsing to the zero vector.
Tests
All 47 tests pass (0 failures).
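The pseudo_count change from `if (c < 0.0)` to `if (!(c >= 0.0))` relies on the fact that every ordered comparison with NaN is false. The sketch below shows the difference in isolation. The function names and the choice of `std::invalid_argument` are assumptions for illustration, since the description does not say which exception the trainer throws.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>

// Old-style guard: NaN < 0.0 is false, so NaN is NOT rejected.
inline bool oldGuardRejects(double c) { return c < 0.0; }

// New-style guard: NaN >= 0.0 is also false, so the negation is true
// and NaN is rejected together with negative values.
inline bool newGuardRejects(double c) { return !(c >= 0.0); }

// Hypothetical validator using the new guard (exception type is assumed).
inline double checkedPseudoCount(double c) {
    if (!(c >= 0.0)) {
        throw std::invalid_argument("pseudo_count must be a non-negative number");
    }
    return c;
}
```

The two guards agree on every ordinary number and differ only for NaN. This holds under IEEE 754 semantics; compiler flags that assume NaN never occurs (for example -ffast-math) can invalidate the check.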
Conversation: https://app.warp.dev/conversation/411a43b6-94f9-42bf-bc37-3987216f6798
Co-Authored-By: Oz [email protected]