
feat: v3.8.0 — sample(), StudentT CDF fix, fitting validation tests#23

Merged
OldCrow merged 5 commits into main from feature/v3.8.0 on Jun 7, 2026

@OldCrow OldCrow commented Jun 7, 2026


v3.8.0 — Final v3.x release

Closes out the three open issues on the v3.x line before v4.0.0 work begins.

Changes

Fixes #16: StudentTDistribution::getCumulativeProbability() returned wrong values for 1 < ν < 30

  • Replaced the incorrect rational approximation with the exact regularised incomplete beta formula, valid for all ν > 0. For t > 0: P(T ≤ t; ν) = 1 − ½·I_{ν/(ν+t²)}(ν/2, ½); for t < 0 the value follows from symmetry, P(T ≤ −t) = 1 − P(T ≤ t)
  • Promoted BetaDistribution::incompleteBeta to DistributionBase::incompleteBeta (protected static), alongside gammap
  • Added CDF accuracy tests: Cauchy exact (arctan), ν=2 closed-form, ν=5/10 approximate, symmetry checks
  • Added math helper accuracy tests for gammap (Gamma/ChiSquared CDFs) and incompleteBeta (polynomial Beta CDFs)
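The formula above can be sketched in a few lines. This is a minimal illustration, not the library's code: `betaContinuedFraction`, `regularizedIncompleteBeta`, and `studentTCdf` are hypothetical free functions, and the continued fraction follows the textbook modified Lentz scheme with a symmetry swap, which is assumed (not confirmed) to resemble what `DistributionBase::incompleteBeta` does.

```cpp
#include <cassert>
#include <cmath>

// Continued fraction for the incomplete beta function (modified Lentz method).
// Hypothetical helper; kTol mirrors the 1e-12 tolerance mentioned in the commit.
double betaContinuedFraction(double a, double b, double x) {
    constexpr int kMaxIter = 300;
    constexpr double kTol = 1e-12;
    constexpr double kTiny = 1e-300;  // guards against division by zero
    const double qab = a + b, qap = a + 1.0, qam = a - 1.0;
    double c = 1.0;
    double d = 1.0 - qab * x / qap;
    if (std::fabs(d) < kTiny) d = kTiny;
    d = 1.0 / d;
    double h = d;
    for (int m = 1; m <= kMaxIter; ++m) {
        const double m2 = 2.0 * m;
        // Even step of the recurrence.
        double aa = m * (b - m) * x / ((qam + m2) * (a + m2));
        d = 1.0 + aa * d; if (std::fabs(d) < kTiny) d = kTiny;
        c = 1.0 + aa / c; if (std::fabs(c) < kTiny) c = kTiny;
        d = 1.0 / d;
        h *= d * c;
        // Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2));
        d = 1.0 + aa * d; if (std::fabs(d) < kTiny) d = kTiny;
        c = 1.0 + aa / c; if (std::fabs(c) < kTiny) c = kTiny;
        d = 1.0 / d;
        const double delta = d * c;
        h *= delta;
        if (std::fabs(delta - 1.0) < kTol) break;
    }
    return h;
}

// Regularised incomplete beta I_x(a, b), using the symmetry
// I_x(a, b) = 1 - I_{1-x}(b, a) to stay where the fraction converges fast.
double regularizedIncompleteBeta(double x, double a, double b) {
    assert(a > 0.0 && b > 0.0);
    if (x <= 0.0) return 0.0;
    if (x >= 1.0) return 1.0;
    const double front = std::exp(std::lgamma(a + b) - std::lgamma(a) - std::lgamma(b) +
                                  a * std::log(x) + b * std::log1p(-x));
    if (x < (a + 1.0) / (a + b + 2.0)) {
        return front * betaContinuedFraction(a, b, x) / a;
    }
    return 1.0 - front * betaContinuedFraction(b, a, 1.0 - x) / b;
}

// Standard Student-t CDF: 1 - 0.5 * I_{nu/(nu+t^2)}(nu/2, 1/2) for t >= 0,
// and the mirrored tail for t < 0.
double studentTCdf(double t, double nu) {
    assert(nu > 0.0);
    const double x = nu / (nu + t * t);
    const double tail = 0.5 * regularizedIncompleteBeta(x, 0.5 * nu, 0.5);
    return t >= 0.0 ? 1.0 - tail : tail;
}
```

With ν = 1 this reduces to the Cauchy CDF ½ + arctan(t)/π, and with ν = 2 to ½ + t/(2√(2 + t²)), which are the two exact cases the accuracy tests are described as using.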

Closes #22: sample(std::mt19937_64& rng) added as a pure virtual on EmissionDistribution and implemented for all 16 distributions

  • stdlib <random> adapters for most distributions
  • NegativeBinomial: Gamma-Poisson mixture (supports real-valued r)
  • Beta: Gamma-ratio method
  • Pareto, Rayleigh: inverse-CDF
  • VonMises: Best and Fisher (1979) wrapped-Cauchy rejection sampler (exact, ~1.06–1.32 iterations expected)
  • 3 new tests: all 16 distributions produce finite values; discrete distributions return non-negative integers; sample means converge for Gaussian/Exponential/Poisson/Uniform/Beta/VonMises
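The non-stdlib samplers listed above are short enough to sketch. The free functions below are hypothetical stand-ins rather than the library's member functions. The NegativeBinomial parameterisation (count of failures before r successes, success probability p) is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Pareto(xm, k) by inverse CDF: X = xm * U^(-1/k).
double samplePareto(double xm, double k, std::mt19937_64& rng) {
    assert(xm > 0.0 && k > 0.0);
    std::uniform_real_distribution<double> unit(0.0, 1.0);  // draws from [0, 1)
    const double u = 1.0 - unit(rng);                       // shift to (0, 1]
    return xm * std::pow(u, -1.0 / k);
}

// Rayleigh(sigma) by inverse CDF: X = sigma * sqrt(-2 * log(U)).
double sampleRayleigh(double sigma, std::mt19937_64& rng) {
    assert(sigma > 0.0);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double u = 1.0 - unit(rng);  // (0, 1], so log(u) stays finite
    return sigma * std::sqrt(-2.0 * std::log(u));
}

// Beta(a, b) by the Gamma-ratio method: X / (X + Y) with
// X ~ Gamma(a, 1) and Y ~ Gamma(b, 1) drawn independently.
double sampleBeta(double a, double b, std::mt19937_64& rng) {
    assert(a > 0.0 && b > 0.0);
    std::gamma_distribution<double> gammaA(a, 1.0);
    std::gamma_distribution<double> gammaB(b, 1.0);
    const double x = gammaA(rng);
    const double y = gammaB(rng);
    return x / (x + y);
}

// NegativeBinomial(r, p) as a Gamma-Poisson mixture:
// lambda ~ Gamma(shape r, scale (1 - p) / p), then X ~ Poisson(lambda).
// Because r is only a Gamma shape parameter, it may be any positive real.
double sampleNegativeBinomial(double r, double p, std::mt19937_64& rng) {
    assert(r > 0.0 && p > 0.0 && p < 1.0);
    std::gamma_distribution<double> mixing(r, (1.0 - p) / p);
    const double lambda = mixing(rng);
    if (lambda <= 0.0) return 0.0;  // std::poisson_distribution needs mean > 0
    std::poisson_distribution<long long> poisson(lambda);
    return static_cast<double>(poisson(rng));
}
```

Constructing the stdlib distribution objects inside each call keeps the functions stateless, which is one way to satisfy a `const` sampling method; whether the library caches them instead is not stated in this PR.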

Closes #21: Distribution fitting validation tests

  • New tests/distributions/test_distribution_fitting.cpp (17 test cases)
  • Unweighted tests use data with analytically exact sample statistics (tolerances documented in comments)
  • Weighted tests verify uniform weights ≡ unweighted, and that concentrated weights steer parameters
  • Edge cases: single-point data and near-zero weights do not crash
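To make the test strategy concrete, here is a rough sketch of the three kinds of check for a Gaussian. `fitGaussian` and `fitGaussianWeighted` are hypothetical helpers written for this illustration; they are not the library's `fit()` API, and the use of the maximum-likelihood variance (dividing by n rather than n − 1) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct GaussianParams {
    double mean;
    double variance;  // maximum-likelihood estimate (assumed convention)
};

// Unweighted closed-form MLE.
GaussianParams fitGaussian(const std::vector<double>& data) {
    assert(!data.empty());
    double sum = 0.0;
    for (double x : data) sum += x;
    const double mean = sum / static_cast<double>(data.size());
    double ss = 0.0;
    for (double x : data) ss += (x - mean) * (x - mean);
    return {mean, ss / static_cast<double>(data.size())};
}

// Weighted closed-form MLE; only the relative sizes of the weights matter.
GaussianParams fitGaussianWeighted(const std::vector<double>& data,
                                   const std::vector<double>& weights) {
    assert(data.size() == weights.size() && !data.empty());
    double wSum = 0.0, wxSum = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        wSum += weights[i];
        wxSum += weights[i] * data[i];
    }
    assert(wSum > 0.0);
    const double mean = wxSum / wSum;
    double wss = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        wss += weights[i] * (data[i] - mean) * (data[i] - mean);
    }
    return {mean, wss / wSum};
}
```

For the data {1, 2, 3, 4, 5} the exact statistics are mean 3 and MLE variance 2, so an unweighted check can use a tight tolerance. Uniform weights of any positive size should reproduce those values, and weights concentrated on the last point should pull the fitted mean towards 5.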

v3.8.0 release

  • Version bump: 3.7.0 → 3.8.0
  • CHANGELOG entry
  • README: version/test badges updated; v4.0.0 roadmap notice added

Test results

42/42 tests pass (was 41/41 in v3.7.0).


v4.0.0 development continues on feature/v4-multivariate-emissions. It raises the minimum platform floor to macOS 13+, GCC 12+, Apple Clang 14+, and MSVC 2022 17.x to allow full C++20 and multivariate emission distributions.

OldCrow and others added 5 commits June 6, 2026 19:32
…cy tests

Fixes #16: StudentTDistribution::getCumulativeProbability() used a bogus
rational approximation for moderate nu that produced errors up to 0.06.

Changes:
- Promote BetaDistribution::incompleteBeta() to
  DistributionBase::incompleteBeta() (protected static), alongside gammap.
  Continued-fraction algorithm with symmetry swap; kTol=1e-12 convergence.
- Remove private BetaDistribution::incompleteBeta(); getCumulativeProbability()
  now calls the base-class static directly.
- Rewrite StudentTDistribution::getCumulativeProbability() with the exact
  formula for all nu > 0:
    P(T <= t; nu) = 1 - 0.5 * I_{nu/(nu+t^2)}(nu/2, 1/2)  for t > 0
  No special cases for moderate vs large nu; the incomplete beta is exact.

New accuracy tests (verify the math primitives directly):
- StudentTDistributionTest::CDFAccuracy: Cauchy exact (arctan), nu=2
  closed-form, nu=5/10 approximate, symmetry, boundaries
- GammaDistributionTest::CDFAccuracy: exercises gammap (and gcf/gser) via
  Poisson-series exact CDFs: k=1 (Exp), k=2, k=3, scale parameter
- ChiSquaredDistributionTest::CDFAccuracy: second gammap code path; df=2/4
  exact, df=1 via std::erf (half-integer case), boundaries
- BetaDistributionTest::CDFAccuracy: exercises incompleteBeta via polynomial
  exact CDFs: Beta(1,1), (2,1), (1,2), (2,2), (3,2), symmetry, boundaries
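
For reference, the standard closed forms that the cases named above reduce to are collected below. The notation is an assumption of this summary: θ is taken to be a scale parameter, and the exact expressions used in the test file are not shown in this PR.

```latex
\begin{align*}
\text{Student-}t,\ \nu = 1:\quad & F(t) = \tfrac{1}{2} + \tfrac{1}{\pi}\arctan t \\
\text{Student-}t,\ \nu = 2:\quad & F(t) = \tfrac{1}{2} + \frac{t}{2\sqrt{2 + t^{2}}} \\
\mathrm{Gamma}(k, \theta),\ k \in \{1, 2, 3\}:\quad
  & F(x) = 1 - e^{-x/\theta} \sum_{j=0}^{k-1} \frac{(x/\theta)^{j}}{j!} \\
\chi^{2}_{2}:\quad & F(x) = 1 - e^{-x/2} \\
\chi^{2}_{4}:\quad & F(x) = 1 - e^{-x/2}\left(1 + \tfrac{x}{2}\right) \\
\chi^{2}_{1}:\quad & F(x) = \operatorname{erf}\!\left(\sqrt{x/2}\right) \\
\mathrm{Beta}(1,1):\quad & F(x) = x \\
\mathrm{Beta}(2,1):\quad & F(x) = x^{2} \\
\mathrm{Beta}(1,2):\quad & F(x) = 2x - x^{2} \\
\mathrm{Beta}(2,2):\quad & F(x) = 3x^{2} - 2x^{3} \\
\mathrm{Beta}(3,2):\quad & F(x) = 4x^{3} - 3x^{4}
\end{align*}
```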

Note: errorf_inv in DistributionBase has no call sites and is dead code.

All 41 tests pass.

Co-Authored-By: Oz <[email protected]>

Closes #22.

Interface change:
- EmissionDistribution: add pure virtual
    [[nodiscard]] virtual double sample(std::mt19937_64& rng) const = 0
  includes <random> in the header; all existing callers compile unchanged
  since they do not call sample()
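
A pared-down sketch of what this interface change looks like from a subclass's point of view. The two classes here are simplified, hypothetical stand-ins that carry only the `sample()` member; the real `EmissionDistribution` and `UniformDistribution` have further members that are not shown in this PR.

```cpp
#include <cassert>
#include <random>

// Simplified stand-in for the base class: only the new pure virtual is shown.
class EmissionDistribution {
public:
    virtual ~EmissionDistribution() = default;
    [[nodiscard]] virtual double sample(std::mt19937_64& rng) const = 0;
};

// Simplified stand-in for one stdlib-adapter implementation.
class UniformDistribution final : public EmissionDistribution {
public:
    UniformDistribution(double a, double b) : a_(a), b_(b) { assert(a < b); }

    [[nodiscard]] double sample(std::mt19937_64& rng) const override {
        // The adapter is built per call, so sample() can remain const.
        std::uniform_real_distribution<double> dist(a_, b_);
        return dist(rng);
    }

private:
    double a_;
    double b_;
};
```

Passing the engine by reference leaves seeding and reproducibility in the caller's hands: two engines seeded identically yield the same draw sequence on a given standard library.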

Implementations (16 distributions):
  GaussianDistribution:         std::normal_distribution
  DiscreteDistribution:         std::discrete_distribution (returns integer as double)
  ExponentialDistribution:      std::exponential_distribution
  GammaDistribution:            std::gamma_distribution
  PoissonDistribution:          std::poisson_distribution (integer -> double)
  BinomialDistribution:         std::binomial_distribution (integer -> double)
  NegativeBinomialDistribution: Gamma-Poisson mixture (supports real-valued r)
  ChiSquaredDistribution:       std::chi_squared_distribution
  StudentTDistribution:         std::student_t_distribution + location/scale shift
  LogNormalDistribution:        std::lognormal_distribution
  WeibullDistribution:          std::weibull_distribution
  BetaDistribution:             Gamma-ratio method (X/(X+Y), X,Y~Gamma)
  UniformDistribution:          std::uniform_real_distribution
  ParetoDistribution:           inverse-CDF: xm * U^(-1/k)
  RayleighDistribution:         inverse-CDF: sigma * sqrt(-2 * log(U))
  VonMisesDistribution:         Best and Fisher (1979) rejection sampler
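
The rejection sampler is the least obvious entry in this list, so a sketch may help. It follows the published Best and Fisher wrapped-Cauchy envelope algorithm. `sampleVonMises` is a hypothetical free function, and both the small-kappa cutoff and the wrap to [-pi, pi] are choices made for this illustration rather than details taken from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Draws one angle from VonMises(mu, kappa), wrapped to [-pi, pi].
double sampleVonMises(double mu, double kappa, std::mt19937_64& rng) {
    assert(kappa >= 0.0);
    const double kPi = std::acos(-1.0);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    if (kappa < 1e-8) {
        // Assumed cutoff: the density is effectively uniform on the circle,
        // and the rho formula below would suffer from cancellation.
        return std::remainder(mu + kPi * (2.0 * unit(rng) - 1.0), 2.0 * kPi);
    }

    // Parameters of the wrapped-Cauchy envelope.
    const double tau = 1.0 + std::sqrt(1.0 + 4.0 * kappa * kappa);
    const double rho = (tau - std::sqrt(2.0 * tau)) / (2.0 * kappa);
    const double r = (1.0 + rho * rho) / (2.0 * rho);

    double f = 0.0;
    for (;;) {
        const double u1 = unit(rng);
        const double u2 = unit(rng);
        const double z = std::cos(kPi * u1);
        f = (1.0 + r * z) / (r + z);
        const double c = kappa * (r - f);
        if (c * (2.0 - c) > u2) break;                  // cheap acceptance test
        if (std::log(c / u2) + 1.0 - c >= 0.0) break;   // full acceptance test
    }

    // Guard acos against rounding just outside [-1, 1].
    const double angle = std::acos(std::clamp(f, -1.0, 1.0));
    // A third uniform draw picks the sign of the deviation from mu.
    const double theta = unit(rng) < 0.5 ? mu - angle : mu + angle;
    return std::remainder(theta, 2.0 * kPi);
}
```

Because the accepted draws follow the target density exactly, no approximation error is introduced; the only cost is the occasional rejected proposal.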

Tests (3 new, added to CommonDistributionTest fixture):
  SampleProducesFiniteValues:   100 draws per distribution, all must be finite
  SampleDiscreteReturnsIntegers: discrete distributions return non-negative integers
  SampleMeanConverges:          2000 samples, Gaussian/Exp/Poisson/Uniform/Beta/
                                 VonMises means within 4-sigma of analytic values
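
The mean-convergence criterion can be written as a small helper. This is a sketch under assumptions: `sampleMeanConverges` is a hypothetical name, and the real test lives in the CommonDistributionTest fixture, whose code is not shown here. The defaults of 2000 draws and 4 standard errors are taken from the description above.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <random>

// Returns true when the mean of n draws lies within numSigmas standard
// errors (analyticStdDev / sqrt(n)) of the analytic mean.
bool sampleMeanConverges(const std::function<double(std::mt19937_64&)>& draw,
                         double analyticMean, double analyticStdDev,
                         std::mt19937_64& rng, int n = 2000,
                         double numSigmas = 4.0) {
    assert(n > 0 && analyticStdDev > 0.0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += draw(rng);
    const double standardError = analyticStdDev / std::sqrt(static_cast<double>(n));
    return std::fabs(sum / n - analyticMean) <= numSigmas * standardError;
}
```

For an Exponential distribution with rate 2, both the analytic mean and the standard deviation are 0.5, so the 4-sigma band at n = 2000 is roughly ±0.045 around 0.5. A fixed seed keeps such a check deterministic on a given standard library.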

All 41 tests pass.

Co-Authored-By: Oz <[email protected]>

Closes #21.

New file: tests/distributions/test_distribution_fitting.cpp

Tests 5 distributions (Gaussian, Exponential, Poisson, Gamma, Discrete)
with 17 test cases across 5 suites.

Approach:
- Unweighted tests use data whose exact sample statistics are known from
  simple arithmetic so tolerances are tight (1e-9 for closed-form MLE,
  ±0.15 for Newton-Raphson Gamma).
- Weighted-uniform tests verify that fit(data, uniform_weights) == fit(data).
- Weighted-concentrated tests verify that non-uniform weights genuinely
  steer fitted parameters.
- Edge-case tests verify single-point data and near-zero weights do not crash.

Tolerance rationale documented in comments per distribution.

42/42 tests pass.

Co-Authored-By: Oz <[email protected]>

- CMakeLists.txt: VERSION 3.7.0 -> 3.8.0
- CHANGELOG.md: add [3.8.0] entry documenting #16, #21, #22
- README.md: update version/test badges; add v4.0.0 roadmap notice

42/42 tests pass.

Co-Authored-By: Oz <[email protected]>

All 40 files reformatted by clang-format 19.1.7 (--style=file).
No logic changes. Pre-commit hooks now pass on all platforms.

Co-Authored-By: Oz <[email protected]>
@OldCrow OldCrow merged commit 216a1bd into main Jun 7, 2026
7 checks passed
@OldCrow OldCrow deleted the feature/v3.8.0 branch June 7, 2026 00:23