fix(deringing): widen slope math to i32 to eliminate wrap risk

lilith · lilith · commit b3f5dda2d600 · 2026-04-11T20:28:43.000-06:00
preprocess_deringing computed fslope and lslope in bare i16 arithmetic: let mut fslope = (f1 - f2).max(MAX_SAMPLE - f1); let mut lslope = (l1 - l2).max(MAX_SAMPLE - l1); For the current in-contract use (level-shifted samples in -128..=127, with overshoot values up to MAX_SAMPLE + 31 = 158 written back into the block by a previous run), the largest magnitudes are well within i16 range: 158 - (-128) = 286 fits easily. But the subtractions are i16 - i16, so pathological callers — or a future switch to wider sample types — would wrap in release builds (panic in debug). That class of bug is exactly what the mozilla/mozjpeg#453 i16-SIMD-DCT overflow documented in CLAUDE.md warns about, except here in the scalar preprocessing step. Widen the subtractions to i32 and saturate back to i16 via clamp. The pattern matches what catmull_rom() already does at src/deringing.rs:121 for its tangent computation. For in-contract data the clamp is a no-op; for out-of-contract data it clamps instead of wrapping. zenjpeg sidesteps the whole issue by operating on f32 samples (zenjpeg/zenjpeg/src/encode/deringing.rs:113). Moving mozjpeg-rs to f32 would be the architecturally clean answer but would break the byte-parity property we depend on with C mozjpeg, so widen-to-i32 is the minimal-churn defensive fix. Verified bit-exact with C mozjpeg via tests/parity_benchmark after the change: Baseline Q55-Q95: 0.00% delta, 0.00% max dev (all 6 levels) Progressive Q55-Q95: 0.00% delta, 0.00% max dev (all 6 levels) Trellis modes: -0.05% to -0.80% (unchanged from pre-fix numbers) Max Compression: -0.72% to +0.21% (unchanged from pre-fix numbers) All 7 deringing unit tests pass.
diff --git a/src/deringing.rs b/src/deringing.rs
@@ -223,9 +223,18 @@ pub fn preprocess_deringing(data: &mut [i16; DCTSIZE2], dc_quant: u16) {
 
         // Calculate upward slopes at the edges
         // Use the steeper of: slope from two samples back, or slope to max
-        // This ensures we get an upward slope even if the edge is already clipped
-        let mut fslope = (f1 - f2).max(MAX_SAMPLE - f1);
-        let mut lslope = (l1 - l2).max(MAX_SAMPLE - l1);
+        // This ensures we get an upward slope even if the edge is already clipped.
+        //
+        // Widened to i32 so pathological inputs can't wrap: for in-contract
+        // level-shifted samples (-128..=127) and overshoot values up to 158,
+        // the largest magnitude is (158 - (-128)) = 286 which fits in i16,
+        // so the saturating cast is a no-op for real data but defends against
+        // future callers passing wider-range samples. Matches the i32 widening
+        // pattern already used in catmull_rom() above.
+        let fslope_i32 = (f1 as i32 - f2 as i32).max(MAX_SAMPLE as i32 - f1 as i32);
+        let lslope_i32 = (l1 as i32 - l2 as i32).max(MAX_SAMPLE as i32 - l1 as i32);
+        let mut fslope = fslope_i32.clamp(i16::MIN as i32, i16::MAX as i32) as i16;
+        let mut lslope = lslope_i32.clamp(i16::MIN as i32, i16::MAX as i32) as i16;
 
         // If at the start/end of the block, make the curve symmetric
         if start == 0 {