| 1 | +# DCT Overflow Test Patterns |
| 2 | + |
| 3 | +Synthetic images that trigger mozilla/mozjpeg#453: 16-bit SIMD forward DCT overflow |
| 4 | +when overshoot deringing is enabled. |
| 5 | + |
| 6 | +## The Bug |
| 7 | + |
| 8 | +Overshoot deringing pushes level-shifted sample values to ±158 (vs. the normal ±128). |
| 9 | +The SIMD ISLOW forward DCT uses 16-bit packed arithmetic. The row pass produces |
| 10 | +intermediate values up to ±5056, and the final butterfly of the column pass sums 8 identical |
| 11 | +row outputs: `8 × 5056 = 40,448`, which exceeds the signed 16-bit maximum of 32,767. |
| 12 | + |
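The arithmetic above can be checked with a small sketch. It is not the SIMD code itself: `wrap_int16` is a hypothetical helper that models a wrapping 16-bit lane, and the derivation of 5056 as `8 × 158 × 4` (eight samples, scaled by a `PASS1_BITS` of 2) is an assumption about how the row-pass figure arises.

```python
def wrap_int16(value: int) -> int:
    """Reduce an integer to a signed 16-bit lane, wrapping on overflow."""
    return (value + 0x8000) % 0x10000 - 0x8000


PASS1_BITS = 2   # assumption: row-pass scaling factor of 2**2
SAMPLE = 158     # level-shifted sample after overshoot deringing

# Row pass: eight identical samples summed, then scaled up.
row_output = (8 * SAMPLE) << PASS1_BITS

# Column pass: the final butterfly ends up summing eight identical row outputs.
column_sum = 8 * row_output

# What a wrapping 16-bit lane would hold instead of the true sum.
wrapped = wrap_int16(column_sum)

print(row_output, column_sum, wrapped)
```

Under these assumptions the true sum is positive while the wrapped lane is negative, which is the sign flip described in this section.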
| 13 | +The wrapped sum changes sign, so entire 8×8 blocks come out with their brightness |
| 14 | +inverted. |
| 15 | + |
| 16 | +**Fix:** Use saturating add/sub (`paddsw`/`psubsw`) instead of wrapping (`paddw`/`psubw`) |
| 17 | +in the final even-part butterfly of the column pass. |
| 18 | + |
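To illustrate the difference between the two instruction families, here is a minimal Python model of lane-wise wrapping and saturating 16-bit adds. The function names mirror the x86 mnemonics for readability only; the operand values (`20224` per lane, i.e. half of 40,448) are an illustrative assumption, not values taken from the actual SIMD source.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def paddw(a, b):
    """Model of a wrapping 16-bit add: each lane wraps modulo 2**16."""
    return [((x + y + 0x8000) & 0xFFFF) - 0x8000 for x, y in zip(a, b)]


def paddsw(a, b):
    """Model of a saturating 16-bit add: each lane clamps to the int16 range."""
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]


# Hypothetical butterfly inputs: each lane holds half of the overflowing sum.
lhs = [20224] * 8
rhs = [20224] * 8

print(paddw(lhs, rhs))   # wrapping result: sign flips
print(paddsw(lhs, rhs))  # saturating result: clamped, sign preserved
```

Saturation still loses precision in the overflowing lanes, but the clamped value keeps the correct sign, so the block is not inverted.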
| 19 | +## Files |
| 20 | + |
| 21 | +| File | Size (px) | Triggers overflow? | Notes | |
| 22 | +|------|------|--------------------|-------| |
| 23 | +| `left_black_right_white.png` | 64×64 | Yes | Vertical split per 8×8 block | |
| 24 | +| `left_white_right_black.png` | 64×64 | Yes | Inverted vertical split | |
| 25 | +| `single_8x8_half.png` | 8×8 | Yes | Minimal reproducer | |
| 26 | +| `top_black_bottom_white.png` | 64×64 | No | Horizontal split (row pass sees uniform rows) | |
| 27 | +| `checkerboard_8x8.png` | 64×64 | No | Full black/white blocks (no intra-block edge) | |
| 28 | + |
| 29 | +## Why Only Vertical Splits Trigger It |
| 30 | + |
| 31 | +The DCT processes rows first, then columns. A vertical split within an 8×8 block |
| 32 | +means each row sees `[0,0,0,0,255,255,255,255]` (or inverted), producing maximum |
| 33 | +AC energy and intermediate values of ±5056 in the row pass. The column pass then |
| 34 | +sums 8 identical row results. |
| 35 | + |
| 36 | +A horizontal split means each row is either all-black or all-white (DC-only), |
| 37 | +so the row pass produces zero AC energy and the column pass intermediates stay within 16-bit range. |
| 38 | + |
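The row-level difference between the two split directions can be sketched with a floating-point 1-D DCT-II. This is a rough model only: it uses an orthonormal textbook DCT rather than the integer ISLOW algorithm, it does not apply overshoot deringing, and `dct_1d` and `ac_energy` are hypothetical helper names.

```python
import math


def dct_1d(row):
    """Orthonormal 1-D DCT-II of a sequence of samples."""
    n = len(row)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(row)
        )
        out.append(scale * total)
    return out


def ac_energy(row):
    """Sum of squared AC coefficients of a level-shifted 8-bit row."""
    shifted = [v - 128 for v in row]
    return sum(c * c for c in dct_1d(shifted)[1:])


# Row of a vertically split block: half black, half white.
split_row = [0, 0, 0, 0, 255, 255, 255, 255]
# Row of a horizontally split block: uniform within the row.
uniform_row = [255] * 8

print(ac_energy(split_row))    # large: nearly all energy is in AC terms
print(ac_energy(uniform_row))  # effectively zero: DC-only row
```

In this model the vertically split row carries almost all of its energy in AC coefficients, while the uniform row has none, matching the behaviour described for the two pattern families.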
| 39 | +## Reference |
| 40 | + |
| 41 | +- https://github.com/mozilla/mozjpeg/pull/453 |
| 42 | +- Affects: libjpeg-turbo ISLOW FDCT on all SIMD architectures (SSE2, AVX2, NEON, etc.) |
| 43 | +- Quality range: Q1–Q57 (DC quantization value ≥ 14) |