Commit dd95dd3
Optimize literal writing by unrolling loops
Unrolled the literal-writing loops in `compress_greedy_block` and `write_dynamic_block_with_sequences` to process 4 literals per iteration instead of 2. This reduces loop overhead and slightly improves throughput on incompressible data, where nearly every symbol is emitted as a literal.
- Modified `src/compress/mod.rs` to add `while lit_remain >= 4` loops.
- Uses `write_literals_2` twice within the unrolled loop.
- Relies on existing buffer space checks (which cover the worst case expansion).
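The shape of the change described above can be sketched as follows. This is a minimal illustration, not the code from `src/compress/mod.rs`: `BitSink`, `write_literal_run`, the `(code, length)` table layout, and the signature given to `write_literals_2` here are all assumptions made for the example. Only the `while lit_remain >= 4` loop calling `write_literals_2` twice is taken from the commit message.

```rust
/// Hypothetical stand-in for the crate's bit writer (an assumption, not the real type).
struct BitSink {
    out: Vec<u8>,
    acc: u64,   // pending bits, least significant bit first
    nbits: u32, // number of valid bits in `acc` (always < 8 between calls)
}

impl BitSink {
    fn new() -> Self {
        BitSink { out: Vec::new(), acc: 0, nbits: 0 }
    }

    /// Append the low `len` bits of `code`, flushing whole bytes to `out`.
    fn write_bits(&mut self, code: u32, len: u32) {
        self.acc |= (code as u64) << self.nbits;
        self.nbits += len;
        while self.nbits >= 8 {
            self.out.push(self.acc as u8);
            self.acc >>= 8;
            self.nbits -= 8;
        }
    }

    /// Assumed signature: emit two literals with a single bit-buffer update.
    /// Code lengths are assumed to be at most 15 bits, so the pair fits in a u32.
    fn write_literals_2(&mut self, codes: &[(u32, u32); 256], a: u8, b: u8) {
        let (ca, la) = codes[a as usize];
        let (cb, lb) = codes[b as usize];
        self.write_bits(ca | (cb << la), la + lb);
    }

    /// Flush any remaining partial byte and return the output.
    fn finish(mut self) -> Vec<u8> {
        if self.nbits > 0 {
            self.out.push(self.acc as u8);
        }
        self.out
    }
}

/// Hypothetical helper: write a run of literals, 4 per iteration where possible.
fn write_literal_run(sink: &mut BitSink, codes: &[(u32, u32); 256], lits: &[u8]) {
    let mut p = 0;
    let mut lit_remain = lits.len();

    // Unrolled loop: two paired writes per iteration.
    while lit_remain >= 4 {
        sink.write_literals_2(codes, lits[p], lits[p + 1]);
        sink.write_literals_2(codes, lits[p + 2], lits[p + 3]);
        p += 4;
        lit_remain -= 4;
    }

    // Pair loop handles what the unrolled loop leaves behind (at most one pair).
    while lit_remain >= 2 {
        sink.write_literals_2(codes, lits[p], lits[p + 1]);
        p += 2;
        lit_remain -= 2;
    }

    // At most one trailing literal.
    if lit_remain == 1 {
        let (c, l) = codes[lits[p] as usize];
        sink.write_bits(c, l);
    }
}
```

The unrolled loop produces the same bit stream as writing literals one at a time; it only reduces the number of loop-condition checks per literal. In this sketch the output is a growable `Vec<u8>`, so no capacity check is needed. The real code writes into a pre-sized buffer, which is why the commit depends on the existing worst-case space checks.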
Performance:
- Throughput for "Compress Parallel Incompressible" improved by ~0.6% (234.2 MiB/s -> 235.7 MiB/s).
- Verified correctness with `cargo test`.
Co-authored-by: 404Setup <[email protected]>
1 parent bbe8091 commit dd95dd3
1 file changed
Lines changed: 20 additions & 0 deletions
`src/compress/mod.rs` (diff content not captured; only line positions survive)
- New lines 1148-1159: 12 lines added after original line 1147.
- New lines 1370-1377: 8 lines added after original line 1357.