Skip to content

Commit e68d4ab

Browse files
committed
libretro-common/formats: harden rtga + rbmp against hostile image headers
Audit of the TGA (403 lines) and BMP (577 lines) image decoders. Both are stb_image-derived and both accept attacker-controlled bytes from files on disk, over HTTP (thumbnails, cores, assets bundles), or from user uploads. Nine fixes this round; one is a directly reachable stack buffer overflow. Nothing touches rjpeg.c (4808 lines) or rpng.c (1859 lines); those are their own rounds. === rtga.c === rtga: reject bogus palette_bits at the header (CRITICAL) A single attacker-controlled byte in the TGA header (tga_palette_bits, byte 7) drove a stack buffer overflow. The indexed-pixel read loop did: for (j = 0; j * 8 < tga_palette_bits; ++j) raw_data[j] = tga_palette[pal_idx + j]; where raw_data is declared as "unsigned char raw_data[4]" on the stack. With tga_palette_bits = 255 the loop writes raw_data[0..31] -- up to 28 bytes of stack corruption per pixel, driven directly by the header. Reachable on any ingest path that accepts TGA (core-provided asset bundles, RetroAchievement badges, user thumbnails, network mirrors). Fix: validate tga_palette_bits at the header-check point. TGA palettes legitimately hold only 15/16/24/32-bit entries; anything else is rejected. Also reject tga_palette_len < 1 (empty palette) which pre-patch malloc(0)'d and then read tga_palette[0] OOB. rtga: rtga_skip() clamp rtga_skip advanced img_buffer by n without checking against img_buffer_end. Pointer arithmetic past the end of the input array is undefined behaviour in C even without a deref. Clamp to the remaining input; subsequent rtga_get8 calls check "buffer < buffer_end" and parse as EOF. rtga: size_t output allocation + ceiling malloc((size_t)tga_width * tga_height * sizeof(uint32_t)) was OK on 64-bit but could wrap on 32-bit for the full 65535 x 65535 header-expressible range. Also, even when the arithmetic doesn't wrap, a legitimate-looking 65535 x 65535 ARGB decode would request ~17 GiB of memory from a single malicious 18-byte header. Fix: reject dimensions above 0x4000 x 0x4000 (16384 pixels each axis = 1 GiB decoded RGBA, larger than any realistic libretro asset). Do the malloc math in size_t. rtga: reject size > INT_MAX in rtga_process_image rtga_process_image cast "size" (size_t) to int before passing it to rtga_load_from_memory. A file >= 2 GiB truncated, propagated a negative len into img_buffer_end = buffer + len, and gave pointer arithmetic outside the source object. Reject early. rtga: size_t indexing in output writes output[dst_row * tga_width + cur_col] used signed int for the index. 65534 * 65535 overflows int32, UB. Use size_t. === rbmp.c === rbmp: reject img_y = 0x80000000 (abs(INT_MIN) UB) Pre-patch the code did flip_vertically = ((int) s->img_y) > 0; s->img_y = abs((int) s->img_y); When img_y == 0x80000000u, the cast to int is INT_MIN and abs(INT_MIN) is explicitly UB in C99. On GCC with 2's- complement it returns INT_MIN and the uint32_t reassignment leaves img_y == 0x80000000 -- a "negative" height that then drives the output allocation to a very large value. Fix: detect this specific value before the abs() call and bail cleanly. rbmp: size_t output allocation + ceiling malloc(s->img_x * s->img_y * sizeof(uint32_t)) did the multiplication in uint32_t (both operands are uint32_t), so e.g. 0x10001 x 0x10000 silently wrapped to 0x10000, producing a 256 KiB buffer for 16 GiB of claimed image data. The subsequent per-row pixel writes ran off the end for gigabytes -- heap corruption driven by a 14-byte BMP header. Fix: same ceiling as rtga (0x4000 x 0x4000) and size_t math for the malloc. rbmp: rbmp_skip() clamp Same as rtga_skip. rbmp: size_t indexing in output writes output + dst_row * s->img_x used signed int * uint32 (promoted to uint32), which wraps. Use size_t. === Regression tests (two new) === libretro-common/samples/formats/tga/rtga_test.c 1. test_malicious_palette_bits (palette_bits=255) -- header check must reject. 2. test_empty_indexed_palette -- TRUE DISCRIMINATOR under ASan. Pre-patch the unpatched decoder fires: AddressSanitizer: heap-buffer-overflow READ of size 1 at rtga.c:286 in rtga_tga_load (1-byte region allocated at rtga.c:234) Post-patch the decoder returns IMAGE_PROCESS_ERROR at the header check. 3. test_happy_path_rgb -- 2x2 uncompressed BGR smoke. 4. test_oversized_size -- size > INT_MAX rejection. libretro-common/samples/formats/bmp/rbmp_test.c 1. test_img_y_int_min -- 0x80000000 height, post-patch rejection. Pre-patch would invoke UB via abs(INT_MIN) and the downstream allocation decisions are unpredictable. 2. test_dimension_overflow -- 0x10001 x 0x10000, post- patch rejection. The pre-patch 32-bit wrap is not reproducible on a 64-bit CI host but the rejection IS reproducible everywhere and prevents the 17 GiB allocation attempt that would otherwise OOM the test. 3. test_happy_1x1_24bpp -- happy-path smoke. Builds clean under -Wall -Werror -std=gnu99. ASan -O0 clean on patched; the header-check rejections fire before any memory is touched, so ASan never sees anything. CI: both new tests added to RUN_TARGETS. Full samples workflow dry-run: Built: 17 Ran: 17 Failed: 0 VC6 compatibility: both files are compiled on all platforms. The fix uses only size_t / uint32_t / int and the existing stdint shim; <limits.h> (C89) is explicitly included where needed. No new dependencies. Reachability summary: * Stack BOF via TGA palette_bits: HIGH severity, single header byte, any TGA ingest path. * Heap BOF via BMP dimension wrap: MEDIUM severity on 32-bit, reaches malloc-failure on 64-bit. * Other fixes: UB hardening + defense-in-depth.
1 parent 1c04787 commit e68d4ab

7 files changed

Lines changed: 583 additions & 15 deletions

File tree

.github/workflows/Linux-libretro-common-samples.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,8 @@ jobs:
5656
cdrom_cuesheet_overflow_test
5757
http_parse_test
5858
rjson_test
59+
rtga_test
60+
rbmp_test
5961
)
6062
6163
# Per-binary run command (overrides ./<binary> if present).

libretro-common/formats/bmp/rbmp.c

Lines changed: 43 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include <stddef.h> /* ptrdiff_t on osx */
2929
#include <stdlib.h>
3030
#include <string.h>
31+
#include <limits.h> /* INT_MAX */
3132

3233
#include <retro_inline.h>
3334

@@ -66,13 +67,20 @@ static INLINE unsigned char rbmp_get8(rbmp_context *s)
6667

6768
static void rbmp_skip(rbmp_context *s, int n)
6869
{
70+
ptrdiff_t remaining;
6971
if (n < 0)
7072
{
7173
s->img_buffer = s->img_buffer_end;
7274
return;
7375
}
74-
75-
s->img_buffer += n;
76+
/* Clamp to remaining input to avoid pointer arithmetic beyond
77+
* img_buffer_end (UB per C99). Subsequent rbmp_get8 calls
78+
* check "buffer < buffer_end" and parse as EOF. */
79+
remaining = s->img_buffer_end - s->img_buffer;
80+
if ((ptrdiff_t)n > remaining)
81+
s->img_buffer = s->img_buffer_end;
82+
else
83+
s->img_buffer += n;
7684
}
7785

7886
static int rbmp_get16le(rbmp_context *s)
@@ -193,8 +201,16 @@ static uint32_t *rbmp_bmp_load(rbmp_context *s, unsigned *x, unsigned *y,
193201
if (bpp == 1)
194202
return 0;
195203

204+
/* img_y can legitimately be a negative int (top-down BMP).
205+
* Pre-patch the code did abs((int)s->img_y) -- if the uint32
206+
* happened to be 0x80000000, the cast to int is INT_MIN and
207+
* abs(INT_MIN) is undefined behaviour. Detect that case and
208+
* treat as bottom-up zero-height (rejected by the overflow
209+
* guard below). */
210+
if (s->img_y == 0x80000000u)
211+
return 0;
196212
flip_vertically = ((int) s->img_y) > 0;
197-
s->img_y = abs((int) s->img_y);
213+
s->img_y = (uint32_t)abs((int) s->img_y);
198214

199215
if (hsz == 12)
200216
{
@@ -298,8 +314,25 @@ static uint32_t *rbmp_bmp_load(rbmp_context *s, unsigned *x, unsigned *y,
298314
}
299315
s->img_n = ma ? 4 : 3;
300316

301-
/* Always output as uint32 (4 bytes per pixel) */
302-
output = (uint32_t*)malloc(s->img_x * s->img_y * sizeof(uint32_t));
317+
/* Always output as uint32 (4 bytes per pixel).
318+
*
319+
* Pre-patch this multiplied two uint32_t dimensions at uint32_t
320+
* width, so img_x * img_y silently wrapped on 32-bit -- e.g.
321+
* 0x10001 * 0x10000 = 0x100010000 wraps to 0x10000, and the
322+
* subsequent malloc returned a 256 KiB buffer that the pixel
323+
* decode loop wrote off the end of (4 GiB+ of writes). Do
324+
* the multiplication in size_t with an explicit ceiling:
325+
* 0x4000 x 0x4000 = 256 M pixels = 1 GiB of decoded RGBA is
326+
* far beyond any realistic libretro asset, and rejecting at
327+
* that ceiling blocks the overflow case AND avoids giving a
328+
* malicious BMP a direct route to a multi-GiB allocation
329+
* attempt. */
330+
if (s->img_x == 0 || s->img_y == 0)
331+
return 0;
332+
if (s->img_x > 0x4000u || s->img_y > 0x4000u)
333+
return 0;
334+
output = (uint32_t*)malloc(
335+
(size_t)s->img_x * (size_t)s->img_y * sizeof(uint32_t));
303336
if (!output)
304337
return 0;
305338

@@ -342,7 +375,11 @@ static uint32_t *rbmp_bmp_load(rbmp_context *s, unsigned *x, unsigned *y,
342375
for (j = 0; j < (int)s->img_y; ++j)
343376
{
344377
int dst_row = flip_vertically ? (int)(s->img_y - 1 - j) : j;
345-
uint32_t *dst = output + dst_row * s->img_x;
378+
/* Use size_t for the row-stride offset. dst_row *
379+
* s->img_x in signed-int could overflow for a legitimate
380+
* 2 GiB BMP (e.g. 46341 x 46341). Post-patch the
381+
* pointer math matches the size_t-based allocation. */
382+
uint32_t *dst = output + (size_t)dst_row * (size_t)s->img_x;
346383
int col = 0;
347384

348385
for (i = 0; i < (int)s->img_x; i += 2)

libretro-common/formats/tga/rtga.c

Lines changed: 68 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include <stddef.h> /* ptrdiff_t on osx */
2929
#include <stdlib.h>
3030
#include <string.h>
31+
#include <limits.h> /* INT_MAX, SIZE_MAX via stdint */
3132

3233
#include <retro_inline.h>
3334

@@ -60,12 +61,24 @@ static INLINE uint8_t rtga_get8(rtga_context *s)
6061

6162
static void rtga_skip(rtga_context *s, int n)
6263
{
64+
ptrdiff_t remaining;
6365
if (n < 0)
6466
{
6567
s->img_buffer = s->img_buffer_end;
6668
return;
6769
}
68-
s->img_buffer += n;
70+
/* Clamp the advance to the remaining input. Pre-patch a large
71+
* attacker-supplied offset (TGA header byte 1, or palette
72+
* start) pushed img_buffer past img_buffer_end, which is
73+
* pointer arithmetic outside the allocated object (UB per C99).
74+
* All callers of rtga_get8 check "buffer < buffer_end" so the
75+
* clamped state parses as EOF and the subsequent header checks
76+
* or indexed-palette code fail cleanly. */
77+
remaining = s->img_buffer_end - s->img_buffer;
78+
if ((ptrdiff_t)n > remaining)
79+
s->img_buffer = s->img_buffer_end;
80+
else
81+
s->img_buffer += n;
6982
}
7083

7184
static int rtga_get16le(rtga_context *s)
@@ -121,17 +134,48 @@ static uint32_t *rtga_tga_load(rtga_context *s,
121134
)
122135
return NULL;
123136

124-
/* If paletted, then we will use the number of bits from the palette */
137+
/* If paletted, then we will use the number of bits from the palette.
138+
*
139+
* tga_palette_bits is attacker-controlled (TGA byte 7, 0..255).
140+
* Pre-patch the indexed-read loop below did
141+
* for (j = 0; j * 8 < tga_palette_bits; ++j)
142+
* raw_data[j] = tga_palette[pal_idx + j];
143+
* raw_data is a 4-byte stack array, so tga_palette_bits > 32
144+
* wrote past the end of raw_data -- a stack buffer overflow of
145+
* up to 28 bytes, directly driven by the TGA header. Reject
146+
* bogus palette_bits / empty palettes here and everything
147+
* downstream runs with bounded buffers. */
125148
if (tga_indexed)
149+
{
150+
if ( tga_palette_len < 1
151+
|| ( tga_palette_bits != 15
152+
&& tga_palette_bits != 16
153+
&& tga_palette_bits != 24
154+
&& tga_palette_bits != 32))
155+
return NULL;
126156
tga_comp = tga_palette_bits / 8;
157+
}
127158

128159
/* TGA info */
129160
*x = tga_width;
130161
*y = tga_height;
131162
if (comp)
132163
*comp = tga_comp;
133164

134-
output = (uint32_t*)malloc((size_t)tga_width * tga_height * sizeof(uint32_t));
165+
/* Bound the output allocation. TGA dimensions are attacker-
166+
* controlled 16-bit values (max 65535), so their product is
167+
* up to ~4.29 G pixels. Pre-patch a 32-bit build could wrap
168+
* (size_t)w * h * sizeof(uint32_t) to a small positive size_t
169+
* and the per-pixel decode ran off the undersized malloc.
170+
* Reject dimensions beyond a sane ceiling so the allocation
171+
* never grows anywhere near wrap territory and hostile
172+
* headers cannot drive the client into a multi-GiB malloc
173+
* attempt. 0x4000 x 0x4000 = 1 GiB of decoded RGBA,
174+
* comfortably larger than any real-world libretro asset. */
175+
if (tga_width > 0x4000 || tga_height > 0x4000)
176+
return NULL;
177+
output = (uint32_t*)malloc(
178+
(size_t)tga_width * (size_t)tga_height * sizeof(uint32_t));
135179
if (!output)
136180
return NULL;
137181

@@ -226,19 +270,23 @@ static uint32_t *rtga_tga_load(rtga_context *s,
226270
int cur_col = 0;
227271
int cur_row = 0;
228272

229-
/* Load palette if indexed */
273+
/* Load palette if indexed. Header-level checks above have
274+
* ensured tga_palette_len >= 1 and tga_palette_bits in
275+
* {15,16,24,32}, so n is positive and bounded at
276+
* 65535 * 32 / 8 = 262140 bytes -- fits comfortably in int
277+
* and in the input length we've already accepted. */
230278
if (tga_indexed)
231279
{
232280
int n;
233281
rtga_skip(s, tga_palette_start);
234-
tga_palette = (unsigned char*)malloc(tga_palette_len * tga_palette_bits / 8);
282+
n = tga_palette_len * tga_palette_bits / 8;
283+
tga_palette = (unsigned char*)malloc((size_t)n);
235284
if (!tga_palette)
236285
{
237286
free(output);
238287
return NULL;
239288
}
240-
n = tga_palette_len * tga_palette_bits / 8;
241-
if (s->img_buffer + n <= s->img_buffer_end)
289+
if (s->img_buffer_end - s->img_buffer >= (ptrdiff_t)n)
242290
{
243291
memcpy(tga_palette, s->img_buffer, n);
244292
s->img_buffer += n;
@@ -325,9 +373,11 @@ static uint32_t *rtga_tga_load(rtga_context *s,
325373
| ((uint32_t)g << 8) | (uint32_t)b;
326374

327375
/* Write to correct position using tracked row/col
328-
* (avoids per-pixel division and modulo) */
376+
* (avoids per-pixel division and modulo). Use size_t for
377+
* the index so dst_row * tga_width does not overflow
378+
* signed int for a legitimate 65535 x 65535 image. */
329379
dst_row = tga_inverted ? (tga_height - 1 - cur_row) : cur_row;
330-
output[dst_row * tga_width + cur_col] = pixel;
380+
output[(size_t)dst_row * (size_t)tga_width + (size_t)cur_col] = pixel;
331381

332382
if (++cur_col >= tga_width)
333383
{
@@ -366,6 +416,15 @@ int rtga_process_image(rtga_t *rtga, void **buf_data,
366416
if (!rtga)
367417
return IMAGE_PROCESS_ERROR;
368418

419+
/* Reject sizes that don't fit in int before casting. A TGA
420+
* file larger than INT_MAX handed to an int-taking API would
421+
* truncate, the truncated value (potentially negative)
422+
* propagated to img_buffer_end = buffer + len, producing
423+
* pointer arithmetic outside the source object (UB). A 2 GiB
424+
* TGA is unreasonable; reject early. */
425+
if (size > (size_t)INT_MAX)
426+
return IMAGE_PROCESS_ERROR;
427+
369428
rtga->output_image = rtga_load_from_memory(rtga->buff_data,
370429
(int)size, width, height, &comp, supports_rgba);
371430
*buf_data = rtga->output_image;
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
TARGET := rbmp_test
2+
3+
LIBRETRO_COMM_DIR := ../../..
4+
5+
SOURCES := \
6+
rbmp_test.c \
7+
$(LIBRETRO_COMM_DIR)/formats/bmp/rbmp.c
8+
9+
OBJS := $(SOURCES:.c=.o)
10+
11+
CFLAGS += -Wall -pedantic -std=gnu99 -g -I$(LIBRETRO_COMM_DIR)/include
12+
13+
all: $(TARGET)
14+
15+
%.o: %.c
16+
$(CC) -c -o $@ $< $(CFLAGS)
17+
18+
$(TARGET): $(OBJS)
19+
$(CC) -o $@ $^ $(LDFLAGS)
20+
21+
clean:
22+
rm -f $(TARGET) $(OBJS)
23+
24+
.PHONY: clean

0 commit comments

Comments
 (0)