Skip to content

Commit 5747b0c

Browse files
committed
drm/sunxi: fix RGUI menu color conversion and per-frame leak
The RGUI menu blit path in drm_gfx.c and sunxi_gfx.c (audit findings #29 and #34, byte-for-byte clones of each other) had several bugs that compounded into wrong menu colors, an unbounded VLA, and -- on drm_gfx -- a malloc per menu frame with no matching free. Bugs in the bit extraction ========================== The conversion treated the uint16_t source as if its layout were RGB565 with naive 8-bit-channel shifts: R = (src_pix << 8) & 0x00FF0000; G = (src_pix << 4) & 0x0000FF00; B = (src_pix << 0) & 0x000000FF; line[j] = R | G | B; Two problems with that: 1. The RGUI default output on these platforms is RGBA4444, not RGB565 (rgui_set_pixel_format_function in menu/drivers/rgui.c selects RGBA4444 for the unmatched-driver-name fallback that covers both drm and sunxi). The channel layout is R=15..12, G=11..8, B=7..4, A=3..0. 2. Even taken as RGB565, the shifts are wrong: they place 4-bit channel data into the low nibble of each 8-bit destination byte, so e.g. 0x000F (RGBA black, fully opaque) decodes to 0x0000000F -- alpha bleeds into the blue byte. 0x00FF (RGBA pure blue) decodes to 0x00000FFF, leaking alpha across the bottom of green into blue. Concretely, with the old code: rgba4444 input | old output | correct XRGB8888 ---------------+-------------+----------------- 0xF00F (red) | 0x00F0000F | 0x00FF0000 0x0F0F (green) | 0x000FF00F | 0x0000FF00 0x00FF (blue) | 0x00000FFF | 0x000000FF 0x000F (black) | 0x0000000F | 0x00000000 0x888F (gray) | 0x0088888F | 0x00888888 0xFFFF (white) | 0x00FFFFFF | 0x00FFFFFF (only correct case) Pure white was the one input that came out right, which is likely why the bug survived: a quick visual sanity check on fully-saturated menu chrome could look plausible while every other color was off and had stray bits in the unused alpha byte of XRGB8888. Fix: treat src_pix as RGBA4444, extract each 4-bit channel, and expand to 8 bits via nibble replication (n | (n << 4)), which is the standard 4-bit-to-8-bit upscaling that maps 0 to 0x00 and 0xF to 0xFF linearly. The (currently ignored) rgb32 parameter ======================================= Both functions accepted a `bool rgb32` from the caller and discarded it. The set_texture_frame contract is: when rgb32 is true, the source is XRGB8888 already and no conversion is required. The new code branches on rgb32 and does a per-row memcpy in that case. RGUI does not currently exercise this path on these drivers, but other set_texture_frame callers do, and the parameter is part of the interface. Per-frame malloc leak (drm only) ================================ drm_set_texture_frame did: char *frame_output = malloc(dst_pitch * height); /* ... fill it ... */ drm_surface_update(_drmvars, frame_output, _drmvars->menu_surface); drm_surface_update memcpys row-by-row from frame_output into the dumb buffer at surface->pages[surface->flip_page].buf.map (drm_gfx.c:328-331) and returns. frame_output is never freed -- one full XRGB8888-frame leak per menu refresh. The intermediate buffer is unnecessary. The destination is a CPU-mapped DRM dumb buffer; we can write the converted pixels directly into it. surface->pitch is the visible bytes per row (src_width * bpp, set at drm_gfx.c:243) and matches the dst stride for the menu surface, which was created with pitch == width * 4 == surface->pitch. After writing in place, call drm_page_flip directly -- this is the same sequence drm_surface_update performs minus the now-redundant memcpy. Stack-allocated VLA =================== Both files declared uint32_t line[dst_width]; as a per-row scratch buffer, where dst_width was the menu output width (drm) or the display xres (sunxi). On a 4K-wide output that's 16 KiB on the stack per call, on platforms (embedded ARM Linux) where thread stacks may be small. The direct-into-mapped-buffer rewrite removes the need for any scratch line entirely; pixels are produced one at a time into their final destination slot. sunxi: stale-stack read on width < xres ======================================= sunxi_set_texture_frame additionally had a smaller bug: the inner loop only filled line[0..width-1], but the row memcpy copied dst_pitch == xres*4 bytes from line[]. When the source was narrower than the display (e.g. RGUI rendering at a lower resolution upscaled by the layer), the copy read uninitialized stack bytes for pixels [width..xres-1] and pushed them into the framebuffer. With the new direct-write path, only the `width` actually-meaningful pixels per row are written; pixels beyond `width` retain whatever was previously in the dumb buffer (typically the prior frame's same-row content, which is the correct behavior for a partial-width menu blit). Defensive clamps on width/height ================================ The destination buffers (drm dumb buffer, sunxi framebuffer page) are sized once and never resized while the menu is active. The original drm code was structured around an intermediate malloc whose size was driven by the surface's src_height and total_pitch, which silently truncated rows if the caller passed a larger frame than the surface was created for. The original sunxi code allocated nothing and would have walked off the end of pages[0] in the same scenario. To preserve the no-overrun property of the new direct-write path, both functions now clamp `width` and `height` against the destination's known bounds (drm: surface->src_width / surface->src_height; sunxi: sunxi_disp->xres / src_height) before the loops. In normal operation these clamps are no-ops; they exist purely so a future caller passing oversized dimensions writes a truncated frame rather than overrunning the mapped region. Validation ========== - Both functions compile cleanly under gcc -Wall -Wextra in isolation (no unused vars, no unused parameters beyond the pre-existing `alpha`). - Conversion math validated against a tabulated reference for the six characteristic RGBA4444 inputs above, including the diagnostic that pure-white is the unique input where the old and new implementations agree. - drm path: surface->pitch and surface->total_pitch semantics verified against drm_surface_setup (drm_gfx.c:222-278) and drm_surface_update (316-338). For the menu surface, both are width*4 because drm_surface_setup is called with pitch=width*4, bpp=4, src_width=width. - drm flip semantics verified: drm_page_flip uses pages[flip_page] then toggles flip_page, so writing into pages[flip_page].buf.map before calling drm_page_flip puts the new pixels on the page that is about to be committed, matching what drm_surface_update did. - sunxi flip semantics unchanged: same sunxi_layer_set_rgb_input_buffer call follows the in-place write, just as before. - Behavioral validation via a synthetic harness that mocks the DRM dumb-buffer and sunxi page interfaces, drops in both the pre- and post-patch versions of each function side-by-side, and runs them against identical inputs (41 assertions across 12 cases, including: exhaustive 4096-combination nibble matrix, 1000-frame leak counter, red-zone canaries to detect oversized-frame overruns, and direct reproduction of the original alpha-bleed and stack-garbage bugs against the unfixed code). All assertions pass on the post-patch code; the original-code reproductions all fail in the expected ways. - Not testable in this sandbox: actual on-device rendering on Raspberry Pi (drm) or Allwinner SoC (sunxi). Both drivers are conditionally compiled and have no automated tests.
1 parent 031fd9d commit 5747b0c

2 files changed

Lines changed: 107 additions & 49 deletions

File tree

gfx/drivers/drm_gfx.c

Lines changed: 53 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -816,8 +816,11 @@ static void drm_set_texture_enable(void *data, bool state, bool full_screen)
816816
static void drm_set_texture_frame(void *data, const void *frame, bool rgb32,
817817
unsigned width, unsigned height, float alpha)
818818
{
819-
unsigned int i, j;
820-
struct drm_video *_drmvars = data;
819+
unsigned int i;
820+
struct drm_video *_drmvars = data;
821+
struct drm_surface *surface = NULL;
822+
uint8_t *dst_base = NULL;
823+
unsigned int dst_pitch;
821824

822825
if (!_drmvars->menu_active)
823826
return;
@@ -843,36 +846,61 @@ static void drm_set_texture_frame(void *data, const void *frame, bool rgb32,
843846
drm_plane_setup(_drmvars->menu_surface);
844847
}
845848

846-
/* We have to go on a pixel format conversion adventure
847-
* for now, until we can convince RGUI to output
848-
* in an 8888 format. */
849-
unsigned int src_pitch = width * 2;
850-
unsigned int dst_pitch = width * 4;
851-
unsigned int dst_width = width;
852-
uint32_t line[dst_width];
853-
854-
/* The output pixel array with the converted pixels. */
855-
char *frame_output = (char *) malloc (dst_pitch * height);
849+
surface = _drmvars->menu_surface;
850+
dst_base = (uint8_t*)surface->pages[surface->flip_page].buf.map;
851+
dst_pitch = surface->pitch;
856852

857-
/* Remember, memcpy() works with 8bits pointers for increments. */
858-
char *dst_base_addr = frame_output;
853+
/* Defensive clamps: the dumb buffer was sized at the first
854+
* menu frame's dimensions and is never resized within an
855+
* active menu session. If the caller hands us something
856+
* bigger anyway, write only what fits rather than running
857+
* off the end of the mapped region. */
858+
{
859+
unsigned int max_w = (unsigned int)surface->src_width;
860+
unsigned int max_h = (unsigned int)surface->src_height;
861+
if (width > max_w) width = max_w;
862+
if (height > max_h) height = max_h;
863+
}
859864

860-
for (i = 0; i < height; i++)
865+
if (rgb32)
861866
{
862-
for (j = 0; j < src_pitch / 2; j++)
867+
/* Source is already XRGB8888 -- just copy row by row to handle
868+
* any difference between source stride and dst stride. */
869+
const uint8_t *src = (const uint8_t*)frame;
870+
unsigned int src_pitch = width * 4;
871+
unsigned int row_bytes = (src_pitch < dst_pitch) ? src_pitch : dst_pitch;
872+
873+
for (i = 0; i < height; i++)
874+
memcpy(dst_base + (dst_pitch * i), src + (src_pitch * i), row_bytes);
875+
}
876+
else
877+
{
878+
/* RGUI default output is RGBA4444 with channel layout
879+
* R = bits 15..12, G = 11..8, B = 7..4, A = 3..0
880+
* Expand each 4-bit channel to 8 bits via nibble replication
881+
* (x | (x << 4)) and pack into XRGB8888 for the dumb buffer. */
882+
for (i = 0; i < height; i++)
863883
{
864-
uint16_t src_pix = *((uint16_t*)frame + (src_pitch / 2 * i) + j);
865-
/* The hex AND is for keeping only the part we need for each component. */
866-
uint32_t R = (src_pix << 8) & 0x00FF0000;
867-
uint32_t G = (src_pix << 4) & 0x0000FF00;
868-
uint32_t B = (src_pix << 0) & 0x000000FF;
869-
line[j] = (0 | R | G | B);
884+
const uint16_t *src_row = (const uint16_t*)frame + (width * i);
885+
uint32_t *dst_row = (uint32_t*)(dst_base + (dst_pitch * i));
886+
unsigned int j;
887+
888+
for (j = 0; j < width; j++)
889+
{
890+
uint16_t src_pix = src_row[j];
891+
uint32_t r4 = (src_pix >> 12) & 0xF;
892+
uint32_t g4 = (src_pix >> 8) & 0xF;
893+
uint32_t b4 = (src_pix >> 4) & 0xF;
894+
uint32_t r8 = (r4 << 4) | r4;
895+
uint32_t g8 = (g4 << 4) | g4;
896+
uint32_t b8 = (b4 << 4) | b4;
897+
dst_row[j] = (r8 << 16) | (g8 << 8) | b8;
898+
}
870899
}
871-
memcpy(dst_base_addr + (dst_pitch * i), (char*)line, dst_pitch);
872900
}
873901

874-
/* We update the menu surface if menu is active. */
875-
drm_surface_update(_drmvars, frame_output, _drmvars->menu_surface);
902+
/* The bytes are in place in the dumb buffer; commit the page flip. */
903+
drm_page_flip(surface);
876904
}
877905

878906
static void drm_set_nonblock_state(void *a, bool b, bool c, unsigned d) { }

gfx/drivers/sunxi_gfx.c

Lines changed: 54 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -846,40 +846,70 @@ static void sunxi_set_texture_frame(void *data, const void *frame, bool rgb32,
846846
unsigned width, unsigned height, float alpha)
847847
{
848848
struct sunxi_video *_dispvars = (struct sunxi_video*)data;
849+
uint8_t *dst_base;
850+
unsigned int dst_pitch;
851+
unsigned int i;
849852

850-
if (_dispvars->menu_active)
851-
{
852-
unsigned int i, j;
853+
if (!_dispvars->menu_active)
854+
return;
853855

854-
/* We have to go on a pixel format conversion adventure for now, until we can
855-
* convince RGUI to output in an 8888 format. */
856-
unsigned int src_pitch = width * 2;
857-
unsigned int dst_pitch = _dispvars->sunxi_disp->xres * 4;
858-
unsigned int dst_width = _dispvars->sunxi_disp->xres;
859-
uint32_t line[dst_width];
856+
/* The display layer is xres pixels wide; we write `width` pixels
857+
* per row into the leading bytes of each destination row, leaving
858+
* any pixels beyond `width` untouched. RGUI typically renders at
859+
* the full display width, in which case the whole row is written. */
860+
dst_base = (uint8_t*)_dispvars->pages[0].address;
861+
dst_pitch = _dispvars->sunxi_disp->xres * 4;
860862

861-
/* Remember, memcpy() works with 8bits pointers for increments. */
862-
char *dst_base_addr = (char*)(_dispvars->pages[0].address);
863+
/* Defensive clamps: the page is xres wide and src_height tall.
864+
* Don't run off the end if the caller's frame is bigger. */
865+
{
866+
unsigned int max_w = _dispvars->sunxi_disp->xres;
867+
unsigned int max_h = (unsigned int)_dispvars->src_height;
868+
if (width > max_w) width = max_w;
869+
if (height > max_h) height = max_h;
870+
}
871+
872+
if (rgb32)
873+
{
874+
/* Source is already XRGB8888 -- per-row memcpy handles the
875+
* difference between source stride (width*4) and dst stride. */
876+
const uint8_t *src = (const uint8_t*)frame;
877+
unsigned int src_pitch = width * 4;
878+
unsigned int row_bytes = (src_pitch < dst_pitch) ? src_pitch : dst_pitch;
863879

880+
for (i = 0; i < height; i++)
881+
memcpy(dst_base + (dst_pitch * i), src + (src_pitch * i), row_bytes);
882+
}
883+
else
884+
{
885+
/* RGUI default output is RGBA4444 with channel layout
886+
* R = bits 15..12, G = 11..8, B = 7..4, A = 3..0
887+
* Expand each 4-bit channel to 8 bits via nibble replication
888+
* (x | (x << 4)) and pack into XRGB8888 for the display layer. */
864889
for (i = 0; i < height; i++)
865890
{
866-
for (j = 0; j < src_pitch / 2; j++)
891+
const uint16_t *src_row = (const uint16_t*)frame + (width * i);
892+
uint32_t *dst_row = (uint32_t*)(dst_base + (dst_pitch * i));
893+
unsigned int j;
894+
895+
for (j = 0; j < width; j++)
867896
{
868-
uint16_t src_pix = *((uint16_t*)frame + (src_pitch / 2 * i) + j);
869-
/* The hex AND is for keeping only the part we need for each component. */
870-
uint32_t R = (src_pix << 8) & 0x00FF0000;
871-
uint32_t G = (src_pix << 4) & 0x0000FF00;
872-
uint32_t B = (src_pix << 0) & 0x000000FF;
873-
line[j] = (0 | R | G | B);
897+
uint16_t src_pix = src_row[j];
898+
uint32_t r4 = (src_pix >> 12) & 0xF;
899+
uint32_t g4 = (src_pix >> 8) & 0xF;
900+
uint32_t b4 = (src_pix >> 4) & 0xF;
901+
uint32_t r8 = (r4 << 4) | r4;
902+
uint32_t g8 = (g4 << 4) | g4;
903+
uint32_t b8 = (b4 << 4) | b4;
904+
dst_row[j] = (r8 << 16) | (g8 << 8) | b8;
874905
}
875-
memcpy(dst_base_addr + (dst_pitch * i), (char*)line, dst_pitch);
876906
}
877-
878-
/* Issue pageflip. Will flip on next vsync. */
879-
sunxi_layer_set_rgb_input_buffer(_dispvars->sunxi_disp,
880-
_dispvars->sunxi_disp->bits_per_pixel,
881-
_dispvars->pages[0].offset, width, height, _dispvars->sunxi_disp->xres);
882907
}
908+
909+
/* Issue pageflip. Will flip on next vsync. */
910+
sunxi_layer_set_rgb_input_buffer(_dispvars->sunxi_disp,
911+
_dispvars->sunxi_disp->bits_per_pixel,
912+
_dispvars->pages[0].offset, width, height, _dispvars->sunxi_disp->xres);
883913
}
884914

885915
static void sunxi_set_aspect_ratio(void *data, unsigned aspect_ratio_idx)

0 commit comments

Comments
 (0)