Skip to content

Commit da5e650

Browse files
committed
gfx/vulkan: fix three heap-corruption bugs in driver and common layer + CI
Three independent issues found during a focused audit of gfx/drivers/ vulkan.c and gfx/common/vulkan_common.c, all in the same memory-safety class (writes past compile-time-sized arrays or undersized allocations). Submitting together because they share the same audit context and none stands alone particularly well. Adds AddressSanitizer regression tests for each under samples/gfx/, wired into a new .github/workflows/Linux-samples-gfx.yml CI job. * vulkan_find_device_extensions: heap overflow (gfx/common/vulkan_common.c) The required-extension list is currently written twice -- once via memcpy at the top of the body, then again in a per-element loop that appears to be left over from an earlier refactor. The instance-side counterpart vulkan_find_instance_extensions only logs in its loop, confirming the divergence. The redundant write consumes more slots in the caller's `enabled` buffer than the slot accounting expects. Inside vulkan_context_create_device_wrapper the buffer is sized for (info.enabledExtensionCount + ARRAY_SIZE(required) + ARRAY_SIZE(optional)) entries, but the function actually writes (enabledExtensionCount + 2*required + optional) entries worst-case. With current values (1 required, 3 optional) that is a one-element heap overflow at the end of the malloc'd block whenever a libretro core uses the Vulkan HW context-negotiation interface and the GPU supports at least one optional extension. The other call site in vulkan_context_init_device uses a stack buffer of size 8 and stays inside it today, but only by a margin of three. The duplicate writes also produce duplicate entries in ppEnabledExtensionNames passed to vkCreateDevice, which the spec forbids (VUID-VkDeviceCreateInfo-ppEnabledExtensionNames-01840); strict validation layers will have been complaining. Drop the per-element loop -- the early vulkan_find_extensions() check at the top of the function already validates that all required extensions are present, so the per-element re-check has never done any work. Keep the debug log. * vulkan_create_swapchain: OOB writes from unclamped image count (gfx/common/vulkan_common.c) vulkan_create_swapchain calls vkGetSwapchainImagesKHR twice -- first to query the count, then to fill swapchain_images[] -- and never clamps the count between the two calls. swapchain_images, swapchain_fences, the four swapchain_*_semaphores arrays, readback.staging[] and vk->swapchain[] are all sized at compile time to VULKAN_MAX_SWAPCHAIN_IMAGES (8). If a driver returns more images than that the second vkGetSwapchainImagesKHR writes past swapchain_images[], and every loop bounded by num_swapchain_images (~12 sites across init/deinit/textures/ buffers/descriptor pools/command buffers/readback and direct vk->swapchain[i] accesses) walks past its array. Two clamps. First, cap desired_swapchain_images to VULKAN_MAX_SWAPCHAIN_IMAGES before vkCreateSwapchainKHR so we never ask a well-behaved driver for more images than we can hold. Second, clamp num_swapchain_images between the two vkGetSwapchainImagesKHR calls to handle drivers that return more images than were requested -- the spec permits this. * vulkan_create_texture: 32-bit overflow in staging-buffer sizing (gfx/drivers/vulkan.c) buffer_width = width * bpp and buffer_info.size = buffer_width * height are both unsigned * unsigned in 32-bit before being widened to VkDeviceSize on assignment. With dimensions large enough to wrap (e.g. width=65536, height=16385, bpp=4 -> 0x1_0004_0000 wraps to 0x40000), the staging buffer is allocated at the wrapped (small) size while the per-row upload memcpy loop later in the same function walks the unmodified width x height. The mapped region is smaller than what the loop writes, so the memcpy walks past the mapping into adjacent heap memory. Reachable from libretro cores supplying oversized retro_framebuffer dimensions and from vulkan_load_texture / vulkan_set_texture_frame, which take dimensions originating in image decoders. Compute the staging-buffer size in 64-bit, and widen the upload loop's stride and per-row copy size to size_t. VkDeviceSize is already 64-bit so the assignment to buffer_info.size is now correct end-to-end. The loop's pointer math was already implicitly widened on 64-bit hosts but is now unconditionally correct on 32-bit hosts (3DS, Vita, PSP, Wii, Wii U, older Android, 32-bit Windows) as well. * samples/gfx/, .github/workflows/Linux-samples-gfx.yml Three regression tests, one per fix above, following the pattern established by the security regression tests under samples/tasks/ (archive_name_safety_test, http_method_match_test, video_shader_wildcard_test, input_remap_bounds_test, bsv_replay_bounds_test, bps_patch_bounds_test). Each test keeps a verbatim copy of the relevant post-fix predicate, demarcated by `=== verbatim copy ===` markers; an `IMPORTANT: ... must follow` comment documents the convention. Plain C, asserts manually, exits nonzero on failure -- matches the harness the existing Linux-samples-{tasks,libretro-{common,db}}-samples.yml workflows understand. vulkan_extension_count_test (5 cases): create_device_wrapper- shaped caller, init_device-shaped caller, GPU supporting no optional extensions, GPU missing a required extension, and the prepushed-via-negotiation-interface case. Verified pre/post- patch discrimination: under the pre-fix double-write shape the create_device_wrapper case ASan-aborts with heap-buffer-overflow. vulkan_swapchain_clamp_test (5 cases): well-behaved 3-image driver, at-capacity boundary, 9-image (MAX+1) driver, pathological 64-image driver, and the request-side cap across six input values. Verified pre/post-patch discrimination: under the pre-fix no-clamp shape the 9-image case ASan-aborts on the second vkGetSwapchainImagesKHR fill. vulkan_texture_size_test (6 cases): typical 320x240, 4K RGBA, the smoking-gun 65536x16385x4 wrap case, an assorted-shapes table, the per-row upload-loop strides under an ASan-instrumented buffer, and a 64-bit-result sanity check. Verified pre/post- patch discrimination: under the pre-fix 32-bit arithmetic the overflow cases produce arithmetic mismatches against the 64-bit reference. Linux-samples-gfx.yml is a near-verbatim copy of Linux-samples-tasks.yml, runs each test under -fsanitize=address with a per-step explanation of what the test is regression-testing and the `verbatim copy must follow` reminder. ~3 seconds per test on Ubuntu 24 with system gcc.
1 parent 2d7b7ed commit da5e650

9 files changed

Lines changed: 1378 additions & 16 deletions

File tree

Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
name: CI Linux samples/gfx
2+
3+
on:
4+
push:
5+
branches:
6+
- master
7+
pull_request:
8+
branches:
9+
- master
10+
workflow_dispatch:
11+
12+
permissions:
13+
contents: read
14+
15+
env:
16+
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
17+
18+
jobs:
19+
samples-gfx:
20+
name: Build and run samples/gfx
21+
runs-on: ubuntu-latest
22+
timeout-minutes: 10
23+
24+
steps:
25+
- name: Install dependencies
26+
run: |
27+
sudo apt-get update -y
28+
sudo apt-get install -y build-essential
29+
30+
- name: Checkout
31+
uses: actions/checkout@v3
32+
33+
- name: Build and run vulkan_extension_count_test (ASan)
34+
shell: bash
35+
working-directory: samples/gfx/vulkan_extension_count
36+
run: |
37+
set -eu
38+
# Regression test for the heap-overflow fix in
39+
# gfx/common/vulkan_common.c::vulkan_find_device_extensions.
40+
# Pre-fix the function appended the required-extension list
41+
# twice -- once via memcpy at the top of the body, then
42+
# again in a per-element loop -- consuming
43+
# (count_initial + 2*num_required + num_optional) slots in
44+
# the caller's buffer. vulkan_context_create_device_wrapper
45+
# sized its malloc for (count_initial + num_required +
46+
# num_optional) entries, so a libretro core using the Vulkan
47+
# HW context-negotiation interface against a GPU exposing at
48+
# least one optional extension hit a one-element heap-buffer-
49+
# overflow (8 bytes on 64-bit) at the end of the malloc'd
50+
# block. Build under AddressSanitizer so any reintroduction
51+
# of the duplicate write is caught at the bounds level. If
52+
# vulkan_common.c amends the append-to-enabled[] block, the
53+
# verbatim copy in vulkan_extension_count_test.c must follow.
54+
make clean all SANITIZER=address
55+
test -x vulkan_extension_count_test
56+
timeout 60 ./vulkan_extension_count_test
57+
echo "[pass] vulkan_extension_count_test"
58+
59+
- name: Build and run vulkan_swapchain_clamp_test (ASan)
60+
shell: bash
61+
working-directory: samples/gfx/vulkan_swapchain_clamp
62+
run: |
63+
set -eu
64+
# Regression test for the unclamped-swapchain-image-count
65+
# fix in gfx/common/vulkan_common.c::vulkan_create_swapchain.
66+
# Pre-fix the two vkGetSwapchainImagesKHR calls (count
67+
# query + image fill) had no clamp between them, so a driver
68+
# returning more than VULKAN_MAX_SWAPCHAIN_IMAGES (8) images
69+
# on the second call wrote past context.swapchain_images[8]
70+
# and every loop bounded by num_swapchain_images walked past
71+
# its compile-time-sized companion array (~12 such loops
72+
# across init/deinit/textures/buffers/descriptor pools/
73+
# command buffers/readback and direct vk->swapchain[i]
74+
# uses). Build under AddressSanitizer so any reintroduction
75+
# of either the request-side or the post-create clamp is
76+
# caught at the bounds level. If vulkan_common.c amends
77+
# the cap or the post-create clamp, the verbatim copies in
78+
# vulkan_swapchain_clamp_test.c must follow.
79+
make clean all SANITIZER=address
80+
test -x vulkan_swapchain_clamp_test
81+
timeout 60 ./vulkan_swapchain_clamp_test
82+
echo "[pass] vulkan_swapchain_clamp_test"
83+
84+
- name: Build and run vulkan_texture_size_test (ASan)
85+
shell: bash
86+
working-directory: samples/gfx/vulkan_texture_size
87+
run: |
88+
set -eu
89+
# Regression test for the 32-bit-overflow fix in
90+
# gfx/drivers/vulkan.c::vulkan_create_texture's staging-
91+
# buffer sizing. Pre-fix the buffer_info.size calculation
92+
# (buffer_width * height) was unsigned*unsigned in 32-bit
93+
# before being widened to VkDeviceSize on assignment. With
94+
# dimensions large enough to wrap (e.g. 65536x16385x4 ->
95+
# 0x1_0004_0000 truncates to 0x40000), the staging buffer
96+
# was malloc'd at the wrapped (small) size while the per-row
97+
# upload memcpy loop walked the full width x height,
98+
# writing past the mapped region into adjacent heap memory.
99+
# Reachable from libretro cores supplying oversized
100+
# retro_framebuffer dimensions and from vulkan_load_texture
101+
# / vulkan_set_texture_frame. Build under AddressSanitizer
102+
# so any reintroduction of the 32-bit arithmetic is caught
103+
# at the bounds level. If vulkan.c amends the size
104+
# calculation or upload-loop strides, the verbatim copies
105+
# in vulkan_texture_size_test.c must follow.
106+
make clean all SANITIZER=address
107+
test -x vulkan_texture_size_test
108+
timeout 60 ./vulkan_texture_size_test
109+
echo "[pass] vulkan_texture_size_test"

gfx/common/vulkan_common.c

Lines changed: 27 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -432,21 +432,12 @@ static bool vulkan_find_device_extensions(VkPhysicalDevice gpu,
432432
goto end;
433433
}
434434

435+
/* Required extensions: presence already validated by the
436+
* vulkan_find_extensions() check above. Append in one shot. */
435437
memcpy((void*)(enabled + count), exts, num_exts * sizeof(*exts));
436-
count += num_exts;
437-
438438
for (i = 0; i < num_exts; i++)
439-
{
440-
if (vulkan_find_extensions(&exts[i], 1, properties, property_count))
441-
{
442-
RARCH_DBG("[Vulkan] Device extension supported: %s.\n", exts[i]);
443-
enabled[count++] = exts[i];
444-
}
445-
else
446-
{
447-
RARCH_DBG("[Vulkan] Device extension NOT supported: %s.\n", exts[i]);
448-
}
449-
}
439+
RARCH_DBG("[Vulkan] Device extension supported: %s.\n", exts[i]);
440+
count += num_exts;
450441

451442
for (i = 0; i < num_optional_exts; i++)
452443
{
@@ -2300,6 +2291,15 @@ bool vulkan_create_swapchain(gfx_ctx_vulkan_data_t *vk,
23002291
&& (desired_swapchain_images > surface_properties.maxImageCount))
23012292
desired_swapchain_images = surface_properties.maxImageCount;
23022293

2294+
/* Cap our request to what we can actually hold. Per-image arrays
2295+
* (swapchain_images, swapchain_fences, the various semaphore
2296+
* arrays, vk->swapchain[], readback.staging[]) are all sized to
2297+
* VULKAN_MAX_SWAPCHAIN_IMAGES at compile time. The post-create
2298+
* fill below also clamps defensively in case a driver returns
2299+
* more images than requested. */
2300+
if (desired_swapchain_images > VULKAN_MAX_SWAPCHAIN_IMAGES)
2301+
desired_swapchain_images = VULKAN_MAX_SWAPCHAIN_IMAGES;
2302+
23032303
if (surface_properties.supportedTransforms
23042304
& VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR)
23052305
pre_transform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
@@ -2415,6 +2415,20 @@ bool vulkan_create_swapchain(gfx_ctx_vulkan_data_t *vk,
24152415

24162416
vkGetSwapchainImagesKHR(vk->context.device, vk->swapchain,
24172417
&vk->context.num_swapchain_images, NULL);
2418+
2419+
/* Even after capping minImageCount above, drivers may legally
2420+
* return more images than requested. Clamp before the fill call
2421+
* so we don't write past swapchain_images[] and so every
2422+
* downstream loop bounded by num_swapchain_images stays inside
2423+
* its compile-time-sized array. */
2424+
if (vk->context.num_swapchain_images > VULKAN_MAX_SWAPCHAIN_IMAGES)
2425+
{
2426+
RARCH_WARN("[Vulkan] Swapchain returned %u images, clamping to %u.\n",
2427+
vk->context.num_swapchain_images,
2428+
(unsigned)VULKAN_MAX_SWAPCHAIN_IMAGES);
2429+
vk->context.num_swapchain_images = VULKAN_MAX_SWAPCHAIN_IMAGES;
2430+
}
2431+
24182432
vkGetSwapchainImagesKHR(vk->context.device, vk->swapchain,
24192433
&vk->context.num_swapchain_images, vk->context.swapchain_images);
24202434

gfx/drivers/vulkan.c

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1017,6 +1017,7 @@ static struct vk_texture vulkan_create_texture(vk_t *vk,
10171017
enum vk_texture_type type)
10181018
{
10191019
unsigned i;
1020+
uint64_t buffer_size_64;
10201021
uint32_t buffer_width;
10211022
struct vk_texture tex;
10221023
VkImageCreateInfo info;
@@ -1051,11 +1052,18 @@ static struct vk_texture vulkan_create_texture(vk_t *vk,
10511052
/* Align stride to 4 bytes to make sure we can use compute shader uploads without too many problems. */
10521053
buffer_width = width * vulkan_format_to_bpp(format);
10531054
buffer_width = (buffer_width + 3u) & ~3u;
1055+
/* Compute the buffer size in 64-bit. width*bpp*height as a 32-bit
1056+
* unsigned would wrap for sufficiently large dimensions (e.g. an
1057+
* upscaled shader render target chain), leaving the staging buffer
1058+
* underallocated relative to the upload memcpy loop further down
1059+
* and producing a heap overflow on the host side. VkDeviceSize is
1060+
* 64-bit; widen the math to match. */
1061+
buffer_size_64 = (uint64_t)buffer_width * (uint64_t)height;
10541062

10551063
buffer_info.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
10561064
buffer_info.pNext = NULL;
10571065
buffer_info.flags = 0;
1058-
buffer_info.size = buffer_width * height;
1066+
buffer_info.size = buffer_size_64;
10591067
buffer_info.usage = 0;
10601068
buffer_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
10611069
buffer_info.queueFamilyIndexCount = 0;
@@ -1365,14 +1373,18 @@ static struct vk_texture vulkan_create_texture(vk_t *vk,
13651373
const uint8_t *src = NULL;
13661374
void *ptr = NULL;
13671375
unsigned bpp = vulkan_format_to_bpp(tex.format);
1368-
unsigned stride = tex.width * bpp;
1376+
/* Source stride and per-row copy size in size_t to keep
1377+
* the pointer math and memcpy length safe even when
1378+
* width*bpp would otherwise wrap a 32-bit unsigned. */
1379+
size_t stride = (size_t)tex.width * (size_t)bpp;
1380+
size_t row_bytes = (size_t)width * (size_t)bpp;
13691381

13701382
vkMapMemory(device, tex.memory, tex.offset, tex.size, 0, &ptr);
13711383

13721384
dst = (uint8_t*)ptr;
13731385
src = (const uint8_t*)initial;
13741386
for (y = 0; y < tex.height; y++, dst += tex.stride, src += stride)
1375-
memcpy(dst, src, width * bpp);
1387+
memcpy(dst, src, row_bytes);
13761388

13771389
if ( (tex.flags & VK_TEX_FLAG_NEED_MANUAL_CACHE_MANAGEMENT)
13781390
&& (tex.memory != VK_NULL_HANDLE))
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
TARGET := vulkan_extension_count_test
2+
3+
SOURCES := vulkan_extension_count_test.c
4+
5+
OBJS := $(SOURCES:.c=.o)
6+
7+
CFLAGS += -Wall -pedantic -std=gnu99 -g -O0
8+
9+
ifneq ($(SANITIZER),)
10+
CFLAGS := -fsanitize=$(SANITIZER) -fno-omit-frame-pointer $(CFLAGS)
11+
LDFLAGS := -fsanitize=$(SANITIZER) $(LDFLAGS)
12+
endif
13+
14+
all: $(TARGET)
15+
16+
%.o: %.c
17+
$(CC) -c -o $@ $< $(CFLAGS)
18+
19+
$(TARGET): $(OBJS)
20+
$(CC) -o $@ $^ $(LDFLAGS)
21+
22+
clean:
23+
rm -f $(TARGET) $(OBJS)
24+
25+
.PHONY: clean

0 commit comments

Comments
 (0)