Skip to content

Soundness: Latent Use-After-Free due to cinfo back-pointers and by-value move in start_compress #52

@ain5721

Description

@ain5721

Hi team! Thanks for the amazing work on mozjpeg-rust.

While reviewing the FFI boundaries, I noticed a subtle architectural pattern that might lead to a latent Use-After-Free (UAF) or memory corruption in specific configurations.

I would love to share my observations and discuss a potential lightweight fix.

The Observation

The core of the issue lies in how the jpeg_compress_struct (cinfo) is handled during the transition into the compression phase.

1. The Rust Side (The Move):
In Compress::start_compress(self, ...), the Compress struct is taken by value:

pub fn start_compress(self, writer: W) -> io::Result<CompressStarted<W>>

When this function returns CompressStarted, self (and thus the cinfo struct) is physically moved to a new stack location. I noticed the clever use of _pinned: PhantomPinned to ward off nasal demons for self-referential structs. However, because start_compress consumes self by value, the PhantomPinned marker unfortunately cannot prevent the physical memory address from changing during the return.

2. The C Side (The Back-link Storage):
During initialization (which happens inside start_compress), underlying C modules often allocate long-lived state on the heap and store a "back-link" to the cinfo structure. For example, in jcphuff.c:

// jcphuff.c: start_pass_phuff
entropy->cinfo = cinfo; // Stores the pre-move stack pointer

3. The UAF Trigger:
Later, during routines like write_scanlines, subroutines may rely on this stored back-link rather than the newly passed pointer. For instance:

// jcphuff.c: dump_buffer
struct jpeg_destination_mgr *dest = entropy->cinfo->dest; // Dereferences the stored back-link

The Concern

Because start_compress moves the cinfo struct, the back-link stored in the heap-allocated C modules remains pointing to the old, invalidated stack location.

And I think modern compilers often perform Return Value Optimization (RVO), which might mask this issue by eliding the physical copy in simple scenarios. Additionally, in standard Sequential mode, some local states are used which bypass the back-link. However, relying on RVO for memory safety is generally considered unsound in Rust. In cases where moves do occur (e.g., complex call stacks, specific async wrappers, or without optimizations), the C library will dereference a dangling pointer to a popped stack frame.

Proposed Solution

To stabilize the memory address while fulfilling the requirements for internal C back-pointers, we can internally "box" the Compress struct within CompressStarted:

pub struct CompressStarted<W> {
    compress: Box<Compress>, // Stable heap address for the lifetime of the compression
    dest_mgr: DestinationMgr<W>,
}

By moving self to the heap via Box::new(self) before passing it to jpeg_start_compress, we ensure that the physical address seen by libjpeg remains stable, even when the CompressStarted wrapper itself is moved around in Rust.

I have tested this locally: it passes the existing test suite, completely resolves the memory escape, and the performance overhead of a single Box allocation is virtually unmeasurable in the context of JPEG compression.

I will open a Pull Request shortly with this proposed fix so you can evaluate the code directly. I'd love to hear your thoughts on this! Thank you again for your hard work on this crate.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions