Skip to content

Commit 25285e4

Browse files
Add MemCx<'current, T> arguments and PBox<'current, T> returns (pgcentralfoundation#2210)
tl;dr: `caller_picks_lifetime + macro_expansion == unsoundness`, so prevent that by adding API for new, lifetime-bound allocations. For functions exposed to PostgreSQL via `pg_extern`, signatures often resemble these: ```rust #[pg_extern] fn do_something( floats: Array<'a, f32>, hms: Time, day: TimestampTz ) -> Vec<TimestampTz> { ``` ```sql -- pgrx will generate something like this CREATE FUNCTION do_something( floats real[], hms time, day timestamptz ) RETURNS timestamptz[] ``` In SQL, we accept an array and return an array of a different type, but in Rust, we return a `Vec`. This is because we lack API to allocate and return a fresh `Array<'a, T>`s. Behind the scenes, we *do* allocate an `ArrayType` at the translation boundary, unseen to a Rust programmer. That, plus the Vec, means two allocations just to mutate an array, even for a fixed-size type! Unfortunately, pgrx-offered type translations like `Array<'a, T>`, cannot have a lifetime *and* expose a new *safe* function to make it from lifetime-free values! Such API must have the `unsafe` requirement of access-exclusivity for its lifetime, described in functions like `slice::from_raw_parts_mut`[^1], because the lifetime becomes caller-selected. This makes it easy to use a `'static` lifetime with `pg_extern` and expand `unsafe` code that trusts that lifetime. A `'static` lifetime is always wrong for `palloc`ated types, even with `TopMemoryContext`, since it dies whenever Postgres resets the context and Postgres does have conditions under which it performs such a global reset. The result is various extremely subtle soundness problems are possible, yet the root cause lies in code that most never read. We can solve this by denying the possibility of a caller-selected lifetime. This will require removing some types, ultimately, but first we need an API to migrate to. This new API is `MemCx<'mcx, T>` and `PBox<'mcx, T>`. Importantly, there is no *safe* route to creating `MemCx<'mcx, T>` with a caller-selected lifetime. One can only reach `MemCx<'mcx, T>` safely as an argument: either pass a closure to `memcx::current_context` or call a new `pg_extern` function via SQL. Either way gives you `&MemCx<'current>`. These let you create a lifetime-bound allocation that can live long enough to return it. We can already return fixed-size pass-by-reference types, so right now `PBox` is only an optimization. It allows not putting a type on the stack and then immediately moving it directly into a heap allocation. More importantly, it enables writing tests, so it is important ground work for returning de facto unsized values, as they can hide behind `PBox` which is always `Sized`. [^1]: https://doc.rust-lang.org/std/slice/fn.from_raw_parts_mut.html
1 parent 5e87cce commit 25285e4

8 files changed

Lines changed: 197 additions & 1 deletion

File tree

pgrx-tests/src/tests/memcxt_tests.rs

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,8 @@ mod tests {
1414
use crate as pgrx_tests;
1515

1616
use pgrx::PgMemoryContexts;
17+
use pgrx::memcx::MemCx;
18+
use pgrx::palloc::PBox;
1719
use pgrx::pg_test;
1820
use pgrx::prelude::*;
1921
use std::sync::Arc;
@@ -29,6 +31,15 @@ mod tests {
2931
}
3032
}
3133

34+
#[pg_extern]
35+
pub fn accept_mcx_return_timetz<'mcx>(
36+
memcx: &'mcx MemCx<'mcx>,
37+
) -> PBox<'mcx, TimeWithTimeZone> {
38+
let palloc = memcx.alloc_bytes(size_of::<TimeWithTimeZone>());
39+
let timetz = TimeWithTimeZone::new(4, 20, 0.0).unwrap();
40+
PBox::new_in(timetz, memcx)
41+
}
42+
3243
#[pg_test]
3344
fn test_leak_and_drop() {
3445
let did_drop = Arc::new(AtomicBool::new(false));
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
use pgrx::memcx::MemCx;
2+
use pgrx::palloc::PBox;
3+
use pgrx::prelude::*;
4+
5+
#[pg_extern]
6+
pub fn accept_mcx_return_timetz(memcx: &MemCx<'static>) -> PBox<'static, TimeWithTimeZone> {
7+
let timetz = TimeWithTimeZone::new(4, 20, 0.0).unwrap();
8+
PBox::new_in(timetz, memcx)
9+
}
10+
11+
fn main() {}
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
error[E0521]: borrowed data escapes outside of function
2+
--> tests/compile-fail/no-static-memcx.rs:6:92
3+
|
4+
5 | #[pg_extern]
5+
| ------------
6+
| |
7+
| lifetime `'fcx` defined here
8+
| in this procedural macro expansion
9+
6 | pub fn accept_mcx_return_timetz(memcx: &MemCx<'static>) -> PBox<'static, TimeWithTimeZone> {
10+
| ____________________________________________________________________________________________^
11+
7 | | let timetz = TimeWithTimeZone::new(4, 20, 0.0).unwrap();
12+
8 | | PBox::new_in(timetz, memcx)
13+
9 | | }
14+
| | ^
15+
| | |
16+
| | `fcinfo` is a reference that is only valid in the function body
17+
| |_`fcinfo` escapes the function body here
18+
| argument requires that `'fcx` must outlive `'static`
19+
|
20+
= note: this error originates in the attribute macro `pg_extern` (in Nightly builds, run with -Z macro-backtrace for more info)

pgrx/src/callconv.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -91,7 +91,7 @@ pub unsafe trait ArgAbi<'fcx>: Sized {
9191
}
9292
}
9393

94-
/// "Is this a virtual arg?"
94+
/// Virtual args do not affect Datum arg iteration
9595
fn is_virtual_arg() -> bool {
9696
false
9797
}

pgrx/src/lib.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,7 @@ pub mod misc;
6363
pub mod namespace;
6464
pub mod nodes;
6565
pub mod nullable;
66+
pub mod palloc;
6667
pub mod pg_catalog;
6768
pub mod pgbox;
6869
pub mod rel;

pgrx/src/memcx.rs

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,22 @@
11
//! Memory Contexts in PostgreSQL, now with lifetimes.
2+
use pgrx_sql_entity_graph::metadata::{
3+
ArgumentError, Returns, ReturnsError, SqlMapping, SqlTranslatable,
4+
};
5+
26
// "Why isn't this pgrx::mem or pgrx::memcxt?"
37
// Postgres actually uses all of:
48
// - mcxt
59
// - memcxt
610
// - mctx
711
// Search engines will see "memc[tx]{2}" and assume you mean memcpy!
812
// And it's nice-ish to have shorter lifetime names and have 'mcx consistently mean the lifetime.
13+
use crate::callconv::{Arg, ArgAbi};
14+
use crate::nullable::Nullable;
915
use crate::pg_sys;
1016
use core::{marker::PhantomData, ptr::NonNull};
1117

1218
/// A borrowed memory context.
19+
#[repr(transparent)]
1320
pub struct MemCx<'mcx> {
1421
ptr: NonNull<pg_sys::MemoryContextData>,
1522
_marker: PhantomData<&'mcx pg_sys::MemoryContextData>,
@@ -118,3 +125,38 @@ mod nightly {
118125
}
119126
}
120127
}
128+
129+
unsafe impl<'fcx> ArgAbi<'fcx> for &MemCx<'fcx> {
130+
unsafe fn unbox_arg_unchecked(_arg: Arg<'_, 'fcx>) -> Self {
131+
// SAFETY: We are called to unbox an argument, which means the backend was initialized.
132+
// We use this horrific expression to allow the lifetime to be extended arbitrarily
133+
// and achieve an "in-place" transformation of CurrentMemoryContext's pointer.
134+
// The soundness of this is riding on the lifetimes used for `unbox_arg_unchecked` in our macros,
135+
// as the expanded code is designed so that `fcinfo` and each `arg` are truly "borrowed" in rustc's eyes.
136+
unsafe { &*((&raw mut pg_sys::CurrentMemoryContext).cast()) }
137+
}
138+
139+
unsafe fn unbox_nullable_arg(arg: Arg<'_, 'fcx>) -> Nullable<Self> {
140+
// SAFETY: Should never happen in actuality, but as long as we're here...
141+
if unsafe { pg_sys::CurrentMemoryContext.is_null() } {
142+
Nullable::Null
143+
} else {
144+
Nullable::Valid(Self::unbox_arg_unchecked(arg))
145+
}
146+
}
147+
148+
fn is_virtual_arg() -> bool {
149+
true
150+
}
151+
}
152+
153+
/// SAFETY: virtual argument
154+
unsafe impl<'mcx> SqlTranslatable for &MemCx<'mcx> {
155+
fn argument_sql() -> Result<SqlMapping, ArgumentError> {
156+
Ok(SqlMapping::Skip)
157+
}
158+
159+
fn return_sql() -> Result<Returns, ReturnsError> {
160+
Ok(Returns::One(SqlMapping::Skip))
161+
}
162+
}

pgrx/src/palloc.rs

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
mod pbox;
2+
3+
pub use pbox::PBox;

pgrx/src/palloc/pbox.rs

Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,108 @@
1+
use crate::callconv::{BoxRet, FcInfo};
2+
use crate::datum::{BorrowDatum, Datum};
3+
use crate::layout::PassBy;
4+
use crate::memcx::MemCx;
5+
use crate::pg_sys;
6+
use core::marker::PhantomData;
7+
use core::ops::{Deref, DerefMut};
8+
use core::ptr::NonNull;
9+
10+
use pgrx_sql_entity_graph::metadata::{
11+
ArgumentError, Returns, ReturnsError, SqlMapping, SqlTranslatable,
12+
};
13+
14+
/** As [`Box<T, A>`][stdbox] where `A` is a [`MemCx`]
15+
16+
17+
[stdbox]: alloc::boxed::Box
18+
*/
19+
#[repr(transparent)]
20+
pub struct PBox<'mcx, T: ?Sized> {
21+
ptr: NonNull<T>,
22+
_cx: PhantomData<MemCx<'mcx>>,
23+
}
24+
25+
impl<'mcx, T: ?Sized> PBox<'mcx, T> {
26+
// # Safety
27+
// The same constraints as [`Box::from_raw`], AND
28+
// - you assert the pointer was allocated in the `MemCx`
29+
// - you assert the pointer may be freed by `pfree`
30+
pub unsafe fn from_raw_in(ptr: NonNull<T>, _cx: &MemCx<'mcx>) -> PBox<'mcx, T> {
31+
PBox { ptr, _cx: PhantomData }
32+
}
33+
}
34+
35+
impl<'mcx, T: Sized> PBox<'mcx, T> {
36+
pub fn new_in(val: T, memcx: &MemCx<'mcx>) -> Self {
37+
const { assert!(align_of::<T>() <= size_of::<pg_sys::Datum>()) };
38+
let ptr = memcx.alloc_bytes(size_of::<T>()).cast();
39+
// SAFETY: We were guaranteed an appropriately sized allocation to write to,
40+
// and we have asserted our alignment maximum was upheld
41+
unsafe { ptr.write(val) };
42+
PBox { ptr, _cx: PhantomData }
43+
}
44+
}
45+
46+
unsafe impl<'mcx, T> BoxRet for PBox<'mcx, T>
47+
where
48+
T: ?Sized + BorrowDatum,
49+
{
50+
unsafe fn box_into<'fcx>(self, fcinfo: &mut FcInfo<'fcx>) -> Datum<'fcx> {
51+
let datum = match T::PASS {
52+
PassBy::Value => {
53+
// start with a zeroed Datum, just to minimize funny business
54+
let mut datum = pg_sys::Datum::null();
55+
// SAFETY: Due to BorrowDatum, this type has a definite size less than a Datum,
56+
// and PBox must have an initialized pointee, so a copy is sound-by-construction.
57+
unsafe {
58+
let size = size_of_val(&*self.ptr.as_ptr());
59+
debug_assert!(size <= size_of::<pg_sys::Datum>());
60+
// using `BorrowDatum::point_from` handles endianness
61+
let datum_ptr = T::point_from(NonNull::from_mut(&mut datum).cast::<u8>());
62+
datum_ptr.cast::<u8>().copy_from_nonoverlapping(self.ptr.cast(), size);
63+
}
64+
datum
65+
}
66+
PassBy::Ref => pg_sys::Datum::from(self.ptr.cast::<u8>().as_ptr()),
67+
};
68+
// SAFETY: by proxy, BorrowDatum is an `unsafe trait` so the above impl must be correct
69+
unsafe { fcinfo.return_raw_datum(datum) }
70+
}
71+
}
72+
73+
/// SAFETY: SQL has no "pointers" so by-val and by-ref calling conventions are identical,
74+
/// and all `PBox` truly does is enable pass-by-ref returns of unsized values.
75+
unsafe impl<'mcx, T> SqlTranslatable for PBox<'mcx, T>
76+
where
77+
T: SqlTranslatable + ?Sized,
78+
{
79+
fn argument_sql() -> Result<SqlMapping, ArgumentError> {
80+
T::argument_sql()
81+
}
82+
83+
fn return_sql() -> Result<Returns, ReturnsError> {
84+
T::return_sql()
85+
}
86+
}
87+
88+
impl<'mcx, T> Deref for PBox<'mcx, T>
89+
where
90+
T: BorrowDatum + ?Sized,
91+
{
92+
type Target = T;
93+
94+
fn deref(&self) -> &Self::Target {
95+
// SAFETY: by construction
96+
unsafe { self.ptr.as_ref() }
97+
}
98+
}
99+
100+
impl<'mcx, T> DerefMut for PBox<'mcx, T>
101+
where
102+
T: BorrowDatum + ?Sized,
103+
{
104+
fn deref_mut(&mut self) -> &mut Self::Target {
105+
// SAFETY: by construction
106+
unsafe { self.ptr.as_mut() }
107+
}
108+
}

0 commit comments

Comments
 (0)