Commit 42f3a3c

Metal: add SDR offscreen for menu composite in HDR mode
Fixes menu / overlay / OSD appearing incorrectly on the HDR backbuffer
(elevated blacks on scRGB, dim on HDR10), caused by stock SDR pipelines
writing sRGB-encoded bytes directly into a PQ or linear-FP16 drawable.

Architecture mirrors Vulkan's equivalent fix (12d188e, Jan 2026): menu /
overlay / OSD / widgets render into a BGRA8 SDR offscreen; at end of
frame, two sequential composite passes fold the core video and the SDR
UI into the HDR drawable.

Render pipeline (HDR mode)

    shader chain final pass / frame texture (core source)
            |
            v
    hdrComposite pass 1 (core, blending off, clear)
            |
            + hdrComposite pass 2 (menu, alpha blend, load):
            |   sample _sdrOverlayTex
            v
    CAMetalLayer drawable (HDR10 / scRGB)

    menu / overlay / OSD draws -> _sdrOverlayTex (BGRA8, drawable-sized,
                                  cleared to transparent black at rce
                                  acquire)

Key changes

* Stock / clear / menu / font pipelines now compile against BGRA8Unorm
  unconditionally. Matches drawable format in SDR mode (identical
  behaviour) and matches _sdrOverlayTex in HDR mode.

* Context rce getter is HDR-aware: in HDR mode it lazily opens a render
  encoder on _sdrOverlayTex cleared to transparent black and sets an
  _sdrOverlayDirty flag; SDR mode unchanged.

* hdrComposite does two sequential passes on the drawable:

  - Core pass: blending off, load=Clear. Composite fragment emits
    alpha=0 outside CoreViewport so the clear colour is preserved in
    letterbox / pillarbox areas. When the caller has no core source
    (fresh driver init, pre-first-frame), degenerates to a clear-only
    pass so the drawable is never presented with uninitialised memory.

  - Menu pass: blending on (SRC_ALPHA / ONE_MINUS_SRC_ALPHA), load=Load.
    Runs only when _sdrOverlayDirty is set (skipped if no UI was drawn
    this frame). Samples _sdrOverlayTex with SDR-source semantics and
    menu-specific uniforms (BrightnessNits <- PaperWhiteNits,
    InverseTonemap forced on in both HDR modes so the sRGB-decode path
    runs and the scRGB passthrough shortcut is bypassed, Scanlines
    cleared). Metal's blend unit alpha-composites the encoded result
    over the core.

* Composite fragment reworked:

  - Takes a CoreViewport (float4: xy=origin, zw=size in pixels) uniform
    and covers the full drawable. Inside the rect the mode branches run
    against remapped UVs; outside the rect it emits float4(0,0,0,0).

  - Re-shaped but logically equivalent mode-1/2/3 branches preserve
    shader-emitted-HDR passthrough for core content.

* New HDRUniforms field PaperWhiteNits drives UI paper-white
  independently of core paper-white (BrightnessNits). setHDRMenuNits
  now populates it (previously a no-op; the dropped _pad0 slot is
  replaced).

* Two new pipeline variants per HDR output mode (menu-composite
  pipelines with blending enabled), created through a shared
  makeComposite block. Readiness check at HDR enable updated.

* Context ivars: _sdrOverlayTex (BGRA8, Private, drawable-sized),
  _sdrOverlayW, _sdrOverlayH, _sdrOverlayDirty. Allocated in
  resizeHDRResourcesForWidth:height: alongside _hdrReadbackTex, freed
  when HDR is disabled.

* MetalDriver.setViewportWidth:height: propagates to
  Context.resizeHDRResourcesForWidth:height: so the SDR overlay tracks
  window resizes. _resizeHDRResourcesForWidth:height: renamed to the
  public resizeHDRResourcesForWidth:height: and declared in
  metal_common.h.

* renderFrame refactored: single lazy rce acquisition at top (routes
  correctly by mode); menu / overlay / OSD / widgets / message all draw
  via this rce (into SDR overlay in HDR, into drawable in SDR);
  hdrComposite fires at end-of-frame just before _endFrame. The
  mid-frame composite + follow-on drawable rce dance is gone.

Result

scRGB menu no longer shows elevated blacks / medium-gray backgrounds:
the overlay now decodes as sRGB, scales against menu paper-white, and
blends standardly. HDR10 menu remains visually correct. HDR-to-HDR mode
switches (HDR10 <-> scRGB) no longer flash the drawable green or blue on
the first post-reinit frame. No SDR-mode behaviour change.

Files

gfx/common/metal/metal_shader_types.h : +PaperWhiteNits, +CoreViewport.
gfx/common/metal/Shaders.metal : hdr_composite_fragment rewritten for
  full-drawable coverage with CoreViewport clip; mode branches preserved.
gfx/common/metal_common.h : resizeHDRResourcesForWidth is public.
gfx/drivers/metal.m : SDR overlay ivars + alloc / free; HDR-aware rce
  getter; two-pass hdrComposite with clear-only fallback; BGRA8
  pipelines; resize propagation; setHDRMenuNits wires PaperWhiteNits.
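
The menu pass described in the commit message comes down to two steps per colour channel: scale the overlay to the UI paper-white in scRGB units, then let the blend unit apply source-over. The following is a minimal CPU-side sketch in C for the scRGB case only. It assumes the overlay sample has already been sRGB-decoded to linear and leaves out the gamut rotation; `menu_over_core` and `SCRGB_WHITE_NITS` are hypothetical names for illustration, not identifiers from the driver.

```c
#include <assert.h>

/* scRGB convention stated in the shader comments: 1.0 == 80 nits. */
#define SCRGB_WHITE_NITS 80.0f

/* Hypothetical helper, one colour channel.
 *   core             value already written by the core pass
 *   menu_linear      overlay sample after sRGB decode (assumed done upstream)
 *   menu_alpha       overlay alpha in [0, 1]
 *   paper_white_nits UI paper-white (PaperWhiteNits in the commit) */
float menu_over_core(float core, float menu_linear,
                     float menu_alpha, float paper_white_nits)
{
   assert(menu_alpha >= 0.0f && menu_alpha <= 1.0f);

   /* Scale SDR white to the requested paper-white in scRGB units. */
   float menu_scrgb = menu_linear * (paper_white_nits / SCRGB_WHITE_NITS);

   /* SRC_ALPHA / ONE_MINUS_SRC_ALPHA, as configured for the menu pass. */
   return menu_scrgb * menu_alpha + core * (1.0f - menu_alpha);
}
```

With alpha 0 the core value passes through untouched, which is why an overlay cleared to transparent black leaves the core video intact wherever no UI was drawn.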
1 parent 6040eac commit 42f3a3c

4 files changed

Lines changed: 426 additions & 196 deletions

gfx/common/metal/Shaders.metal

Lines changed: 82 additions & 49 deletions
@@ -802,99 +802,132 @@ inline float4 hdr_sample_sdr_linear(texture2d<float> src,

 /* Forward HDR composite.
  *
- * HDRMode == 3 : Source is PQ HDR10 (shader emitted), convert to scRGB.
- * HDRMode == 2 : scRGB output. Source is either SDR or FP16 HDR.
- * HDRMode == 1 : HDR10 output. Source is either SDR or already-PQ.
- * HDRMode == 0 : Passthrough (bypass path, rarely used since the composite
- *                is skipped when HDR is disabled). */
+ * Covers the FULL drawable with a quad. Inside the core-video viewport
+ * rect (CoreViewport uniform), the source is HDR-encoded and shown;
+ * outside the viewport rect, the fragment emits fully-transparent black
+ * so the drawable's clear colour (or previous contents, if blending is
+ * enabled upstream) is preserved.
+ *
+ * This shader is used in two modes by the driver:
+ *
+ *   1. Core composite pass. Source is the core video / shader chain
+ *      output. Blending disabled on the pipeline: the drawable just
+ *      got cleared, and we want the fragment to *replace* drawable
+ *      contents inside the core rect. Outside the core rect the
+ *      fragment emits alpha=0 and since blending is off we write
+ *      (0,0,0,0) which matches the clear colour — safe.
+ *
+ *      Actually, cleaner: blending is OFF, so we'd clobber the clear
+ *      colour outside the rect. To avoid that, the driver uses the
+ *      viewport-scissor hint via the Scissor Metal API instead — or
+ *      we set alpha=0 and rely on the clear being (0,0,0,0) too. The
+ *      driver sets a scissor rect equal to CoreViewport for the core
+ *      pass so this fragment only runs inside the rect.
+ *
+ *   2. Menu composite pass. Source is the BGRA8 SDR overlay offscreen.
+ *      Blending ENABLED on the pipeline (SRC_ALPHA / ONE_MINUS_SRC_ALPHA).
+ *      The SDR shader path (HDRMode == 1 with InverseTonemap+HDR10, or
+ *      HDRMode == 2 without them, driven by the menu-specific uniforms
+ *      set by the driver) encodes the SDR overlay to the drawable's
+ *      HDR colour space with the overlay's alpha preserved, and Metal's
+ *      blending unit alpha-blends it over the already-written core.
+ *
+ * HDRMode values match the Vulkan reference:
+ *   3 — source is shader-emitted PQ HDR10, swapchain is scRGB (convert)
+ *   2 — swapchain is scRGB; source is SDR (inverse_tonemap+hdr10 off) or HDR16
+ *   1 — swapchain is HDR10 PQ; source is SDR (inverse_tonemap+hdr10 on) or PQ
+ */
 fragment float4 hdr_composite_fragment(
       HDRVertexOut in [[ stage_in ]],
       constant HDRUniforms &u [[ buffer(0) ]],
       texture2d<float> src [[ texture(0) ]],
       sampler samp [[ sampler(0) ]])
 {
+   /* Remap fragment pos in drawable pixel-space to source UV in [0..1]
+    * across CoreViewport. Fragments outside the rect will sample
+    * out-of-range — we also emit alpha=0 for those so the blend-enabled
+    * menu pass doesn't contaminate outside the rect. */
+   float2 frag_px = in.position.xy;
+   float2 vp_origin = u.CoreViewport.xy;
+   float2 vp_size = u.CoreViewport.zw;
+   float2 core_uv = (frag_px - vp_origin) / vp_size;
+   bool in_rect = all(core_uv >= float2(0.0f))
+               && all(core_uv <= float2(1.0f));
+
+   if (!in_rect)
+      return float4(0.0f, 0.0f, 0.0f, 0.0f);
+
    if (u.HDRMode == 3u)
    {
       /* Shader chain emitted PQ HDR10, swapchain is scRGB -> convert. */
-      float4 pq = src.sample(samp, in.texCoord);
+      float4 pq = src.sample(samp, core_uv);
       return float4(hdr::HDR10ToscRGB(pq.rgb), pq.a);
    }

    if (u.HDRMode == 2u)
    {
-      /* scRGB swapchain. Either HDR16 (shader emits linear float) or SDR.
-       * For HDR16 we expect the shader output to already be in linear BT.709
-       * 1.0 = 80 nits, so just pass through. For SDR, apply gamut rotation
-       * + scale by BrightnessNits / 80. */
+      /* scRGB swapchain. Shader-emitted HDR16 is already linear
+       * BT.709 1.0=80 nits, pass through. For SDR, gamut-rotate and
+       * scale to paper-white nits in scRGB units. */
       if (u.InverseTonemap <= 0.0f && u.HDR10 <= 0.0f)
-      {
-         /* Shader already emits scRGB-compatible linear HDR (HDR16 path). */
-         float4 linear = src.sample(samp, in.texCoord);
-         return linear;
-      }
+         return src.sample(samp, core_uv);

-      /* High-res SDR with scanlines requested: generate CRT mask in HDR.
-       * Scanlines() returns linear Rec.709 already masked in Rec.709 space;
-       * scRGB units are 1.0 = 80 nits so scale by BrightnessNits/80. */
       if (u.Scanlines > 0.0f && u.OutputSize.y > (240.0f * 4.0f))
       {
-         float3 linear = hdr_crt::Scanlines(src, samp, in.texCoord, u);
-         return float4(linear * (u.BrightnessNits / hdr::kscRGBWhiteNits), 1.0f);
+         float3 linear = hdr_crt::Scanlines(src, samp, core_uv, u);
+         return float4(linear * (u.BrightnessNits / hdr::kscRGBWhiteNits),
+                       1.0f);
       }

-      float4 linear = float4(hdr::To2020(
-               hdr_sample_sdr_linear(src, samp, in.texCoord).rgb,
-               u.ExpandGamut),
-            1.0f);
-      linear.rgb = hdr::k2020to709 * linear.rgb;
-      linear.rgb *= u.BrightnessNits / hdr::kscRGBWhiteNits;
-      return linear;
+      float4 sdr_in = hdr_sample_sdr_linear(src, samp, core_uv);
+      float3 rec2020 = hdr::To2020(sdr_in.rgb, u.ExpandGamut);
+      float3 rec709 = hdr::k2020to709 * rec2020;
+      float3 scrgb = rec709 * (u.BrightnessNits / hdr::kscRGBWhiteNits);
+      return float4(scrgb, sdr_in.a);
    }

    /* HDRMode == 1: HDR10 output. */

-   /* Shader already emitted PQ or FP16 HDR content -> pass through, the
-    * shader's output is authoritative. We rely on set_hdr10()/set_hdr16()
-    * clearing InverseTonemap + HDR10 in the driver wiring. */
+   /* Shader-emitted PQ or FP16: pass through. */
    if (u.InverseTonemap <= 0.0f && u.HDR10 <= 0.0f)
-      return src.sample(samp, in.texCoord);
+      return src.sample(samp, core_uv);

-   /* SDR input, both inverse-tonemap and HDR10 encode requested. */
    if (u.InverseTonemap > 0.0f && u.HDR10 > 0.0f)
    {
       if (u.Scanlines > 0.0f && u.OutputSize.y > (240.0f * 4.0f))
       {
          /* Scanlines() returns linear Rec.2020 with inverse-tonemap + mask
           * baked in, so we only need the PQ encode. */
-         float3 hdr_2020 = hdr_crt::Scanlines(src, samp, in.texCoord, u);
+         float3 hdr_2020 = hdr_crt::Scanlines(src, samp, core_uv, u);
          float3 pq = hdr::HDR10Encode(hdr_2020, u.BrightnessNits);
          return float4(pq, 1.0f);
       }

-      float4 linear = hdr_sample_sdr_linear(src, samp, in.texCoord);
-      float3 rec2020 = hdr::To2020(linear.rgb, u.ExpandGamut);
-      float3 hdr2020 = hdr::InverseTonemap(rec2020,
+      float4 sdr_in = hdr_sample_sdr_linear(src, samp, core_uv);
+      float3 rec2020 = hdr::To2020(sdr_in.rgb, u.ExpandGamut);
+      float3 hdr_2020 = hdr::InverseTonemap(rec2020,
             u.BrightnessNits,
             u.BrightnessNits);
-      float3 pq = hdr::HDR10Encode(hdr2020, u.BrightnessNits);
-      return float4(pq, linear.a);
+      float3 pq = hdr::HDR10Encode(hdr_2020, u.BrightnessNits);
+      return float4(pq, sdr_in.a);
    }

+   /* InverseTonemap alone (no PQ encode) — linear HDR output. */
    if (u.InverseTonemap > 0.0f)
    {
-      float4 linear = hdr_sample_sdr_linear(src, samp, in.texCoord);
-      float3 rec2020 = hdr::To2020(linear.rgb, u.ExpandGamut);
-      float3 hdr2020 = hdr::InverseTonemap(rec2020,
-            u.BrightnessNits,
-            u.BrightnessNits);
-      return float4(hdr2020, linear.a);
+      float4 sdr_in = hdr_sample_sdr_linear(src, samp, core_uv);
+      float3 rec2020 = hdr::To2020(sdr_in.rgb, u.ExpandGamut);
+      float3 hdr_2020 = hdr::InverseTonemap(rec2020,
+            u.BrightnessNits,
+            u.BrightnessNits);
+      return float4(hdr_2020, sdr_in.a);
    }

-   /* HDR10 only */
-   float4 linear = hdr_sample_sdr_linear(src, samp, in.texCoord);
-   float3 rec2020 = hdr::To2020(linear.rgb, u.ExpandGamut);
-   float3 pq = hdr::HDR10Encode(rec2020, u.BrightnessNits);
-   return float4(pq, linear.a);
+   /* HDR10 (PQ encode only, no inverse tonemap). */
+   float4 sdr_in = hdr_sample_sdr_linear(src, samp, core_uv);
+   float3 rec2020 = hdr::To2020(sdr_in.rgb, u.ExpandGamut);
+   float3 pq = hdr::HDR10Encode(rec2020, u.BrightnessNits);
+   return float4(pq, sdr_in.a);
 }

 /* HDR -> SDR screenshot/recording path.
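
The CoreViewport remap at the top of hdr_composite_fragment can be checked on the CPU. Below is a minimal sketch in C of the same arithmetic, using a pillarboxed 4:3 core inside a 1920x1080 drawable as sample numbers. The types `vec2f` and `viewport4f` and the function `core_uv_for_fragment` are hypothetical stand-ins for illustration, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y; } vec2f;             /* stand-in for float2 */
typedef struct { float x, y, w, h; } viewport4f;  /* xy = origin, wh = size */

/* Maps a fragment position in drawable pixels to a UV across the core
 * viewport and reports whether it falls inside the rect, mirroring the
 * core_uv / in_rect computation in the fragment shader. */
bool core_uv_for_fragment(vec2f frag_px, viewport4f vp, vec2f *uv_out)
{
   assert(vp.w > 0.0f && vp.h > 0.0f);

   float u = (frag_px.x - vp.x) / vp.w;
   float v = (frag_px.y - vp.y) / vp.h;

   uv_out->x = u;
   uv_out->y = v;
   return u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f;
}
```

For a viewport of origin (240, 0) and size (1440, 1080), the fragment at (960, 540) maps to UV (0.5, 0.5) and is inside the rect, while (100, 540) lands in the left pillarbox and is rejected, which is the case where the shader returns float4(0,0,0,0).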

gfx/common/metal/metal_shader_types.h

Lines changed: 3 additions & 2 deletions
@@ -113,14 +113,15 @@ typedef struct
    matrix_float4x4 mvp;
    vector_float4 SourceSize; /* xy = size, zw = 1/size */
    vector_float4 OutputSize; /* xy = size, zw = 1/size */
-   float BrightnessNits; /* paper-white in nits */
+   vector_float4 CoreViewport; /* xy = origin, zw = size (pixels, drawable-space) */
+   float BrightnessNits; /* core paper-white in nits */
    unsigned int SubpixelLayout; /* 0=RGB, 1=RBG, 2=BGR */
    float Scanlines; /* >0 enables CRT scanline/mask pass */
    unsigned int ExpandGamut; /* 0=accurate, 1=expanded709, 2=P3, 3=super */
    float InverseTonemap; /* >0 applies SDR->HDR inverse tonemap */
    float HDR10; /* >0 applies linear->PQ encode */
    unsigned int HDRMode; /* 0 off, 1 HDR10, 2 scRGB, 3 PQ->scRGB */
-   float _pad0; /* keep 16-byte alignment */
+   float PaperWhiteNits; /* UI paper-white for SDR overlay blend */
 } HDRUniforms;

 #endif
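
Replacing `_pad0` with `PaperWhiteNits` keeps the struct size a multiple of 16 bytes, which is what the old padding comment was protecting. The sketch below checks the new layout in plain C11. It assumes `vector_float4` is 16 bytes with 16-byte alignment and `matrix_float4x4` is four such columns; `vec4f`, `mat4f` and `HDRUniformsSketch` are hypothetical stand-ins, not the real simd or driver types.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the simd types (assumed size and alignment, see above). */
typedef struct { _Alignas(16) float v[4]; } vec4f;
typedef struct { vec4f columns[4]; } mat4f;

/* Field order copied from the post-commit HDRUniforms. */
typedef struct
{
   mat4f        mvp;
   vec4f        SourceSize;
   vec4f        OutputSize;
   vec4f        CoreViewport;
   float        BrightnessNits;
   unsigned int SubpixelLayout;
   float        Scanlines;
   unsigned int ExpandGamut;
   float        InverseTonemap;
   float        HDR10;
   unsigned int HDRMode;
   float        PaperWhiteNits;
} HDRUniformsSketch;

/* Eight 4-byte scalars follow the vec4 block, so no explicit pad is needed. */
static_assert(sizeof(HDRUniformsSketch) % 16 == 0,
              "uniform block size must stay a multiple of 16");
static_assert(offsetof(HDRUniformsSketch, CoreViewport) % 16 == 0,
              "float4 member must start on a 16-byte boundary");
```

Under these assumptions the new float4 sits directly after OutputSize, and the eight trailing scalars fill exactly two 16-byte rows.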

gfx/common/metal_common.h

Lines changed: 6 additions & 0 deletions
@@ -225,6 +225,12 @@ typedef NS_ENUM(NSUInteger, ViewportResetMode) {
 - (void)setHDRScanlines:(bool)scanlines;
 - (void)setHDRSubpixelLayout:(unsigned)layout;

+/* (Re)allocate HDR-mode offscreen textures (readback landing pad and
+ * SDR UI overlay) to match a new drawable size. Called from
+ * setViewportWidth:height: on window resize; cheap no-op when the
+ * current allocations already match. */
+- (void)resizeHDRResourcesForWidth:(NSUInteger)w height:(NSUInteger)h;
+
 /* Shader-emitted HDR path: set by FrameView after parsing a shader preset,
  * tells the composite fragment to pass the final pass through without
  * inverse-tonemap / PQ encode. */
