Skip to content

fix: make Rive data binding thread-safe on iOS & Android (#297)#298

Draft
mfazekas wants to merge 18 commits into
mainfrom
fix/issue-297-databinding-thread-safety
Draft

fix: make Rive data binding thread-safe on iOS & Android (#297)#298
mfazekas wants to merge 18 commits into
mainfrom
fix/issue-297-databinding-thread-safety

Conversation

@mfazekas

@mfazekas mfazekas commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fixes the occasional crashes in #297. The Rive data-binding objects aren't thread-safe and must be driven from a single thread, but the Nitro layer accessed them from several threads with no synchronization:

  • iOS — the Rive SDK plus the Obj-C RiveDataBindingViewModelInstance / C++ ViewModelInstanceRuntime caches are main-thread-only, yet they were touched from the JS thread (sync accessors), the Swift cooperative pool (every Promise.async), and the main render loop. Concurrent access corrupts those caches → the three crash stacks in the issue.
  • AndroidBaseHybridViewModelPropertyImpl iterated its listener map on a Dispatchers.Default flow collector while addListener/removeListener mutated it from the JS thread → ConcurrentModificationException (this is what crashed the test-harness-android CI job).

The fix

iOS — route all data-binding access through the main thread:

  • MainThread.run — runs inline on main, else hops via DispatchQueue.main.sync.
  • Promise.onMain — runs the work on the main thread and resolves the promise right there on the calling thread, instead of resolving it later from a background thread. Nitro's Promise has an unlocked listener list; resolving it on the spot (before Nitro attaches .then) means no background resolve can race that .then, which clears the Nitro Promise-layer races too, not just the Rive ones.
  • Applied to HybridViewModelInstance, HybridViewModel, every HybridViewModel*Property, the property listener helper, and the deprecated imperative RiveView input/text-run methods (set/getNumberInputValue, set/getBooleanInputValue, triggerInput, set/getTextRunValue).

AndroidBaseHybridViewModelPropertyImpl now uses a ConcurrentHashMap and iterates a toList() snapshot in onChanged, so dispatch is safe even if a callback adds/removes a listener mid-iteration.

Verification (Thread Sanitizer)

Built the iOS example with -enableThreadSanitizer YES and ran a concurrent data-binding stress (viewModelAsync + synchronous property access on the same instance). TSan reports data races on the unfixed build and none after the fix; under heavy load the unfixed build crashes, the fixed one doesn't.

On Android, the harness test hits the ConcurrentModificationException crash before the fix and passes after.

Tests

react-native-harness tests — they always pass, the sanitizer is the real assertion (unfixed runtime → race reported / crash; fixed → clean):

  • example/src/__tests__/issue297.databinding.test.tsviewModelAsync (off-main) concurrent with synchronous property access. Raise CONCURRENCY/ROUNDS to reproduce the hard crash (what took down the Android CI job).
  • example/src/__tests__/issue297.hooks.test.tsx — the same bug via the public use* hooks: a live RiveView advances the state machine on the render thread while useRiveNumber churns the property (writes + addListener) on the JS thread.

example/src/reproducers/Issue297ThreadRace.tsx is a manual in-app reproducer.

Running the harness tests under Thread Sanitizer (iOS)

TSan is a compile flag, and the harness only drives a prebuilt app (it has no build step), so it can't be a harness option — it's the normal build plus one xcodebuild flag. yarn ios builds and installs the app, then the harness drives it:

cd example
yarn ios --no-packager --extra-params "-enableThreadSanitizer YES"
yarn test:harness:ios --testPathPattern issue297

--no-packager stops run-ios from starting Metro (the harness starts its own). The harness auto-targets the booted simulator (see rn-harness.config.mjs). TSan reports land in /tmp/tsan_harness.<pid>; grep for WARNING: ThreadSanitizer: data race — present on the unfixed runtime, absent after the fix.

Prefer an explicit build + install (what CI does)? Note build:ios needs --buildFolder to land in ios/build:

yarn build:ios --buildFolder build --extra-params "-enableThreadSanitizer YES"
xcrun simctl install booted ios/build/Build/Products/Debug-iphonesimulator/RiveExample.app
yarn test:harness:ios --testPathPattern issue297

Or enable Thread Sanitizer in the Xcode scheme (Run → Diagnostics) and ⌘R once to build+install.

@mfazekas mfazekas changed the title fix(ios): make data binding thread-safe (#297) fix: make Rive data binding thread-safe on iOS & Android (#297) Jun 19, 2026
mfazekas added 17 commits June 19, 2026 09:29
The Rive iOS runtime (RiveViewModel/RiveView, the Obj-C
RiveDataBindingViewModelInstance caches, and the C++ ViewModelInstanceRuntime
maps) is not thread-safe and must be driven from the main thread, where the
state machine advances and property listeners fire.

The Nitro layer accessed it from three threads with no synchronization:
- main render thread (state machine advance / afterUpdate reconfigure)
- JS thread (synchronous property accessors)
- Swift cooperative pool (every `Promise.async` block)

Concurrent access corrupts those caches, producing the crashes in #297
(NSMutableDictionary mutation, unordered_map free-of-unallocated, artboard
instancing). Thread Sanitizer flagged ~25 data races at the data-binding
boundary.

Route all data-binding access through the main thread:
- add `MainThread.run` (runs inline on main, hops via `DispatchQueue.main.sync`
  otherwise) and `Promise.onMain` (does the work on main and returns an
  already-resolved Promise, so Nitro's `.then` never races a background resolve)
- apply to HybridViewModelInstance, HybridViewModel, every
  HybridViewModel*Property, and the property listener helper

After: TSan reports 0 races under the same stress; data binding still renders.
- example/src/__tests__/issue297.databinding.test.ts: on-device harness test
  that drives `viewModelAsync` (resolved off-main) concurrently with synchronous
  property access on the same ViewModelInstance. It stays green; built with
  Thread Sanitizer it reports the data race (unfixed) or is clean (fixed). Turn
  CONCURRENCY/ROUNDS up to reproduce the hard crash.
- rn-harness.config.mjs: HARNESS_TSAN-gated TSAN_OPTIONS via appLaunchOptions,
  and HARNESS_METRO_PORT override (defaults unchanged).
- Issue297ThreadRace.tsx: manual reproducer screen (same stress in-app).
- gitignore the .harness/ build cache.
Renders a live RiveView (state machine on the render thread) and churns the same
property from the JS thread through useRiveNumber (writes + addListener via remounts).
Under a TSan build it flags the listener race on the unfixed runtime, clean on the
fixed one.
…ad (#297)

setNumberInputValue/getNumberInputValue, setBooleanInputValue/getBooleanInputValue,
triggerInput, and setTextRunValue/getTextRunValue touch the artboard/state machine
from the JS thread. Funnel them through MainThread.run, same as the data-binding API.
…modification (#297)

onChanged() iterates the listener map on a Dispatchers.Default flow collector while
addListener/removeListener mutate it from the JS/native thread, throwing
ConcurrentModificationException and crashing the app (the Android analogue of the
data-binding race fixed on iOS).

Use ConcurrentHashMap and iterate over a toList() snapshot so dispatch is safe even
if a callback adds/removes a listener mid-iteration.
…under TSan in one step

The harness only drives a prebuilt app, so Thread Sanitizer (a compile flag) must be
baked into the app rather than passed to the harness. This script builds the example
with -enableThreadSanitizer YES, installs it on the booted simulator, and runs the
harness against it.
TSAN_OPTIONS is only read by the TSan runtime and ignored on a non-TSan build, so the
HARNESS_TSAN gate was pointless. Always set it; running under TSan is now just
'yarn test:harness:ios' once the app is built with -enableThreadSanitizer YES.
Auto-detect the booted simulator in rn-harness.config.mjs (override with
DEVICE_MODEL / IOS_VERSION; falls back to the previous fixed default). Now
'yarn ios' + 'yarn test:harness:ios' work without naming a device. Drop the now-
redundant device detection from the TSan script.
Booted-sim auto-detection is ambiguous with more than one booted device (it and
'simctl install booted' can disagree). Collect all booted sims and warn which one
is used, pointing at DEVICE_MODEL / IOS_VERSION to pin it.
The 3-step TSan flow (build:ios --extra-params, simctl install booted,
test:harness:ios) is simple enough and the harness now auto-targets the booted
simulator, so the wrapper script just duplicated it (via a different build path).
@mfazekas mfazekas force-pushed the fix/issue-297-databinding-thread-safety branch from 92cbb5d to 69526cd Compare June 19, 2026 07:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant