
Commit f853239

feat(tauri): add embedded WebDriver server health check functionality (#238)
* feat(tauri): add embedded WebDriver server health check functionality
  - Introduced `checkEmbeddedServerAlive` function to verify the reachability of the embedded WebDriver server.
  - Enhanced `TauriLaunchService` with `ensureEmbeddedServersHealthy` method to monitor and restart embedded WebDriver instances if they become unreachable, particularly addressing issues on Windows.
  - Updated tests to cover the new health check functionality and ensure proper behavior during server status checks.

* feat(tauri): enhance embedded WebDriver server health check with configurable timeout
  - Updated `checkEmbeddedServerAlive` to accept a customizable timeout parameter for the server status check, improving flexibility in various environments.
  - Modified `TauriLaunchService` to utilize the new timeout option from the configuration for health checks.
  - Added `statusPollTimeout` option in `TauriServiceOptions` to allow users to specify the timeout duration, addressing issues in slow CI environments.
  - Adjusted tests to reflect the changes in the health check function and ensure proper handling of the new timeout parameter.

* feat(tauri): refactor embedded WebDriver health check and add stability probes for Windows
  - Refactored `ensureEmbeddedServersHealthy` method to utilize a new `restartEmbeddedServer` helper function for improved readability and maintainability.
  - Introduced `verifyEmbeddedServerStable` method to perform additional stability checks on Windows after a successful health check, reducing the risk of race conditions.
  - Updated tests to reflect changes in method names and added new test cases for the stability probes, ensuring comprehensive coverage of the new functionality.

* feat(tauri): enhance mock update process with retry logic and error handling
  - Introduced a `sleep` function to facilitate retries for mock updates that fail on the first attempt.
  - Updated `updateAllMocks` to handle mock updates sequentially on Windows, preventing concurrent execution issues with WebView2.
  - Implemented detailed error handling to throw an `AggregateError` when mock updates fail, including the failing mock IDs in the error message.
  - Added comprehensive tests to validate the new retry logic and error handling for mock updates, ensuring robustness across different scenarios.

* feat(tauri-plugin-webdriver): implement script execution locks for concurrent WebView2 calls on Windows
  - Added `ScriptExecutionLocks` to serialize concurrent `ExecuteScript` calls per webview, preventing potential completion handler drops and invalid states.
  - Updated `init_with_port` to manage the new `ScriptExecutionLocks` alongside existing async script state management.
  - Enhanced platform module exports to include `ScriptExecutionLocks` for better accessibility in the Windows environment.

* fix(tauri): improve error logging during mock update retries
  - Enhanced the error handling in the `tryUpdate` function to log detailed debug information when a mock update fails, including the mock ID and the error encountered.
  - This change aims to facilitate better debugging and tracking of issues during the mock update process.
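The mock-update changes described in the commit message are not among the file diffs shown on this page. As a rough illustration only, the retry-then-aggregate behavior could look like the sketch below. The `MockLike` interface, the `updateAllMocksSketch` name, the single retry, and the `sequential` flag are assumptions made for this example and are not taken from the actual implementation.

```typescript
// Hypothetical mock shape; the real mock type is not shown in this commit view.
interface MockLike {
  id: string;
  update(): Promise<void>;
}

const sleep = (ms: number): Promise<void> => new Promise((resolve) => setTimeout(resolve, ms));

// Attempt an update, retry once after a delay, and report the final error instead of throwing.
async function tryUpdate(mock: MockLike, retryDelayMs: number): Promise<Error | null> {
  try {
    await mock.update();
    return null;
  } catch (firstError) {
    console.debug(`Mock ${mock.id} update failed, retrying:`, firstError);
    await sleep(retryDelayMs);
    try {
      await mock.update();
      return null;
    } catch (secondError) {
      return secondError instanceof Error ? secondError : new Error(String(secondError));
    }
  }
}

// Assumed wrapper: run updates one at a time when `sequential` is true, otherwise concurrently.
async function updateAllMocksSketch(mocks: MockLike[], sequential: boolean, retryDelayMs = 50): Promise<void> {
  const errors: Array<Error | null> = [];
  if (sequential) {
    for (const mock of mocks) {
      errors.push(await tryUpdate(mock, retryDelayMs));
    }
  } else {
    errors.push(...(await Promise.all(mocks.map((mock) => tryUpdate(mock, retryDelayMs)))));
  }
  const failedIds = mocks.filter((_, i) => errors[i] !== null).map((mock) => mock.id);
  if (failedIds.length > 0) {
    const causes = errors.filter((e): e is Error => e !== null);
    // The failing mock IDs go into the message so a test log points at the culprit directly.
    throw new AggregateError(causes, `Failed to update mocks: ${failedIds.join(', ')}`);
  }
}
```

A caller following the commit message would pass `sequential = true` on Windows (for example, `process.platform === 'win32'`) so that updates never overlap inside WebView2.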
1 parent e122474 commit f853239

10 files changed

Lines changed: 650 additions & 85 deletions


packages/tauri-plugin-webdriver/src/lib.rs

Lines changed: 3 additions & 0 deletions
@@ -52,6 +52,9 @@ pub fn init_with_port<R: Runtime>(port: u16) -> TauriPlugin<R> {
 // Manage async script state for native message handlers (Windows only)
 #[cfg(target_os = "windows")]
 app.manage(platform::AsyncScriptState::default());
+// Serialize concurrent ExecuteScript calls per webview (Windows only)
+#[cfg(target_os = "windows")]
+app.manage(platform::ScriptExecutionLocks::default());

 // Manage per-window alert state
 app.manage(platform::AlertStateManager::default());

packages/tauri-plugin-webdriver/src/platform/mod.rs

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,8 @@ pub use executor::*;

 #[cfg(target_os = "windows")]
 pub use windows::AsyncScriptState;
+#[cfg(target_os = "windows")]
+pub use windows::ScriptExecutionLocks;

 #[cfg(target_os = "macos")]
 mod macos;

packages/tauri-plugin-webdriver/src/platform/windows.rs

Lines changed: 98 additions & 67 deletions
@@ -34,6 +34,25 @@ use crate::webdriver::Timeouts;
 /// Handler name used for postMessage calls
 const HANDLER_NAME: &str = "webdriver_async";

+/// Serializes concurrent WebView2 ExecuteScript calls per webview window.
+/// On Windows, issuing multiple concurrent ExecuteScript calls against the same
+/// CoreWebView2 can cause completion handlers to be silently dropped or the
+/// webview to enter an invalid state, causing script timeouts or app crashes.
+/// A per-label tokio::sync::Mutex ensures only one script executes at a time.
+#[derive(Default)]
+pub struct ScriptExecutionLocks {
+    locks: std::sync::Mutex<HashMap<String, Arc<tokio::sync::Mutex<()>>>>,
+}
+
+impl ScriptExecutionLocks {
+    pub fn get(&self, label: &str) -> Arc<tokio::sync::Mutex<()>> {
+        let mut m = self.locks.lock().expect("ScriptExecutionLocks poisoned");
+        m.entry(label.to_string())
+            .or_insert_with(|| Arc::new(tokio::sync::Mutex::new(())))
+            .clone()
+    }
+}
+
 /// Shared state for pending async script operations.
 /// This is managed via Tauri's state system (`app.manage()`).
 #[derive(Default)]
@@ -109,69 +128,15 @@ impl<R: Runtime> WindowsExecutor<R> {
     }
 }

-/// Register `WebView2` handlers at webview creation time.
-/// This is called from the plugin's `on_webview_ready` hook to ensure
-/// the script dialog handler is registered before any navigation completes.
-pub fn register_webview_handlers<R: Runtime>(webview: &tauri::Webview<R>) {
-    // Get per-window alert state from the manager
-    let manager = webview.app_handle().state::<AlertStateManager>();
-    let alert_state = manager.get_or_create(webview.label());
-
-    let _ = webview.with_webview(move |webview| unsafe {
-        let _ = CoInitializeEx(None, COINIT_APARTMENTTHREADED);
-
-        if let Ok(webview2) = webview.controller().CoreWebView2() {
-            // Disable default script dialogs so ScriptDialogOpening event fires
-            if let Ok(settings) = webview2.Settings() {
-                if let Err(e) = settings.SetAreDefaultScriptDialogsEnabled(false) {
-                    tracing::error!("Failed to disable default script dialogs: {e:?}");
-                    return;
-                }
-            } else {
-                tracing::error!("Failed to get webview settings");
-                return;
-            }
-
-            let handler: ICoreWebView2ScriptDialogOpeningEventHandler =
-                ScriptDialogOpeningHandler::new(alert_state).into();
-
-            let mut token = std::mem::zeroed();
-            if let Err(e) = webview2.add_ScriptDialogOpening(&handler, &raw mut token) {
-                tracing::error!("Failed to register ScriptDialogOpening handler: {e:?}");
-            } else {
-                tracing::debug!("Registered script dialog handler for webview");
-            }
-
-            // Prevent handler from being dropped - leak it to keep the COM ref alive
-            std::mem::forget(handler);
-        }
-    });
-}
-
-#[async_trait]
-impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
-    // =========================================================================
-    // Window Access
-    // =========================================================================
-
-    fn window(&self) -> &WebviewWindow<R> {
-        &self.window
-    }
-
-    fn script_timeout_ms(&self) -> u64 {
-        self.timeouts.script_ms
-    }
-
-    // =========================================================================
-    // Core JavaScript Execution
-    // =========================================================================
-
-    async fn evaluate_js(&self, script: &str) -> Result<Value, WebDriverErrorResponse> {
+impl<R: Runtime + 'static> WindowsExecutor<R> {
+    /// Core WebView2 script execution — no per-webview lock.
+    /// Callers that need serialization must acquire the lock from
+    /// `ScriptExecutionLocks` before calling this method.
+    async fn evaluate_js_inner(&self, script: &str) -> Result<Value, WebDriverErrorResponse> {
         let (tx, rx) = oneshot::channel();
         let script_preview: String = script.chars().take(100).collect();
         let script_owned = wrap_script_for_frame_context(script, &self.frame_context);

-        // Wrap tx in Arc<Mutex<Option<...>>> early so we can send errors from failure paths
         let tx = Arc::new(std::sync::Mutex::new(Some(tx)));

         let result = self.window.with_webview({
@@ -182,13 +147,10 @@ impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
                 if let Ok(webview2) = webview.controller().CoreWebView2() {
                     let script_hstring = HSTRING::from(&script_owned);

-                    // Clone tx for the handler - we keep a reference for error handling
                     let handler: ICoreWebView2ExecuteScriptCompletedHandler =
                         ExecuteScriptHandler::new(tx.clone()).into();

                     if let Err(e) = webview2.ExecuteScript(PCWSTR(script_hstring.as_ptr()), &handler) {
-                        // Failed to start script execution - send error through channel
-                        // The handler won't be called, so we need to send the error ourselves
                         tracing::error!("ExecuteScript call failed for script '{}...': {e:?}", script_preview);
                         if let Ok(mut guard) = tx.lock() {
                             if let Some(tx) = guard.take() {
@@ -197,7 +159,6 @@ impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
                         }
                     }
                 } else {
-                    // Failed to get CoreWebView2 - send error through channel
                     tracing::error!("Failed to get CoreWebView2 for script execution");
                     if let Ok(mut guard) = tx.lock() {
                         if let Some(tx) = guard.take() {
@@ -209,7 +170,6 @@ impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
         });

         if let Err(e) = result {
-            // with_webview itself failed - try to send error if channel still open
             tracing::error!("with_webview failed: {e}");
             if let Ok(mut guard) = tx.lock() {
                 if let Some(tx) = guard.take() {
@@ -232,6 +192,71 @@ impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
             Err(_) => Err(WebDriverErrorResponse::script_timeout()),
         }
     }
+}
+
+/// Register `WebView2` handlers at webview creation time.
+/// This is called from the plugin's `on_webview_ready` hook to ensure
+/// the script dialog handler is registered before any navigation completes.
+pub fn register_webview_handlers<R: Runtime>(webview: &tauri::Webview<R>) {
+    // Get per-window alert state from the manager
+    let manager = webview.app_handle().state::<AlertStateManager>();
+    let alert_state = manager.get_or_create(webview.label());
+
+    let _ = webview.with_webview(move |webview| unsafe {
+        let _ = CoInitializeEx(None, COINIT_APARTMENTTHREADED);
+
+        if let Ok(webview2) = webview.controller().CoreWebView2() {
+            // Disable default script dialogs so ScriptDialogOpening event fires
+            if let Ok(settings) = webview2.Settings() {
+                if let Err(e) = settings.SetAreDefaultScriptDialogsEnabled(false) {
+                    tracing::error!("Failed to disable default script dialogs: {e:?}");
+                    return;
+                }
+            } else {
+                tracing::error!("Failed to get webview settings");
+                return;
+            }
+
+            let handler: ICoreWebView2ScriptDialogOpeningEventHandler =
+                ScriptDialogOpeningHandler::new(alert_state).into();
+
+            let mut token = std::mem::zeroed();
+            if let Err(e) = webview2.add_ScriptDialogOpening(&handler, &raw mut token) {
+                tracing::error!("Failed to register ScriptDialogOpening handler: {e:?}");
+            } else {
+                tracing::debug!("Registered script dialog handler for webview");
+            }
+
+            // Prevent handler from being dropped - leak it to keep the COM ref alive
+            std::mem::forget(handler);
+        }
+    });
+}
+
+#[async_trait]
+impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
+    // =========================================================================
+    // Window Access
+    // =========================================================================
+
+    fn window(&self) -> &WebviewWindow<R> {
+        &self.window
+    }
+
+    fn script_timeout_ms(&self) -> u64 {
+        self.timeouts.script_ms
+    }
+
+    // =========================================================================
+    // Core JavaScript Execution
+    // =========================================================================
+
+    async fn evaluate_js(&self, script: &str) -> Result<Value, WebDriverErrorResponse> {
+        let locks = self.window.state::<ScriptExecutionLocks>();
+        let lock = locks.get(self.window.label());
+        let _guard = lock.lock().await;
+        self.evaluate_js_inner(script).await
+    }

     // =========================================================================
     // Screenshots
@@ -583,10 +608,16 @@ impl<R: Runtime + 'static> PlatformExecutor<R> for WindowsExecutor<R> {
             }})()"
         );

-        // Execute the wrapper (returns immediately)
-        self.evaluate_js(&wrapper).await?;
+        // Acquire the per-webview lock and hold it across the entire async script lifecycle.
+        // This prevents another ExecuteScript from preempting the in-flight async JS callback.
+        let locks = self.window.state::<ScriptExecutionLocks>();
+        let lock = locks.get(self.window.label());
+        let _guard = lock.lock().await;
+
+        // Execute the wrapper using the unlocked inner method (we already hold the lock).
+        self.evaluate_js_inner(&wrapper).await?;

-        // Wait for result with timeout
+        // Wait for result with timeout (lock is still held; released when _guard drops).
         let timeout_ms = self.timeouts.script_ms;
         let timeout = std::time::Duration::from_millis(timeout_ms);
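The per-label locking in `windows.rs` follows a general "keyed mutex" pattern: one lock per webview label, created lazily, with callers on the same label queued behind each other. The sketch below restates that pattern in TypeScript purely for illustration. `KeyedLocks` and `runExclusive` are hypothetical names that do not exist in the plugin, and a promise chain stands in for `tokio::sync::Mutex`.

```typescript
// Illustrative keyed lock: tasks sharing a key run one at a time, in arrival order.
class KeyedLocks {
  // For each key, a promise that settles when the most recently queued task has finished.
  private tails = new Map<string, Promise<void>>();

  async runExclusive<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // Later callers for this key wait until `current` resolves.
    this.tails.set(key, previous.then(() => current));
    await previous;
    try {
      return await task();
    } finally {
      release(); // Comparable to dropping `_guard` in the Rust code.
    }
  }
}
```

As with `tokio::sync::Mutex`, this lock is not reentrant: a task that calls `runExclusive` again with the same key while it is still running would wait on itself. That is the reason the diff separates the unlocked `evaluate_js_inner` from the locked `evaluate_js`, so the async-script path can take the lock once and call the inner method directly.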

packages/tauri-service/src/embeddedProvider.ts

Lines changed: 14 additions & 0 deletions
@@ -205,6 +205,20 @@ export async function stopEmbeddedDriver(info: EmbeddedDriverInfo): Promise<void
   child.kill('SIGKILL');
 }

+/**
+ * Check if the embedded WebDriver server is reachable on the given port
+ */
+export async function checkEmbeddedServerAlive(port: number, timeoutMs: number = 2000): Promise<boolean> {
+  try {
+    const response = await fetch(`http://127.0.0.1:${port}/status`, {
+      signal: AbortSignal.timeout(timeoutMs),
+    });
+    return response.ok;
+  } catch {
+    return false;
+  }
+}
+
 /**
  * Check if embedded provider should be used
  * Returns true when no driverProvider is configured (embedded is the default)

packages/tauri-service/src/launcher.ts

Lines changed: 84 additions & 0 deletions
@@ -13,6 +13,7 @@ import { ensureTauriDriver, findTestRunnerBackend } from './driverManager.js';
 import { DriverPool } from './driverPool.js';
 import { ensureMsEdgeDriver } from './edgeDriverManager.js';
 import {
+  checkEmbeddedServerAlive,
   type EmbeddedDriverInfo,
   getEmbeddedPort,
   isEmbeddedProvider,
@@ -73,6 +74,8 @@ export default class TauriLaunchService {
   private backendPortManager: PortManager;
   private driverPool: DriverPool;
   private embeddedProcesses: Map<string, EmbeddedDriverInfo> = new Map();
+  private embeddedConfigs: Map<string, { appBinaryPath: string; port: number; options: TauriServiceOptions }> =
+    new Map();
   private isEmbeddedMode: boolean = false;
   private workerBackends: Map<string, { proc: ChildProcess; port: number }> = new Map();
   private cnCycleConfigs?: Array<{
@@ -310,6 +313,7 @@ export default class TauriLaunchService {
 try {
   const driverInfo = await startEmbeddedDriver(appBinaryPath, embeddedPort, instanceOptions, instanceId);
   this.embeddedProcesses.set(instanceId, driverInfo);
+  this.embeddedConfigs.set(instanceId, { appBinaryPath, port: embeddedPort, options: instanceOptions });
 } catch (error) {
   throw new SevereServiceError(`Failed to start embedded WebDriver for ${key}: ${(error as Error).message}`);
 }
@@ -494,6 +498,7 @@ export default class TauriLaunchService {
 try {
   const driverInfo = await startEmbeddedDriver(appBinaryPath, embeddedPort, instanceOptions, String(i));
   this.embeddedProcesses.set(String(i), driverInfo);
+  this.embeddedConfigs.set(String(i), { appBinaryPath, port: embeddedPort, options: instanceOptions });
 } catch (error) {
   throw new SevereServiceError(
     `Failed to start embedded WebDriver for instance ${i}: ${(error as Error).message}`,
@@ -714,12 +719,90 @@ export default class TauriLaunchService {
       }
     }

+    // For embedded mode (non-per-worker), check if the app is still alive and restart if needed.
+    // On Windows the embedded Tauri process can crash after a WebDriver session is deleted,
+    // causing ECONNREFUSED for all subsequent spec files.
+    if (this.isEmbeddedMode && !this.perWorkerMode) {
+      await this.ensureEmbeddedServersHealthy();
+    }
+
     // Run environment diagnostics
     await this.diagnoseEnvironment(this.appBinaryPath);

     log.debug(`Tauri worker session started: ${cid}`);
   }

+  /**
+   * Health-check all embedded WebDriver servers and restart any that have crashed.
+   * Required on Windows where the Tauri process can die after session deletion,
+   * making all subsequent spec files fail with ECONNREFUSED.
+   *
+   * On Windows, additional stability probes run after a passing check to close the
+   * race window between the health check and WDIO's POST /session.
+   */
+  private async ensureEmbeddedServersHealthy(): Promise<void> {
+    for (const [instanceId, config] of this.embeddedConfigs) {
+      const isAlive = await checkEmbeddedServerAlive(config.port, config.options.statusPollTimeout);
+      if (!isAlive) {
+        await this.restartEmbeddedServer(instanceId, config);
+      } else {
+        log.debug(`Embedded WebDriver on port ${config.port} (instance: ${instanceId}) is healthy`);
+        if (process.platform === 'win32') {
+          await this.verifyEmbeddedServerStable(instanceId, config);
+        }
+      }
+    }
+  }
+
+  private async restartEmbeddedServer(
+    instanceId: string,
+    config: { appBinaryPath: string; port: number; options: TauriServiceOptions },
+  ): Promise<void> {
+    log.warn(`Embedded WebDriver on port ${config.port} (instance: ${instanceId}) is unreachable — restarting...`);
+    const existing = this.embeddedProcesses.get(instanceId);
+    if (existing) {
+      try {
+        await stopEmbeddedDriver(existing);
+      } catch {
+        // Process may already be dead; ignore
+      }
+    }
+    try {
+      const newInfo = await startEmbeddedDriver(config.appBinaryPath, config.port, config.options, instanceId);
+      this.embeddedProcesses.set(instanceId, newInfo);
+      log.info(`✅ Embedded WebDriver restarted on port ${config.port} (instance: ${instanceId})`);
+    } catch (error) {
+      throw new SevereServiceError(
+        `Failed to restart embedded WebDriver on port ${config.port}: ${(error as Error).message}`,
+      );
+    }
+  }
+
+  /**
+   * After a passing health check on Windows, re-probe several times at short intervals
+   * to catch a server that is about to crash. Closes the race window between the
+   * initial probe and WDIO's first POST /session request.
+   */
+  private async verifyEmbeddedServerStable(
+    instanceId: string,
+    config: { appBinaryPath: string; port: number; options: TauriServiceOptions },
+  ): Promise<void> {
+    const PROBE_COUNT = 3;
+    const PROBE_INTERVAL_MS = 500;
+    for (let i = 0; i < PROBE_COUNT; i++) {
+      await new Promise<void>((resolve) => setTimeout(resolve, PROBE_INTERVAL_MS));
+      const isAlive = await checkEmbeddedServerAlive(config.port, config.options.statusPollTimeout);
+      if (!isAlive) {
+        log.warn(
+          `Embedded WebDriver on port ${config.port} died during stability check (probe ${i + 1}/${PROBE_COUNT}), restarting...`,
+        );
+        await this.restartEmbeddedServer(instanceId, config);
+        return;
+      }
+    }
+    log.debug(`Embedded WebDriver on port ${config.port} passed ${PROBE_COUNT} stability probes`);
+  }
+
   /**
    * Diagnose the environment before running tests
    */
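The stability-probe loop in the hunk above can be exercised in isolation by injecting the probe and restart functions. The sketch below is a standalone restatement for illustration, not code from the service. `verifyStableSketch`, its parameters, and the returned outcome string are assumptions added here so the control flow can be tested without a real Tauri process.

```typescript
type ProbeOutcome = 'stable' | 'restarted';

// Re-probe `probeCount` times, `intervalMs` apart; restart and stop at the first failed probe.
async function verifyStableSketch(
  probe: () => Promise<boolean>,
  restart: () => Promise<void>,
  probeCount = 3,
  intervalMs = 500,
): Promise<ProbeOutcome> {
  for (let i = 0; i < probeCount; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    if (!(await probe())) {
      await restart();
      return 'restarted';
    }
  }
  return 'stable';
}
```

With the values used in the diff (3 probes, 500 ms apart), a healthy server adds at least 1.5 seconds of waiting per embedded instance before the worker session proceeds, plus the time each `/status` request takes.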
@@ -901,6 +984,7 @@ export default class TauriLaunchService {

     await this.driverPool.stopAll();
     this.instanceOptions.clear();
+    this.embeddedConfigs.clear();
     this.portManager.clear();
     this.backendPortManager.clear();
