website_and_docs/content/blog/2026/selenium-grid-4-41-deep-dive.md
Lines changed: 6 additions & 6 deletions
@@ -309,7 +309,7 @@ Previously, the Video container used a timer-based heuristic to decide when to s
### After: event-driven recording
-
With the new [`Video/video_service.py`](https://github.com/SeleniumHQ/docker-selenium/blob/4.41.0-20260222/Video/video_service.py) (865 lines), the video container subscribes to the Grid's **ZeroMQ event bus** directly. When a `session-created` event arrives, recording starts. When `session-closed` fires, recording stops and the uploader kicks in.
+
With the new [`Video/video_service.py`](https://github.com/SeleniumHQ/docker-selenium/blob/4.41.0-20260222/Video/video_service.py), the video container subscribes to the Grid's **ZeroMQ event bus** directly. When a `session-created` event arrives, recording starts. When `session-closed` fires, recording stops and the uploader kicks in.
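The event-driven flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `video_service.py`: the `RecorderState` class and its fields are hypothetical, and the ZeroMQ subscription is replaced by a plain list of `(event_type, session_id)` tuples. Only the event names `session-created` and `session-closed` come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class RecorderState:
    """Hypothetical helper that tracks which sessions are being recorded."""

    active: set = field(default_factory=set)
    uploaded: list = field(default_factory=list)

    def handle(self, event_type: str, session_id: str) -> None:
        if event_type == "session-created":
            # Recording starts as soon as the Grid announces the session.
            self.active.add(session_id)
        elif event_type == "session-closed" and session_id in self.active:
            # Recording stops and the finished file is handed to the uploader.
            self.active.remove(session_id)
            self.uploaded.append(session_id)


# Stand-in for messages received from the event bus (assumed shape).
events = [("session-created", "abc"), ("session-closed", "abc")]

state = RecorderState()
for event_type, session_id in events:
    state.handle(event_type, session_id)
```

Because the recorder reacts to explicit events rather than polling on a timer, start and stop are tied to the session lifecycle itself.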
- **Better observability**: native Prometheus metrics endpoint out of the box
-
The PR includes a comprehensive [`MIGRATION_INGRESS_NGINX_TO_TRAEFIK.md`](https://github.com/SeleniumHQ/docker-selenium/blob/4.41.0-20260222/charts/selenium-grid/MIGRATION_INGRESS_NGINX_TO_TRAEFIK.md) guide (180 lines) covering all common migration scenarios.
+
The PR includes a comprehensive [`MIGRATION_INGRESS_NGINX_TO_TRAEFIK.md`](https://github.com/SeleniumHQ/docker-selenium/blob/4.41.0-20260222/charts/selenium-grid/MIGRATION_INGRESS_NGINX_TO_TRAEFIK.md) guide covering all common migration scenarios.
### Migration in brief
@@ -407,21 +407,21 @@ Three significant Distributor fixes ship in 4.41.0, addressing real concurrency
The `LocalDistributor` runs periodic health checks against all registered nodes. Under high node counts, a subtle bug caused the health-check executor thread pool to accumulate tasks faster than they were being consumed, eventually exhausting threads and causing new session requests to stall.
-
The fix refactors [`LocalNodeRegistry.java`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/distributor/local/LocalNodeRegistry.java) to decouple the health-check scheduling from the main distribution path, with 295 lines of new tests specifically exercising concurrent health-check scenarios.
+
The fix refactors [`LocalNodeRegistry.java`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/distributor/local/LocalNodeRegistry.java) to decouple the health-check scheduling from the main distribution path, with new tests specifically exercising concurrent health-check scenarios.
[`ProxyNodeWebsockets`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/node/ProxyNodeWebsockets.java) tracks active WebSocket connections per node to respect `maxSessions`. A race condition in the connection bookkeeping could cause the counter to drift upward, making slots appear occupied when they were free. Over time this would cause nodes to appear artificially full.
-
The fix tightens the lifecycle management with try-finally guards around counter decrements, backed by 406 lines of dedicated unit tests.
+
The fix tightens the lifecycle management with try-finally guards around counter decrements.
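The try-finally pattern mentioned above can be illustrated with a small Python sketch. It is not the Java code in `ProxyNodeWebsockets`; `ConnectionCounter` and `proxy_connection` are hypothetical names, and the point is only that the decrement sits in a `finally` block so it runs even when the handler fails.

```python
import threading


class ConnectionCounter:
    """Hypothetical per-node counter of active connections."""

    def __init__(self, max_sessions: int) -> None:
        self._lock = threading.Lock()
        self._max = max_sessions
        self.active = 0

    def try_acquire(self) -> bool:
        with self._lock:
            if self.active >= self._max:
                return False
            self.active += 1
            return True

    def release(self) -> None:
        with self._lock:
            self.active -= 1


def proxy_connection(counter: ConnectionCounter, handler) -> bool:
    """Run handler while holding a slot; always give the slot back."""
    if not counter.try_acquire():
        return False
    try:
        handler()
    finally:
        # Runs on success and on failure, so the counter cannot drift upward.
        counter.release()
    return True


counter = ConnectionCounter(max_sessions=1)


def failing_handler():
    raise ConnectionError("peer went away")


try:
    proxy_connection(counter, failing_handler)
except ConnectionError:
    pass  # The slot was still released by the finally block.
```

Without the `finally`, the failed connection would leave `active` at 1 and the single slot would look occupied indefinitely.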
**Fix 3: Retry session on executor shutdown ([selenium#17109](https://github.com/SeleniumHQ/selenium/pull/17109), commit [`527a40b`](https://github.com/SeleniumHQ/selenium/commit/527a40b30f01b272c1b7df1de122f8b16e3f79ce))**
When a [`RemoteNode`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/node/remote/RemoteNode.java)'s thread executor enters a shutdown state (e.g., during a graceful drain), session creation requests could be silently dropped instead of being returned to the queue for redistribution. The Distributor now detects `RejectedExecutionException` from a shutting-down executor and transparently retries the session on another available node.
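As a rough analogue in Python, `ThreadPoolExecutor.submit` raises `RuntimeError` once the pool has been shut down, which plays the role of Java's `RejectedExecutionException` here. The sketch below is not the Distributor's code: `create_session` and the node list are hypothetical, and it only shows the idea of moving on to another node instead of dropping the request.

```python
from concurrent.futures import ThreadPoolExecutor


def create_session(nodes, request):
    """Try each (name, executor) pair in turn; skip shut-down executors."""
    for name, executor in nodes:
        try:
            future = executor.submit(lambda n=name: f"{request} on {n}")
        except RuntimeError:
            # The executor is shutting down: retry on the next node
            # rather than silently losing the request.
            continue
        return future.result()
    return None  # No node could accept the request.


draining = ThreadPoolExecutor(max_workers=1)
draining.shutdown()  # Simulates a node that is being drained.
healthy = ThreadPoolExecutor(max_workers=1)

result = create_session([("node-a", draining), ("node-b", healthy)], "session-1")
healthy.shutdown()
```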
-
[`LocalGridModel`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/distributor/local/LocalGridModel.java) contained a lock inversion risk between its internal state lock and the event bus listener lock. Under specific timing conditions this could deadlock the Distributor entirely. The fix restructures lock acquisition order with a dedicated test ([`LocalGridModelDeadlockTest.java`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/test/org/openqa/selenium/grid/distributor/local/LocalGridModelDeadlockTest.java), 275 lines) that explicitly reproduces the hazard.
+
[`LocalGridModel`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/distributor/local/LocalGridModel.java) contained a lock inversion risk between its internal state lock and the event bus listener lock. Under specific timing conditions this could deadlock the Distributor entirely. The fix restructures lock acquisition order with a dedicated test ([`LocalGridModelDeadlockTest.java`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/test/org/openqa/selenium/grid/distributor/local/LocalGridModelDeadlockTest.java)) that explicitly reproduces the hazard.
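One common way to remove a lock inversion is to make every code path take the two locks in the same order. The Python sketch below illustrates that general technique only; how `LocalGridModel` actually restructures its locking is not shown here, and the names `state_lock`, `listener_lock`, `update_state`, and `on_event` are hypothetical.

```python
import threading

state_lock = threading.Lock()
listener_lock = threading.Lock()
log = []


def update_state():
    # Order: state_lock first, then listener_lock.
    with state_lock:
        with listener_lock:
            log.append("state")


def on_event():
    # Same order as update_state. Taking listener_lock first here
    # would recreate the inversion and allow a deadlock.
    with state_lock:
        with listener_lock:
            log.append("event")


threads = [threading.Thread(target=fn) for fn in (update_state, on_event) * 50]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
```

With a single global order, no thread can hold one lock while waiting for a thread that holds the other, so all 100 threads finish.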
Together, these four fixes make the Distributor more robust at scale. If you have ever seen unexplained session queue stalls or nodes that appear full when they are not, upgrading to 4.41.0 is strongly recommended.
@@ -468,7 +468,7 @@ When Grid Standalone is secured with HTTP basic auth, [`DockerSessionFactory`](h
**Restore stereotype capability merging in RelaySessionFactory ([selenium#17097](https://github.com/SeleniumHQ/selenium/pull/17097), commit [`ac74b7e`](https://github.com/SeleniumHQ/selenium/commit/ac74b7e263e1308c3f5f8c666c8c2cb97da3e417))**
A regression introduced in a prior release caused [`RelaySessionFactory`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/node/relay/RelaySessionFactory.java) to ignore stereotype capabilities during session creation, which broke mobile relay sessions that relied on custom capabilities from the stereotype being merged into the session request. Restored and covered by 156 lines of new tests.
+
A regression introduced in a prior release caused [`RelaySessionFactory`](https://github.com/SeleniumHQ/selenium/blob/selenium-4.41.0/java/src/org/openqa/selenium/grid/node/relay/RelaySessionFactory.java) to ignore stereotype capabilities during session creation, which broke mobile relay sessions that relied on custom capabilities from the stereotype being merged into the session request.
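To make "merging stereotype capabilities" concrete, here is a minimal Python sketch. It is not the logic in `RelaySessionFactory`: `merge_capabilities` is a hypothetical helper, the capability values are made up, and the precedence rule (requested values win on conflict) is an assumption for illustration.

```python
def merge_capabilities(stereotype: dict, requested: dict) -> dict:
    """Start from the node's stereotype and overlay the requested capabilities."""
    merged = dict(stereotype)
    merged.update(requested)  # Assumption: the request wins on key conflicts.
    return merged


# Illustrative values only.
stereotype = {"platformName": "android", "appium:deviceName": "emulator-5554"}
requested = {"platformName": "android", "browserName": "chrome"}

merged = merge_capabilities(stereotype, requested)
```

If the stereotype were ignored, `appium:deviceName` would be missing from the session request, which is the kind of breakage the paragraph above describes for mobile relay sessions.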
**Kubernetes: structured logs support ([docker-selenium#3087](https://github.com/SeleniumHQ/docker-selenium/pull/3087), commit [`ccd697c`](https://github.com/SeleniumHQ/docker-selenium/commit/ccd697cef2b13904c628b5d968447df1e7c30ed4))**