
Commit 222e9c3

Merge pull request #36 from webdriverio/feature/session-recording
feat: Implement session step recording, history lifecycle
- Using `Resources` for session information
- Code and JSON Steps separately
- Use `isError` for faulty tool calling
2 parents 204c5b8 + ad7ff03 commit 222e9c3

33 files changed

Lines changed: 1008 additions & 29 deletions

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
 # Changelog

+## [2.4.1](https://github.com/webdriverio/mcp/compare/v2.4.0...v2.4.1) (2026-03-17)
+
+### Features
+
+* implement session step recording, history lifecycle, and WebdriverIO code generation ([6c41763](https://github.com/webdriverio/mcp/commit/6c417632fe05414659119e8531bab7394d81318d))
+
+### Bug Fixes
+
+* Use browser.$() for element instead of $() ([630201b](https://github.com/webdriverio/mcp/commit/630201b140b3f6d399ef452796c3a71feca8011a))
+
 ## [2.4.0](https://github.com/webdriverio/mcp/compare/v2.3.1...v2.4.0) (2026-03-17)

 ### Features

CLAUDE.md

Lines changed: 19 additions & 3 deletions
@@ -15,14 +15,18 @@ npm start # Run built server from lib/server.js

 ```
 src/
-├── server.ts # MCP server entry, registers all tools
+├── server.ts # MCP server entry, registers all tools + MCP resources
 ├── tools/
 │   ├── browser.tool.ts # Session state + start_browser + getBrowser()
 │   ├── app-session.tool.ts # start_app_session (iOS/Android via Appium)
 │   ├── navigate.tool.ts # URL navigation
 │   ├── get-visible-elements.tool.ts # Element detection (web + mobile)
-│   ├── click-element.tool.ts # Click/tap actions
+│   ├── click.tool.ts # Click/tap actions
 │   └── ... # Other tools follow same pattern
+├── recording/
+│   ├── step-recorder.ts # withRecording HOF, appendStep, session history access
+│   ├── code-generator.ts # SessionHistory → WebdriverIO JS code
+│   └── resources.ts # MCP resource builders (sessions index, step log)
 ├── scripts/
 │   └── get-interactable-browser-elements.ts # Browser-context script
 ├── locators/
@@ -32,7 +36,8 @@ src/
 ├── config/
 │   └── appium.config.ts # iOS/Android capability builders
 └── types/
-    └── tool.ts # ToolDefinition interface
+    ├── tool.ts # ToolDefinition interface
+    └── recording.ts # RecordedStep, SessionHistory interfaces
 ```

 ### Session State
@@ -80,6 +85,14 @@ export const myTool: ToolCallback = async ({ param }: { param: string }) => {
 server.tool(myToolDefinition.name, myToolDefinition.description, myToolDefinition.inputSchema, myTool);
 ```

+### Recording
+
+All tools are wrapped with `withRecording()` in `server.ts`. Steps accumulate in `state.sessionHistory` (keyed by sessionId).
+MCP resources expose history without tool calls:
+- `wdio://sessions` — index of all sessions (fixed URI, discoverable via ListResources)
+- `wdio://session/current/steps` — current session step log + generated JS (fixed URI)
+- `wdio://session/{sessionId}/steps` — any session by ID (URI template, NOT listed by ListResources — see `docs/architecture/mcp-resources-notes.md`)
+
 ### Build

 - **tsup** bundles `src/server.ts` → `lib/server.js` (ESM)
@@ -95,6 +108,9 @@ server.tool(myToolDefinition.name, myToolDefinition.description, myToolDefinitio
 | `src/tools/app-session.tool.ts` | Appium session creation |
 | `src/scripts/get-interactable-browser-elements.ts` | Browser-context element detection |
 | `src/locators/` | Mobile element detection + locator generation |
+| `src/recording/step-recorder.ts` | `withRecording(toolName, cb)` HOF — wraps every tool for step logging |
+| `src/recording/code-generator.ts` | Generates runnable WebdriverIO JS from `SessionHistory` |
+| `src/recording/resources.ts` | Builds text for `wdio://sessions` and `wdio://session/*/steps` resources |
 | `tsup.config.ts` | Build configuration |

 ## Gotchas

README.md

Lines changed: 11 additions & 0 deletions
@@ -85,6 +85,7 @@ appium
 - **Scrolling**: Smooth scrolling with configurable distances
 - **Attach to running Chrome**: Connect to an existing Chrome window via `--remote-debugging-port` — ideal for testing authenticated or pre-configured sessions
 - **Device emulation**: Apply mobile/tablet presets (iPhone 15, Pixel 7, etc.) to simulate responsive layouts without a physical device
+- **Session Recording**: All tool calls are automatically recorded and exportable as runnable WebdriverIO JS

 ### Mobile App Automation (iOS/Android)

@@ -458,6 +459,16 @@ This eliminates the need to manually handle permission popups during automated t
 - **Data Format:** TOON (Token-Oriented Object Notation) for efficient LLM communication
 - **Element Detection:** XML-based page source parsing with intelligent filtering and multi-strategy locator generation

+### Session Recording & Code Export
+
+Every tool call is automatically recorded to a session history. You can inspect sessions and export runnable code via MCP resources — no extra tool calls needed:
+
+- `wdio://sessions` — lists all recorded sessions with type, timestamps, and step count
+- `wdio://session/current/steps` — step log for the active session, plus a generated WebdriverIO JS script ready to run with `webdriverio`
+- `wdio://session/{sessionId}/steps` — same for any past session by ID
+
+The generated script reconstructs the full session — including capabilities, navigation, clicks, and inputs — as a standalone `import { remote } from 'webdriverio'` file.
+
 ## Troubleshooting

 **Browser automation not working?**

package.json

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
     "type": "git",
     "url": "git://github.com/webdriverio/mcp.git"
   },
-  "version": "2.4.0",
+  "version": "2.4.1",
   "description": "MCP server with WebdriverIO for browser and mobile app automation (iOS/Android via Appium)",
   "main": "./lib/server.js",
   "module": "./lib/server.js",

src/config/appium.config.ts

Lines changed: 2 additions & 2 deletions
@@ -97,7 +97,7 @@ export function buildIOSCapabilities(
   // Add any additional custom options
   for (const [key, value] of Object.entries(options)) {
     if (
-      !['deviceName', 'platformVersion', 'automationName', 'autoAcceptAlerts', 'autoDismissAlerts', 'udid', 'noReset', 'fullReset', 'newCommandTimeout'].includes(
+      !['deviceName', 'platformVersion', 'automationName', 'autoGrantPermissions', 'autoAcceptAlerts', 'autoDismissAlerts', 'udid', 'noReset', 'fullReset', 'newCommandTimeout'].includes(
         key,
       )
     ) {
@@ -156,7 +156,7 @@ export function buildAndroidCapabilities(
   // Add any additional custom options
   for (const [key, value] of Object.entries(options)) {
     if (
-      !['deviceName', 'platformVersion', 'automationName', 'autoGrantPermissions', 'appWaitActivity', 'noReset', 'fullReset', 'newCommandTimeout'].includes(
+      !['deviceName', 'platformVersion', 'automationName', 'autoGrantPermissions', 'autoAcceptAlerts', 'autoDismissAlerts', 'appWaitActivity', 'noReset', 'fullReset', 'newCommandTimeout'].includes(
         key,
       )
     ) {

src/recording/code-generator.ts

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+// src/recording/code-generator.ts
+import type { RecordedStep, SessionHistory } from '../types/recording';
+
+/** Escape single quotes so generated JS string literals are valid. */
+function escapeStr(value: unknown): string {
+  return String(value).replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+}
+
+function formatParams(params: Record<string, unknown>): string {
+  return Object.entries(params)
+    .map(([k, v]) => `${k}="${v}"`)
+    .join(' ');
+}
+
+function indentJson(value: unknown): string {
+  return JSON.stringify(value, null, 2)
+    .split('\n')
+    .map((line, i) => (i > 0 ? ` ${line}` : line))
+    .join('\n');
+}
+
+function generateStep(step: RecordedStep, history: SessionHistory): string {
+  if (step.tool === '__session_transition__') {
+    const newId = (step.params.newSessionId as string) ?? 'unknown';
+    return `// --- new session: ${newId} started at ${step.timestamp} ---`;
+  }
+
+  if (step.status === 'error') {
+    return `// [error] ${step.tool}: ${formatParams(step.params)}${step.error ?? 'unknown error'}`;
+  }
+
+  const p = step.params;
+  switch (step.tool) {
+    case 'start_browser': {
+      const nav = p.navigationUrl ? `\nawait browser.url('${escapeStr(p.navigationUrl)}');` : '';
+      return `const browser = await remote({\n capabilities: ${indentJson(history.capabilities)}\n});${nav}`;
+    }
+    case 'start_app_session': {
+      const config: Record<string, unknown> = {
+        protocol: 'http',
+        hostname: history.appiumConfig?.hostname ?? 'localhost',
+        port: history.appiumConfig?.port ?? 4723,
+        path: history.appiumConfig?.path ?? '/',
+        capabilities: history.capabilities,
+      };
+      return `const browser = await remote(${indentJson(config)});`;
+    }
+    case 'attach_browser': {
+      const nav = p.navigationUrl ? `\nawait browser.url('${escapeStr(p.navigationUrl)}');` : '';
+      return `const browser = await remote({\n capabilities: ${indentJson(history.capabilities)}\n});${nav}`;
+    }
+    case 'navigate':
+      return `await browser.url('${escapeStr(p.url)}');`;
+    case 'click_element':
+      return `await browser.$('${escapeStr(p.selector)}').click();`;
+    case 'set_value':
+      return `await browser.$('${escapeStr(p.selector)}').setValue('${escapeStr(p.value)}');`;
+    case 'scroll': {
+      const scrollAmount = (p.direction as string) === 'down' ? (p.pixels as number) : -(p.pixels as number);
+      return `await browser.execute(() => window.scrollBy(0, ${scrollAmount}));`;
+    }
+    case 'tap_element':
+      if (p.selector !== undefined) {
+        return `await browser.$('${escapeStr(p.selector)}').click();`;
+      }
+      return `await browser.tap({ x: ${p.x}, y: ${p.y} });`;
+    case 'swipe':
+      return `await browser.execute('mobile: swipe', { direction: '${escapeStr(p.direction)}' });`;
+    case 'drag_and_drop':
+      if (p.targetSelector !== undefined) {
+        return `await browser.$('${escapeStr(p.sourceSelector)}').dragAndDrop(browser.$('${escapeStr(p.targetSelector)}'));`;
+      }
+      return `await browser.$('${escapeStr(p.sourceSelector)}').dragAndDrop({ x: ${p.x}, y: ${p.y} });`;
+    default:
+      return `// [unknown tool] ${step.tool}`;
+  }
+}
+
+export function generateCode(history: SessionHistory): string {
+  const steps = history.steps.map(step => generateStep(step, history)).join('\n');
+  return `import { remote } from 'webdriverio';\n\n${steps}\n\nawait browser.deleteSession();`;
+}
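
To make the mapping from recorded steps to generated script concrete, here is a minimal, self-contained sketch of the same idea. It is not the code from this commit: `SketchStep`, `SketchHistory`, `sketchStep`, and `sketchGenerate` are hypothetical names, the interfaces are simplified guesses at what `src/types/recording.ts` contains (that file is not shown here), and only four of the tools are covered. The escaping helper mirrors `escapeStr()` from the diff.

```typescript
// Simplified stand-ins for the interfaces in src/types/recording.ts.
// Assumption: field names are inferred from how code-generator.ts uses them.
interface SketchStep {
  tool: string;
  params: Record<string, unknown>;
  status: 'ok' | 'error';
}

interface SketchHistory {
  capabilities: Record<string, unknown>;
  steps: SketchStep[];
}

// Same escaping as escapeStr() in the diff: backslashes first, then single quotes.
function escapeStr(value: unknown): string {
  return String(value).replace(/\\/g, '\\\\').replace(/'/g, "\\'");
}

// Trimmed, hypothetical version of generateStep() covering four tools only.
function sketchStep(step: SketchStep, history: SketchHistory): string {
  const p = step.params;
  switch (step.tool) {
    case 'start_browser':
      return `const browser = await remote({ capabilities: ${JSON.stringify(history.capabilities)} });`;
    case 'navigate':
      return `await browser.url('${escapeStr(p.url)}');`;
    case 'click_element':
      return `await browser.$('${escapeStr(p.selector)}').click();`;
    case 'set_value':
      return `await browser.$('${escapeStr(p.selector)}').setValue('${escapeStr(p.value)}');`;
    default:
      return `// [unknown tool] ${step.tool}`;
  }
}

// Wraps the per-step lines with the import header and session teardown.
function sketchGenerate(history: SketchHistory): string {
  const body = history.steps.map((s) => sketchStep(s, history)).join('\n');
  return `import { remote } from 'webdriverio';\n\n${body}\n\nawait browser.deleteSession();`;
}

// Sample history; the selectors and URL are made up for illustration.
const sampleHistory: SketchHistory = {
  capabilities: { browserName: 'chrome' },
  steps: [
    { tool: 'start_browser', params: {}, status: 'ok' },
    { tool: 'navigate', params: { url: 'https://example.com' }, status: 'ok' },
    { tool: 'set_value', params: { selector: '#name', value: "O'Brien" }, status: 'ok' },
    { tool: 'click_element', params: { selector: "button[type='submit']" }, status: 'ok' },
  ],
};

const generated = sketchGenerate(sampleHistory);
console.log(generated);
```

The sample values contain single quotes on purpose: without the escaping step, `O'Brien` and `button[type='submit']` would terminate the generated string literal early and the exported script would not parse.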

src/recording/resources.ts

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+// src/recording/resources.ts
+import type { SessionHistory } from '../types/recording';
+import { generateCode } from './code-generator';
+import { getSessionHistory } from './step-recorder';
+import { getBrowser } from '../tools/browser.tool';
+
+function getCurrentSessionId(): string | null {
+  return (getBrowser as any).__state?.currentSession ?? null;
+}
+
+export interface SessionStepsPayload {
+  stepsJson: string;
+  generatedJs: string;
+}
+
+export function buildSessionsIndex(): string {
+  const histories = getSessionHistory();
+  const currentId = getCurrentSessionId();
+  const sessions = Array.from(histories.values()).map((h) => ({
+    sessionId: h.sessionId,
+    type: h.type,
+    startedAt: h.startedAt,
+    ...(h.endedAt ? { endedAt: h.endedAt } : {}),
+    stepCount: h.steps.length,
+    isCurrent: h.sessionId === currentId,
+  }));
+  return JSON.stringify({ sessions });
+}
+
+export function buildCurrentSessionSteps(): SessionStepsPayload | null {
+  const currentId = getCurrentSessionId();
+  if (!currentId) return null;
+
+  return buildSessionStepsById(currentId);
+}
+
+export function buildSessionStepsById(sessionId: string): SessionStepsPayload | null {
+  const history = getSessionHistory().get(sessionId);
+  if (!history) return null;
+
+  return buildSessionPayload(history);
+}
+
+function buildSessionPayload(history: SessionHistory): SessionStepsPayload {
+  const stepsJson = JSON.stringify({
+    sessionId: history.sessionId,
+    type: history.type,
+    startedAt: history.startedAt,
+    ...(history.endedAt ? { endedAt: history.endedAt } : {}),
+    stepCount: history.steps.length,
+    steps: history.steps,
+  });
+
+  return { stepsJson, generatedJs: generateCode(history) };
+}
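
The shape of the `wdio://sessions` payload is easier to see with sample data. The sketch below repeats the mapping from `buildSessionsIndex()` against an in-memory `Map` instead of the server state. `SketchHistory` and `sketchSessionsIndex` are hypothetical names, the session IDs and timestamps are invented, and the `type` value `'browser'` is an assumption, since the real `SessionHistory` interface is not part of this section.

```typescript
// Hypothetical, reduced version of SessionHistory with only the fields the index reads.
interface SketchHistory {
  sessionId: string;
  type: string; // assumption: the real type may be a narrower union
  startedAt: string;
  endedAt?: string;
  steps: unknown[];
}

// Same mapping as buildSessionsIndex() in the diff, with state passed in explicitly.
function sketchSessionsIndex(histories: Map<string, SketchHistory>, currentId: string | null): string {
  const sessions = Array.from(histories.values()).map((h) => ({
    sessionId: h.sessionId,
    type: h.type,
    startedAt: h.startedAt,
    // endedAt is only emitted for sessions that have finished.
    ...(h.endedAt ? { endedAt: h.endedAt } : {}),
    stepCount: h.steps.length,
    isCurrent: h.sessionId === currentId,
  }));
  return JSON.stringify({ sessions });
}

// Invented sample data: one finished session and one still active.
const histories = new Map<string, SketchHistory>([
  [
    'session-a',
    {
      sessionId: 'session-a',
      type: 'browser',
      startedAt: '2026-03-17T10:00:00.000Z',
      endedAt: '2026-03-17T10:05:00.000Z',
      steps: [{}],
    },
  ],
  [
    'session-b',
    {
      sessionId: 'session-b',
      type: 'browser',
      startedAt: '2026-03-17T10:06:00.000Z',
      steps: [{}, {}],
    },
  ],
]);

const indexJson = sketchSessionsIndex(histories, 'session-b');
console.log(indexJson);
```

Because a `Map` iterates in insertion order, the index lists sessions in the order they were started, and a client can pick the active one through `isCurrent` without a separate lookup.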

src/recording/step-recorder.ts

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+// src/recording/step-recorder.ts
+import type { ToolCallback } from '@modelcontextprotocol/sdk/server/mcp';
+import type { RecordedStep, SessionHistory } from '../types/recording';
+import { getBrowser } from '../tools/browser.tool';
+
+function getState() {
+  return (getBrowser as any).__state as {
+    currentSession: string | null;
+    sessionHistory: Map<string, SessionHistory>;
+  };
+}
+
+export function appendStep(
+  toolName: string,
+  params: Record<string, unknown>,
+  status: 'ok' | 'error',
+  durationMs: number,
+  error?: string,
+): void {
+  const state = getState();
+  const sessionId = state.currentSession;
+  if (!sessionId) return;
+
+  const history = state.sessionHistory.get(sessionId);
+  if (!history) return;
+
+  const step: RecordedStep = {
+    index: history.steps.length + 1,
+    tool: toolName,
+    params,
+    status,
+    durationMs,
+    timestamp: new Date().toISOString(),
+    ...(error !== undefined && { error }),
+  };
+  history.steps.push(step);
+}
+
+export function getSessionHistory(): Map<string, SessionHistory> {
+  return getState().sessionHistory;
+}
+
+function extractErrorText(result: Awaited<ReturnType<ToolCallback>>): string {
+  const textContent = result.content.find((c: any) => c.type === 'text');
+  return textContent ? (textContent as any).text : 'Unknown error';
+}
+
+export function withRecording(toolName: string, callback: ToolCallback): ToolCallback {
+  return async (params, extra) => {
+    const start = Date.now();
+    const result = await callback(params, extra);
+    const isError = (result as any).isError === true;
+    appendStep(
+      toolName,
+      params as Record<string, unknown>,
+      isError ? 'error' : 'ok',
+      Date.now() - start,
+      isError ? extractErrorText(result) : undefined,
+    );
+    return result;
+  };
+}
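
The wrapper pattern in `withRecording()` can be tried in isolation with the sketch below. It is a simplified illustration under stated assumptions, not the project's implementation: `SketchResult` and `SketchCallback` stand in for the MCP SDK's result and `ToolCallback` types, a plain array replaces `state.sessionHistory`, and the two tools are hypothetical.

```typescript
// Simplified stand-ins for the MCP SDK's tool result and ToolCallback types.
interface SketchResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}
type SketchCallback = (params: Record<string, unknown>) => Promise<SketchResult>;

interface SketchStep {
  index: number;
  tool: string;
  params: Record<string, unknown>;
  status: 'ok' | 'error';
  durationMs: number;
  error?: string;
}

// Stand-in for the steps array of the current session's history.
const recordedSteps: SketchStep[] = [];

// Higher-order function: returns a callback that runs the tool, then logs a step.
function withRecordingSketch(toolName: string, callback: SketchCallback): SketchCallback {
  return async (params) => {
    const start = Date.now();
    const result = await callback(params);
    const isError = result.isError === true;
    // First text item of the result is used as the error message, as in extractErrorText().
    const errorText = result.content.find((c) => c.type === 'text')?.text ?? 'Unknown error';
    recordedSteps.push({
      index: recordedSteps.length + 1,
      tool: toolName,
      params,
      status: isError ? 'error' : 'ok',
      durationMs: Date.now() - start,
      ...(isError ? { error: errorText } : {}),
    });
    return result;
  };
}

// Two hypothetical tools: one succeeds, one reports failure through isError.
const okTool: SketchCallback = async () => ({ content: [{ type: 'text', text: 'Navigated' }] });
const failingTool: SketchCallback = async () => ({
  content: [{ type: 'text', text: 'Element not found' }],
  isError: true,
});

// Runs both wrapped tools in sequence and returns what was recorded.
async function demo(): Promise<SketchStep[]> {
  await withRecordingSketch('navigate', okTool)({ url: 'https://example.com' });
  await withRecordingSketch('click_element', failingTool)({ selector: '#missing' });
  return recordedSteps;
}
```

Calling `demo()` should leave two entries in `recordedSteps`: an `ok` step for `navigate` and an `error` step for `click_element` carrying the text of the failed result. One property worth noting, which holds for the wrapper in the diff as well as for this sketch: a step is only appended after the callback resolves, so a tool that throws instead of returning `isError: true` leaves no trace in the history.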
