Merged
1 change: 1 addition & 0 deletions .trailblaze-sync
@@ -0,0 +1 @@
593c43b070fc9636da56b078c7beb062d44bc242
2 changes: 1 addition & 1 deletion docs/generated/external-config.md
@@ -156,7 +156,7 @@ Toolsets are declared in `trailmaps/<id>/toolsets/*.yaml`. They are pure YAML gr
| `android_primitives` | Yes | `android-ondevice-accessibility`, `android-ondevice-instrumentation` | 5 |
| `compose_core` | No | `compose` | 6 |
| `compose_verification` | No | `compose` | 2 |
-| `core_interaction` | Yes | `android-ondevice-accessibility`, `android-ondevice-instrumentation`, `ios-host` | 16 |
+| `core_interaction` | Yes | `android-ondevice-accessibility`, `android-ondevice-instrumentation`, `ios-host` | 17 |
| `memory` | No | `all drivers` | 8 |
| `meta` | Yes | `all drivers` | 2 |
| `mobile_primitives` | Yes | `android-ondevice-accessibility`, `android-ondevice-instrumentation`, `ios-host` | 4 |
4 changes: 3 additions & 1 deletion docs/generated/functions/custom/assertVisible.md
@@ -4,7 +4,7 @@

# `assertVisible`

-Assert an element is visible on screen by its ref ID from the snapshot. Use the short hash ref shown in square brackets (e.g., y778 from [y778] "Network & internet"). These refs are stable across captures of the same screen.
+Assert an element is visible on screen by its ref ID from the snapshot. Use the short hash ref shown in square brackets (e.g., y778 from [y778] "Network & internet"). These refs are stable across captures of the same screen. Optionally pass `expectedText` to also verify the element's rendered text — use it whenever the case asks to verify a specific value (e.g. "verify the checkout button shows $5.00", "expect status to be Active") instead of just confirming the element exists.

## Source

@@ -26,6 +26,8 @@ Assert an element is visible on screen by its ref ID from the snapshot. Use the

### Optional parameters

- `expectedText` — `String`
Optional. When set, asserts the resolved element's rendered text equals this value after whitespace trimming (case-sensitive). Pass the stable rendered text verbatim — e.g. "Charge $5.00", not "the checkout button". Exclude volatile state that changes run-to-run (live item counts like "3 items", timestamps, quantities); pin only the part that stays constant. Leave null when only the element's presence matters.
- `reasoning` — `String`
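
The `expectedText` rule described above can be sketched in Kotlin. This is a minimal illustration of the documented comparison (trim whitespace, then compare case-sensitively), not the Trailblaze implementation. The name `expectedTextMatches` is hypothetical, and trimming both sides is an assumption, since the description does not say which side is trimmed.

```kotlin
// Hypothetical helper that models the documented `expectedText` rule.
// Assumption: both the rendered text and the expected value are trimmed.
fun expectedTextMatches(renderedText: String?, expectedText: String?): Boolean {
  // Null expectedText: only the element's presence matters, so there is nothing to compare.
  if (expectedText == null) return true
  // Case-sensitive equality after trimming surrounding whitespace.
  return renderedText?.trim() == expectedText.trim()
}
```

Under these assumptions, a rendered `"  Charge $5.00 "` matches an expected `"Charge $5.00"`, while `"charge $5.00"` does not because the comparison is case-sensitive.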

## Output
35 changes: 35 additions & 0 deletions docs/generated/functions/custom/clearText.md
@@ -0,0 +1,35 @@
<!-- GENERATED BY trailblaze check. DO NOT EDIT. -->
<!-- Trailblaze framework tool reference -->
<!-- Regenerate with: ./gradlew :docs:generator:run -->

# `clearText`

Clear all text from the currently focused text field. Use BEFORE `inputText` when you need
to replace whatever's already in a field (search bar, amount field, form input). Takes no
parameters — the tool reads the field's current length from the view hierarchy.

Prefer this over `eraseText` whenever your intent is "wipe the field, then type fresh". Use
`eraseText` only when you genuinely need to remove a specific number of trailing characters
(e.g. backspacing one digit off an amount).

## Source

- Kind: class-backed
- Class: `xyz.block.trailblaze.toolcalls.commands.ClearTextTrailblazeTool`

## Contract

- Visible to LLM: yes (`surface_to_llm: true`)
- Recordable: yes (`is_recordable: true`)
- Host-only: no (`requires_host: false`)

## Input schema

_(no parameters)_

## Output

Returns: `string` (opaque text content)

Typed result schemas (`kind: query | action`, MCP `structuredContent`) are not yet carried by the resolved manifest — this section will gain detail when that lands.

11 changes: 10 additions & 1 deletion examples/playwright-electron/build.gradle.kts
@@ -33,9 +33,18 @@ val npmInstallElectron by tasks.registering(Exec::class) {
  onlyIf { gradle.taskGraph.hasTask(tasks.test.get()) }
}

val downloadElectronBinary by tasks.registering(Exec::class) {
  description = "Download the Electron platform binary"
  workingDir = sampleAppDir
  commandLine("sh", "provision-electron.sh")
  dependsOn(npmInstallElectron)
  outputs.upToDateWhen { false }
  onlyIf { gradle.taskGraph.hasTask(tasks.test.get()) }
}

tasks.test {
  useJUnitPlatform()
-  dependsOn(npmInstallElectron)
+  dependsOn(downloadElectronBinary)
  workingDir = rootProject.projectDir.resolve("opensource")

  // Pass paths to the Electron binary and app directory
36 changes: 36 additions & 0 deletions examples/playwright-electron/sample-app/provision-electron.sh
@@ -0,0 +1,36 @@
#!/usr/bin/env bash
# Ensure Electron's prebuilt platform binary is present under node_modules/electron/dist.
#
# npm >=11.16 no longer runs Electron's postinstall (the step that downloads this binary), so we fetch
# it explicitly. If a download base URL has been written to /tmp/electron-download-base (some CI agents
# set this because their egress can't reach github.com release assets), pull and extract the archive
# from there; otherwise fall back to Electron's own installer, which downloads from github.com.
set -euo pipefail
cd "$(dirname "$0")"

base_file=/tmp/electron-download-base
rm -rf node_modules/electron/dist node_modules/electron/path.txt

if [ -f "$base_file" ]; then
  base="$(cat "$base_file")"
  ver="$(node -p "require('electron/package.json').version")"
  plat="$(node -p "process.platform + '-' + process.arch")"
  curl -fsSL -o /tmp/electron-archive.zip "${base}v${ver}/electron-v${ver}-${plat}.zip"
  size="$(wc -c < /tmp/electron-archive.zip)"
  if [ "$size" -lt 50000000 ]; then
    printf 'electron archive too small (%s bytes) — the download base did not serve a binary\n' "$size" >&2
    head -c 200 /tmp/electron-archive.zip >&2
    exit 1
  fi
  mkdir -p node_modules/electron/dist
  unzip -q -o /tmp/electron-archive.zip -d node_modules/electron/dist
  case "$(node -p process.platform)" in
    darwin) printf 'Electron.app/Contents/MacOS/Electron' > node_modules/electron/path.txt ;;
    win32) printf 'electron.exe' > node_modules/electron/path.txt ;;
    *) printf 'electron' > node_modules/electron/path.txt ;;
  esac
else
  node node_modules/electron/install.js
fi

test -f node_modules/electron/path.txt
2 changes: 2 additions & 0 deletions gradle/libs.versions.toml
@@ -44,6 +44,7 @@ kotlinx-serialization = "1.10.0"
ktor = "3.3.3"
logback = "1.5.21"
maestro = "2.3.0"
micrometer = "1.15.12"
mcp-sdk = "0.11.1"
multiplatformMarkdownRenderer = "0.39.2"
multiplatformSettings = "1.3.0"
@@ -125,6 +126,7 @@ ktor-client-okhttp = { module = "io.ktor:ktor-client-okhttp", version.ref = "kto
ktor-client-core-jvm = { module = "io.ktor:ktor-client-core-jvm", version.ref = "ktor" }
ktor-client-core = { module = "io.ktor:ktor-client-core", version.ref = "ktor" }
ktor-client-js = { module = "io.ktor:ktor-client-js", version.ref = "ktor" }
ktor-client-mock = { module = "io.ktor:ktor-client-mock", version.ref = "ktor" }
kotlin-csv = { module = "com.github.doyaaaaaken:kotlin-csv-jvm", version.ref = "kotlin-csv" }
kotlinpoet = { module = "com.squareup:kotlinpoet", version.ref = "kotlinpoet" }
ktor-http = { module = "io.ktor:ktor-http", version.ref = "ktor" }
@@ -741,6 +741,41 @@ class BlazeGoalPlanner(
      )
    }

    // Step 5b: Target discipline — when the step's named target isn't among the tappable
    // refs, don't let the LLM tap an unrelated clickable (the Hardware-Hub trap). Either the
    // target is scrolled off (affordance present → scroll toward it) or it isn't on this
    // screen at all (surface the wrong-screen signal and stop).
    val screenText = screenState.viewHierarchyTextRepresentation
    when (
      detectTargetMissingRecovery(
        objective = currentObjective,
        screenText = screenText,
        recommendedTool = analysis.recommendedTool,
      )
    ) {
      // Skip the distractor tap and steer the NEXT analysis to scroll the named direction.
      // buildProgressSummary surfaces the latest reflection note into the next iteration's
      // prompt, so this directive reaches the analyzer before it chooses again.
      TargetMissingRecovery.SCROLL_TO_REVEAL -> {
        val direction = extractTargetPhrase(currentObjective)
          ?.let { scrollDirectionFromAffordance(it, screenText) }
          ?: "down"
        return state.copy(
          iteration = state.iteration + 1,
          reflectionNotes = state.reflectionNotes +
            "[Target discipline] The named target is off-screen. Do NOT tap an unrelated " +
            "element — scroll $direction to reveal it, then act on it.",
        )
      }
      TargetMissingRecovery.WRONG_SCREEN -> return state.copy(
        stuck = true,
        stuckReason = WRONG_SCREEN_MESSAGE,
        screenSummary = analysis.screenSummary,
        iteration = state.iteration + 1,
      )
      TargetMissingRecovery.PROCEED -> Unit
    }

    // Step 6: Add low-confidence reflection note before execution
    val stateForExecution = if (config.enableReflection && analysis.confidence == Confidence.LOW) {
      val reflectionNote = "Low confidence on iteration ${state.iteration + 1}: " +
@@ -353,6 +353,121 @@ private data class ProgressInfo(
  val correction: String? = null,
)

/**
 * Outcome of the target-discipline check: when the step's named target is absent from the
 * tappable refs, the agent must recover rather than tap an unrelated distractor.
 *
 * @see detectTargetMissingRecovery
 */
enum class TargetMissingRecovery {
  /** Target is present (or no named target) — proceed with the recommended action. */
  PROCEED,

  /** A scroll-to-reveal affordance for the target is on screen — scroll toward it, then retry. */
  SCROLL_TO_REVEAL,

  /** No target and no affordance anywhere — likely the wrong screen; surface and stop. */
  WRONG_SCREEN,
}

/** Human-facing message surfaced when the target is not on this screen at all. */
const val WRONG_SCREEN_MESSAGE: String =
  "target not found on this screen — you may be on the wrong screen; this step may need to be revised"

/** Tap-style tool name fragments. Matches `tap`, `tapOnElementByNodeId`, `compose_click`, etc. */
private val TAP_TOOL_FRAGMENTS = listOf("tap", "click")

/**
 * Decides whether the agent should avoid tapping an unrelated clickable when the step's named
 * target isn't present among the tappable refs (the Hardware-Hub trap).
 *
 * Only engages for tap-style actions — scrolls, inputs, waits, and status calls are left alone.
 * The decision is driven entirely by the compact snapshot text the analyzer saw:
 * - The target text appears in the snapshot → [TargetMissingRecovery.PROCEED] (the LLM can act).
 * - The target is absent but a `(scroll … to reveal)` affordance is present →
 *   [TargetMissingRecovery.SCROLL_TO_REVEAL] (the offscreen-target flavor: scroll, don't tap).
 * - The target is absent and no affordance exists → [TargetMissingRecovery.WRONG_SCREEN]
 *   (the no-target flavor: surface the wrong-screen signal and stop).
 *
 * @param objective The step's natural-language instruction (e.g. "Tap the search field").
 * @param screenText The compact view-hierarchy snapshot text the analyzer was shown.
 * @param recommendedTool The tool the analyzer chose this iteration.
 */
fun detectTargetMissingRecovery(
  objective: String,
  screenText: String?,
  recommendedTool: String,
): TargetMissingRecovery {
  val tool = recommendedTool.lowercase()
  if (TAP_TOOL_FRAGMENTS.none { it in tool }) return TargetMissingRecovery.PROCEED

  val target = extractTargetPhrase(objective) ?: return TargetMissingRecovery.PROCEED

  // Absent compact text is not evidence of a wrong screen: drivers like Android HOST mode
  // intentionally leave viewHierarchyTextRepresentation null and feed the analyzer a JSON/tree
  // fallback instead. Don't intercept when we can't see the snapshot.
  val text = screenText ?: return TargetMissingRecovery.PROCEED

  // The target counts as "present and tappable" only when it appears on a line carrying a
  // ref marker (e.g. `[c596]`, `[n12]`). Static labels, container headers, and the
  // non-tappable `(scroll … to reveal)` affordance must NOT satisfy the present check.
  val lines = text.lines()
  if (lines.any { it.containsRefMarker() && it.contains(target, ignoreCase = true) }) {
    return TargetMissingRecovery.PROCEED
  }

  val affordanceLines = lines.filter { it.contains("to reveal)", ignoreCase = true) }
  return if (affordanceLines.any { it.contains(target, ignoreCase = true) }) {
    TargetMissingRecovery.SCROLL_TO_REVEAL
  } else {
    TargetMissingRecovery.WRONG_SCREEN
  }
}
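
Review note: the three-way decision above can be tried in isolation with a simplified model. The sketch below assumes the target phrase has already been extracted and skips the tap-tool and null-snapshot guards. The names `Recovery`, `refMarker`, and `classify` are hypothetical, and the snapshot strings in the usage note are made-up examples of the compact format, not real Trailblaze output.

```kotlin
// Simplified, standalone model of the decision in detectTargetMissingRecovery.
// Assumption: the caller passes an already-extracted target and a non-null snapshot.
enum class Recovery { PROCEED, SCROLL_TO_REVEAL, WRONG_SCREEN }

// Same pattern as REF_MARKER in the diff: one letter, 1-3 digits, optional suffix letter.
val refMarker = Regex("\\[[a-z]\\d{1,3}[a-z]?]", RegexOption.IGNORE_CASE)

fun classify(target: String, snapshot: String): Recovery {
  val lines = snapshot.lines()
  // Present and tappable: the target shares a line with a ref marker.
  if (lines.any { refMarker.containsMatchIn(it) && it.contains(target, ignoreCase = true) }) {
    return Recovery.PROCEED
  }
  // Off-screen: a "(scroll ... to reveal)" affordance line names the target.
  val offScreen = lines.any {
    it.contains("to reveal)", ignoreCase = true) && it.contains(target, ignoreCase = true)
  }
  return if (offScreen) Recovery.SCROLL_TO_REVEAL else Recovery.WRONG_SCREEN
}
```

With the target `"Search"`, the snapshot `"[c596] Search\n[n12] Settings"` yields `PROCEED`, `"[c596] Home\nSearch (scroll down to reveal)"` yields `SCROLL_TO_REVEAL`, and `"[c596] Home\n[checked] Search"` yields `WRONG_SCREEN`, because `[checked]` is a state annotation rather than a ref marker.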

/**
 * Matches a compact element-list ref marker — one letter, 1-3 digits, optional collision
 * suffix letter, e.g. `[a1]`, `[k42]`, `[z103]`, `[k42b]` (see [xyz.block.trailblaze.api.ElementRef]).
 * Deliberately excludes state annotations like `[checked]`, `[disabled]`, `[id=…]`.
 */
private val REF_MARKER = Regex("\\[[a-z]\\d{1,3}[a-z]?]", RegexOption.IGNORE_CASE)

private fun String.containsRefMarker(): Boolean = REF_MARKER.containsMatchIn(this)

/**
 * Reads the scroll direction ("up" / "down") from the target's `(scroll … to reveal)`
 * affordance line, so the recovery note can name the concrete direction. Defaults to "down".
 */
fun scrollDirectionFromAffordance(target: String, screenText: String?): String {
  val line = screenText?.lines()?.firstOrNull {
    it.contains("to reveal)", ignoreCase = true) && it.contains(target, ignoreCase = true)
  } ?: return "down"
  return if (line.contains("scroll up", ignoreCase = true)) "up" else "down"
}
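
Review note: the direction lookup can be exercised on its own. The block below is a standalone copy under the hypothetical name `scrollDirection`; the affordance strings used with it are assumed examples of the `(scroll … to reveal)` format, which this diff does not show verbatim.

```kotlin
// Standalone copy of the direction lookup, renamed for experimentation.
fun scrollDirection(target: String, screenText: String?): String {
  // Find the first affordance line that also names the target.
  val line = screenText?.lines()?.firstOrNull {
    it.contains("to reveal)", ignoreCase = true) && it.contains(target, ignoreCase = true)
  } ?: return "down" // No snapshot or no matching affordance: default to "down".
  // Only the literal phrase "scroll up" selects "up"; any other wording falls back to "down".
  return if (line.contains("scroll up", ignoreCase = true)) "up" else "down"
}
```

For the target `"Search"`, `"Search (scroll up to reveal)"` gives `"up"`, `"Search (scroll down to reveal)"` gives `"down"`, and a null snapshot or one without a matching affordance also gives `"down"`.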

/**
 * Extracts the named target phrase from a step instruction so it can be checked against the
 * snapshot. Prefers an explicitly quoted phrase; otherwise takes the noun phrase after a
 * leading action verb (tap/select/find/search/open/choose). Returns null when no specific
 * target can be identified (e.g. "go back", "scroll down") — those steps are left to proceed.
 */
internal fun extractTargetPhrase(objective: String): String? {
  Regex("[\"“']([^\"”']{2,})[\"”']").find(objective)?.let { return it.groupValues[1].trim() }

  val verb = Regex(
    "^\\s*(?:tap|click|select|find|search\\s+for|open|choose|press)\\s+(?:on\\s+|the\\s+)*(.+)$",
    RegexOption.IGNORE_CASE,
  ).find(objective) ?: return null

  return verb.groupValues[1]
    .trim()
    .removeSuffix(".")
    .removeSuffix(" field")
    .removeSuffix(" button")
    .removeSuffix(" icon")
    .trim()
    .takeIf { it.length >= 2 }
}
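
Review note: a standalone copy of `extractTargetPhrase` (without the `internal` modifier, so it compiles outside the module) makes it easy to check which phrase a given step yields. The step strings in the usage note are illustrative inputs, not taken from the PR's tests.

```kotlin
// Standalone copy of extractTargetPhrase from the diff, for trying inputs in isolation.
fun extractTargetPhrase(objective: String): String? {
  // 1) An explicitly quoted phrase (straight or curly double quotes, or apostrophes) wins.
  Regex("[\"“']([^\"”']{2,})[\"”']").find(objective)?.let { return it.groupValues[1].trim() }

  // 2) Otherwise capture what follows a leading action verb, skipping "on" / "the".
  val verb = Regex(
    "^\\s*(?:tap|click|select|find|search\\s+for|open|choose|press)\\s+(?:on\\s+|the\\s+)*(.+)$",
    RegexOption.IGNORE_CASE,
  ).find(objective) ?: return null

  // 3) Strip a trailing period and generic widget nouns; reject one-character results.
  return verb.groupValues[1]
    .trim()
    .removeSuffix(".")
    .removeSuffix(" field")
    .removeSuffix(" button")
    .removeSuffix(" icon")
    .trim()
    .takeIf { it.length >= 2 }
}
```

`"Tap the search field"` yields `"search"`, `"Press the Save button."` yields `"Save"`, a quoted step such as `Tap on "Network & internet"` yields `"Network & internet"`, and `"Go back"` or `"scroll down"` yield `null`. One edge that may be worth a test: because an apostrophe counts as a quote character, a step like `"Tap Bob's and Ann's items"` yields `"s and Ann"` rather than the intended target.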

/**
 * Determines whether reflection should be triggered based on state.
 *