Skip to content

docs(.agents/skills): restructure e2e-phantom-diff-testing per Agent Skills spec; rename e2e-testing → general-e2e-testing#426

Open
Aaron ("AJ") Steers (aaronsteers) wants to merge 6 commits intomainfrom
devin/1776363806-phantom-diff-skill-repo
Open

docs(.agents/skills): restructure e2e-phantom-diff-testing per Agent Skills spec; rename e2e-testing → general-e2e-testing#426
Aaron ("AJ") Steers (aaronsteers) wants to merge 6 commits intomainfrom
devin/1776363806-phantom-diff-skill-repo

Conversation

@aaronsteers
Copy link
Copy Markdown
Member

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Apr 16, 2026

Summary

Restructures the repo-local e2e-phantom-diff-testing skill to follow the Agent Skills specification (progressive disclosure: short SKILL.md with frontmatter → scripts/assets/references/), and renames the sibling e2e-testing skill to general-e2e-testing with cross-references between the two per review feedback.

Docs-only — no provider code, CI, or test config changes.

New layout under .agents/skills/e2e-phantom-diff-testing/:

  • SKILL.md — ~195-line driver with YAML frontmatter (name, description, related), a fast-path scripts/run.sh invocation, and per-step pointers into the scripts. Ends with a "Please improve this skill" section instructing agents to proactively surface faults (stale jq paths, broken assertions, missing steps) and queue them as follow-up PR proposals for the user — explicitly forbidding rogue PRs.
  • scripts/ — executable per-step scripts sharing a _lib.sh (log / fail / require_env / require_bin helpers):
    • run.sh — orchestrator (--with-v112 --keep --verbose)
    • create-connection.sh — API token + multi-stream connection with explicit [] arrays
    • patch-state.sh — jq transform + sanity assertion on the patched state
    • assert-no-changes.sh — parameterized plan assertion (used for steps 6 and 9)
    • repro-v112.sh — advisory registry-pinned leg (warns rather than fails on "No changes")
    • assert-intentional-change.sh — restores pre-patch state, flips sync_mode, asserts single in-place change
  • assets/envsubst templates for main.tf, .terraformrc-post-fix, .terraformrc-v112.
  • references/RATIONALE.md — "why" content (three streams, API-not-terraform, jq-not-upgrade, -refresh=false, bare-value sed), Known Limitations, and the Troubleshooting table that used to be inline.

Sibling rename: e2e-testing/SKILL.mdgeneral-e2e-testing/SKILL.md. Content is byte-identical except for a 6-line "Related skill" note at the top pointing to e2e-phantom-diff-testing. The phantom-diff SKILL.md likewise points back via its frontmatter related: field and a Related links section.

Review & Testing Checklist for Human

  • Run ./.agents/skills/e2e-phantom-diff-testing/scripts/run.sh once end-to-end against the Devin Sandbox. The previous inline-bash version of this skill was exercised live; the extracted-scripts version has only been bash-syntax-checked (bash -n). The most likely regression is envsubst mishandling the ${...} references inside HCL — HCL uses ${} for its own interpolation and the templates rely on envsubst to swap only the named variables. The terraform block keyword version = "~> 1.1.0" contains no env vars and should pass through unchanged, but worth eyeballing the materialized main.tf during a real run.
  • Sanity-check the jq path in scripts/patch-state.sh (.resources[].instances[].attributes.configurations.streams). If the airbyte_connection state shape ever moves streams off attributes.configurations, the primary regression assertion silently becomes a no-op. The sanity-assertion in that script (expect ≥2 null full_refresh streams) is the guardrail, but confirm it actually fires if the path drifts.
  • Confirm the framing of --with-v112 as advisory (warns, returns 0) still matches your intent. If you'd rather have the v1.1.2 leg be a hard repro gate, scripts/repro-v112.sh needs to fail instead of log WARN on "No changes", and the language in references/RATIONALE.md (Known Limitations) needs tightening.
  • The step-9 -refresh=true fix (below) meaningfully changes what the post-apply assertion catches. If the provider currently has benign post-apply API-side normalization that wasn't surfaced before, the new step 9 will now surface it as a hard failure. That would be a real finding worth investigating — not a reason to revert the fix — but worth being aware of during the first live run.

Updates since last revision

  • Review feedback fixes (commit afa94c4) addressing all 6 Copilot + Devin Review threads:
    • create-connection.sh: token-request payload now built with jq -n --arg instead of shell-interpolated JSON. Matches the safe pattern already used for connection creation later in the same file. Protects against JSON-special characters in AIRBYTE_CLOUD_CLIENT_ID/_SECRET.
    • repro-v112.sh, assert-intentional-change.sh: switched from sed -i '...' to sed -i.bak '...' && rm -f ...bak so the scripts work on both GNU (Linux CI) and BSD (macOS) sed.
    • run.sh step 9: now calls assert-no-changes.sh "$TF_DIR/plan-post-apply.txt" -refresh=true instead of relying on the -refresh=false default. Previously the post-apply assertion was comparing config vs local state and would miss server-side drift or unexpected provider-side normalization — exactly the opposite of what the step claims to catch. The default stays -refresh=false in assert-no-changes.sh because step 6 genuinely needs it (a refresh there would undo the jq patch from step 5).
  • Previous revision: fully rewrote SKILL.md (439-line monolith of inline bash → 196-line driver with frontmatter), added 6 scripts + 3 asset templates + rationale reference doc, added the "Please improve this skill" self-maintenance section, renamed e2e-testinggeneral-e2e-testing and cross-linked the two skills as "related".

Notes

  • Pre-existing failing required check Zero-Diff Check (Generated Code) on internal/sdk/sdk.go and docs/index.md is unrelated to this PR — my branch only adds/moves files under .agents/skills/. Same drift blocked test(e2e): add phantom-diff regression script #425. Fix is /generate on the PR, which I can't trigger as a bot.
  • An earlier draft of this skill lived as airbytehq/ai-skills#246. Per your guidance that repo-specific skills belong in the repo itself, that PR is being abandoned in favor of this one.
  • I did not refactor general-e2e-testing/SKILL.md beyond the rename + cross-reference, per your instruction to leave it otherwise unchanged.

Link to Devin session: https://app.devin.ai/sessions/3479ee7090b84744a4b34626d6ee203b
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Sibling to the existing e2e-testing skill, but scoped specifically to the
null-vs-empty-list phantom diff fix from #423.

Forces the bug's exact state shape via jq rather than relying on
environmental SDK decoding, so the primary regression assertion is
deterministic. Includes an optional v1.1.2 registry leg (advisory, per
repro flakiness) and a smoke test that confirms intentional changes are
still surfaced correctly.
@devin-ai-integration
Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for opening this pull request!

Your contribution is appreciated. Here are some helpful tips and resources.

💡 Show Tips and Tricks

Terraform Example Commands

  • /tf-examples project=pre-1.0 action=plan - Run terraform plan on the pre-1.0 example project
  • /tf-examples project=1.0 action=plan - Run terraform plan on the 1.0 example project
  • /tf-examples project=all action=plan - Run terraform plan on all example projects
  • /tf-examples project=pre-1.0 action=apply - Apply terraform changes to the pre-1.0 example project
  • /tf-examples project=pre-1.0 action=destroy - Destroy terraform resources in the pre-1.0 example project
📚 Show Repo Guidance

About This Repository

This repository uses Speakeasy to generate the Terraform provider from the Airbyte OpenAPI specification. The CI will automatically build the provider, validate code generation, and run acceptance tests across Terraform versions 1.0-1.4.

Note: This is a generated codebase. Direct modifications to generated files are not accepted - changes must be made to the upstream OpenAPI spec.

📝 Edit this welcome message.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 16, 2026

Generated Code Drift Detected

The committed code does not match what the generation pipeline produces. This usually means a generated file was edited by hand or the generation pipeline needs to be re-run.

To fix: Comment /generate on this PR to regenerate and push updated code.

To debug locally: Download the generated artifacts and diff them against your committed code:

gh run download 24580753571 --name generated_provider_code --dir /tmp/generated_from_ci
diff -rq internal/provider/ /tmp/generated_from_ci/provider/
diff -rq --exclude=.gitattributes --exclude=.gitignore internal/sdk/ /tmp/generated_from_ci/sdk/

See CONTRIBUTING.md — Debugging Generation Drift Failures for the full procedure.

Diff summary
Files internal/sdk/sdk.go and /tmp/generated/sdk/sdk.go differ

Comment thread .agents/skills/e2e-phantom-diff-testing/SKILL.md
Per PR review: prefix with e2e- to match the sibling e2e-testing skill's
naming convention.
…/assets/references; rename e2e-testing -> general-e2e-testing

Follows the Agent Skills specification (https://agentskills.io/specification):
- SKILL.md is now a short driver with YAML frontmatter and per-step pointers.
- scripts/ holds standalone runnable steps (run.sh orchestrator + _lib.sh + per-step assertions).
- assets/ holds envsubst templates for main.tf and the two .terraformrc files.
- references/RATIONALE.md holds the why/known-limitations/troubleshooting content.

Renames the sibling e2e-testing skill to general-e2e-testing and cross-references each skill from the other per review feedback. Content of general-e2e-testing is otherwise unchanged.
@devin-ai-integration devin-ai-integration Bot changed the title docs(.agents/skills): add phantom-diff-testing skill docs(.agents/skills): restructure e2e-phantom-diff-testing per Agent Skills spec; rename e2e-testing → general-e2e-testing Apr 16, 2026
@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review April 17, 2026 18:15
Copilot AI review requested due to automatic review settings April 17, 2026 18:15
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restructures the repo-local .agents/skills/e2e-phantom-diff-testing skill to match the Agent Skills spec (driver SKILL.md + scripts/ + assets/ + references/) and renames the sibling E2E skill to general-e2e-testing, adding cross-links between the two skills.

Changes:

  • Add a new e2e-phantom-diff-testing skill layout with an orchestrated bash harness and supporting templates/reference material.
  • Extract the phantom-diff E2E flow into step scripts (create-connection, patch-state, assert-*, optional repro-v112) composed by scripts/run.sh.
  • Update general-e2e-testing/SKILL.md with a “related skill” pointer to the phantom-diff skill.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
.agents/skills/general-e2e-testing/SKILL.md Adds a cross-reference to the phantom-diff skill and guidance on when to use each.
.agents/skills/e2e-phantom-diff-testing/SKILL.md New skill driver with frontmatter, fast-path invocation, step map, and self-maintenance guidance.
.agents/skills/e2e-phantom-diff-testing/scripts/run.sh Orchestrates the end-to-end flow (build, scaffold TF, API create, import, patch, plan assertions, optional v1.1.2 leg, apply, re-plan, cleanup).
.agents/skills/e2e-phantom-diff-testing/scripts/_lib.sh Shared bash helpers (log, fail, require_env, require_bin).
.agents/skills/e2e-phantom-diff-testing/scripts/create-connection.sh Mints token and creates a multi-stream connection via Airbyte API with explicit empty arrays.
.agents/skills/e2e-phantom-diff-testing/scripts/patch-state.sh JQ-based state patch to force the null-vs-empty mismatch and sanity-check the patch applied.
.agents/skills/e2e-phantom-diff-testing/scripts/assert-no-changes.sh Runs terraform plan and asserts the output contains “No changes”, writing output to a caller-specified file.
.agents/skills/e2e-phantom-diff-testing/scripts/assert-intentional-change.sh Restores pre-patch state, edits HCL, and asserts exactly one in-place change is detected.
.agents/skills/e2e-phantom-diff-testing/scripts/repro-v112.sh Optional advisory leg that pins provider to v1.1.2 and runs the same patched-state plan.
.agents/skills/e2e-phantom-diff-testing/assets/main.tf.tmpl Template TF config for provider auth and a test connection resource.
.agents/skills/e2e-phantom-diff-testing/assets/terraformrc-post-fix.tmpl TF CLI config template using dev_overrides to point at the locally built provider binary.
.agents/skills/e2e-phantom-diff-testing/assets/terraformrc-v112.tmpl TF CLI config template for registry-only installs (no dev overrides).
.agents/skills/e2e-phantom-diff-testing/references/RATIONALE.md Rationale/limitations/troubleshooting content moved out of the main skill driver.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +29 to +31
TOKEN="$(curl -sS -X POST "$AIRBYTE_API_URL/applications/token" \
-H "Content-Type: application/json" \
-d "{\"client_id\":\"$AIRBYTE_CLOUD_CLIENT_ID\",\"client_secret\":\"$AIRBYTE_CLOUD_CLIENT_SECRET\",\"grant-type\":\"client_credentials\"}" \
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The token request JSON is built via string interpolation. If AIRBYTE_CLOUD_CLIENT_ID or AIRBYTE_CLOUD_CLIENT_SECRET contains characters that need JSON escaping (quotes, backslashes, newlines), the request body will become invalid. Build this payload with jq -n --arg ... (similar to how BODY is constructed below) to ensure correct escaping.

Suggested change
TOKEN="$(curl -sS -X POST "$AIRBYTE_API_URL/applications/token" \
-H "Content-Type: application/json" \
-d "{\"client_id\":\"$AIRBYTE_CLOUD_CLIENT_ID\",\"client_secret\":\"$AIRBYTE_CLOUD_CLIENT_SECRET\",\"grant-type\":\"client_credentials\"}" \
TOKEN_BODY=$(jq -n \
--arg client_id "$AIRBYTE_CLOUD_CLIENT_ID" \
--arg client_secret "$AIRBYTE_CLOUD_CLIENT_SECRET" \
'{
client_id: $client_id,
client_secret: $client_secret,
"grant-type": "client_credentials"
}')
TOKEN="$(curl -sS -X POST "$AIRBYTE_API_URL/applications/token" \
-H "Content-Type: application/json" \
-d "$TOKEN_BODY" \

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed in afa94c4. Rebuilt the token payload with jq --arg to match how the connection-creation body is already constructed — no more shell-interpolated JSON in this script.


Devin session

rm -rf "$V112_DIR"
cp -r "$TF_DIR" "$V112_DIR"
rm -rf "$V112_DIR/.terraform" "$V112_DIR/.terraform.lock.hcl"
sed -i 's#version = "~> 1.1.0"#version = "= 1.1.2"#' "$V112_DIR/main.tf"
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sed -i is not portable across GNU/BSD sed (macOS requires an explicit backup suffix). As written, this line will fail on macOS and some other environments. Consider using a cross-platform form like sed -i.bak ... (and delete the backup) or avoid in-place editing entirely by writing to a temp file and moving it into place.

Suggested change
sed -i 's#version = "~> 1.1.0"#version = "= 1.1.2"#' "$V112_DIR/main.tf"
sed -i.bak 's#version = "~> 1.1.0"#version = "= 1.1.2"#' "$V112_DIR/main.tf"
rm -f "$V112_DIR/main.tf.bak"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in afa94c4 — switched to sed -i.bak 's#...#...#' ... && rm -f ...bak so the script works on both GNU and BSD sed.


Devin session

cp "$TF_DIR/terraform.tfstate.pre-patch" "$TF_DIR/terraform.tfstate"
# Match on the bare value, not `sync_mode = "incremental_append"`, because HCL
# formatting pads the `=` with variable whitespace across TF versions.
sed -i 's/"incremental_append"/"full_refresh_append"/' "$TF_DIR/main.tf"
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sed -i is not portable across GNU/BSD sed (macOS requires an explicit backup suffix). This will break the skill for contributors on macOS. Use a cross-platform in-place edit approach (e.g., sed -i.bak ... + cleanup) or rewrite the file via a temp file + move.

Suggested change
sed -i 's/"incremental_append"/"full_refresh_append"/' "$TF_DIR/main.tf"
sed -i.bak 's/"incremental_append"/"full_refresh_append"/' "$TF_DIR/main.tf"
rm -f "$TF_DIR/main.tf.bak"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in afa94c4 — same cross-platform sed -i.bak ... && rm -f ...bak treatment as repro-v112.sh.


Devin session

terraform -chdir="$TF_DIR" apply -auto-approve -no-color \
>"$TF_DIR/apply-post-fix.txt" 2>&1
TF_CLI_CONFIG_FILE="$TF_DIR/.terraformrc-post-fix" \
"$SCRIPT_DIR/assert-no-changes.sh" "$TF_DIR/plan-post-apply.txt"
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert-no-changes.sh defaults to -refresh=false when no extra args are provided. In step 9 (post-apply re-plan), run.sh calls it without args, so the re-plan will not refresh from the API and won’t actually catch drift/unexpected remote-side changes after apply. Pass -refresh=true (or no -refresh flag) for the post-apply check, and keep -refresh=false only for the patched-state assertions.

Suggested change
"$SCRIPT_DIR/assert-no-changes.sh" "$TF_DIR/plan-post-apply.txt"
"$SCRIPT_DIR/assert-no-changes.sh" -refresh=true "$TF_DIR/plan-post-apply.txt"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in afa94c4 — step 9 now passes -refresh=true explicitly so the post-apply plan actually hits the live API. -refresh=false stays as the default in assert-no-changes.sh (step 6 still needs it to avoid undoing the jq patch on step 5's state).


Devin session

Comment on lines +133 to +136
9. **Apply + replan.** Inline in `run.sh`: `terraform apply
-auto-approve` then another `assert-no-changes.sh`. Catches "apply
produced unexpected drift" bugs that a pure plan-only check would
miss.
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description says the apply + follow-up plan step “catches drift”, but the current harness runs assert-no-changes.sh which (by default) uses terraform plan -refresh=false. With -refresh=false, this step only compares config vs local state and won’t detect server-side drift or unexpected API-normalization after apply. Align the docs with the actual behavior, or (preferably) run the post-apply plan with refresh enabled.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — real bug, not a doc mismatch. assert-no-changes.sh defaults to -refresh=false (correct for the patched-state assertion in step 6), and run.sh was calling it the same way in step 9 post-apply, which meant the post-apply leg was just comparing config vs local state. Fixed in afa94c4: step 9 now calls assert-no-changes.sh "$TF_DIR/plan-post-apply.txt" -refresh=true so the post-apply plan hits the live API.


Devin session

devin-ai-integration[bot]

This comment was marked as resolved.

…pilot + Devin Review

- create-connection.sh: build token-request payload with `jq --arg` so
  JSON-special characters in $AIRBYTE_CLOUD_CLIENT_ID / _SECRET can't
  produce a malformed request body. Connection creation already did this;
  token request was the inconsistent path.
- repro-v112.sh, assert-intentional-change.sh: use `sed -i.bak ... && rm`
  so the scripts work on both GNU (Linux) and BSD (macOS) sed.
- run.sh step 9: pass `-refresh=true` explicitly to the post-apply
  assert-no-changes.sh call. The default `-refresh=false` is correct for
  the patched-state assertions (steps 6/9-post-apply were previously
  symmetric), but for the post-apply leg we want to hit the live API so
  server-side drift or unexpected post-apply normalization is caught.
devin-ai-integration[bot]

This comment was marked as resolved.

Catches the stale reference Devin Review flagged — renaming the skill
directory in the previous commit left this cross-link dead.
devin-ai-integration[bot]

This comment was marked as resolved.

…testing

Second stale reference caught by Devin Review — the rename missed the
comment at the top of the e2e-smoke-test workflow. No behavioral change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants