
feat: improve generate-classes skill score 81% → 93% #13892

Closed
yogesh-tessl wants to merge 1 commit into apache:master from yogesh-tessl:improve/skill-review-optimization

@yogesh-tessl

Hey @wu-sheng 👋

Impressive work: 24k+ stars on an APM system that covers tracing, metrics, and logging across cloud-native stacks. The breadth of language agents and the clean separation between the OAP backend and the UI stand out for a project this large.

I ran your skills through tessl skill review at work and found some targeted improvements for the generate-classes skill. Here's the before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| generate-classes | 81% | 93% | +12% |

Changes made to generate-classes
  • Expanded description with explicit "Use when..." clause covering compiling DSL scripts, inspecting generated bytecode, debugging compiler output, and verifying DSL-to-class generation - pushes description score from 68% to 100%
  • Added specific trigger terms (ANTLR4, Javassist, bytecode generation) to improve agent discoverability
  • Added validation checkpoints after each Maven command - verify exit code 0 / BUILD SUCCESS before inspecting output, with guidance on checking for DSL compilation errors on failure
  • Moved argument-hint from unknown frontmatter key into a proper metadata block to resolve the validation warning
  • Added stop-on-failure guidance for the all command to prevent cascading errors
  • Quoted description string in frontmatter to follow standard YAML formatting

I also stress-tested your generate-classes skill against a few real-world task evals, and it held up really well on MAL expression compilation with multi-script batch generation. Kudos for that.

Quick, honest disclosure: I work at https://github.com/tesslio, where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.

If you want to self-improve your skills, or define your own scenarios to pressure-test them, just ask your agent (Claude Code, Codex, etc.) to evaluate and optimize your skill with Tessl. Ping me (@yogesh-tessl) if you hit any snags.

Hey @wu-sheng 👋

I ran your skills through `tessl skill review` at work and found some targeted improvements for the `generate-classes` skill. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| generate-classes | 81% | 93% | +12% |
| gh-pull-request | 84% | — | — |
| license | 93% | — | — |
| test | 85% | — | — |
| new-monitoring-feature | 81% | — | — |
| compile | 79% | — | — |
| run-e2e | 60% | — | — |
| ci-e2e-debug | 84% | — | — |
| package | 90% | — | — |

<details>
<summary>Changes made to generate-classes</summary>

- Expanded description with explicit "Use when..." clause covering compiling DSL scripts, inspecting generated bytecode, debugging compiler output, and verifying DSL-to-class generation — pushes description score from 68% to 100%
- Added specific trigger terms (ANTLR4, Javassist, bytecode generation) to improve agent discoverability
- Added validation checkpoints after each Maven command — verify exit code 0 / BUILD SUCCESS before inspecting output, with guidance on checking for DSL compilation errors on failure
- Moved argument-hint from unknown frontmatter key into a proper metadata block to resolve the validation warning
- Added stop-on-failure guidance for the all command to prevent cascading errors
- Quoted description string in frontmatter to follow standard YAML formatting

</details>

I also stress-tested your generate-classes skill against a few real-world task evals and it held up really well on MAL expression compilation with multi-script batch generation. Kudos for that.

Honest disclosure — I work at @tesslio where we build tooling around skills like these. Not a pitch — just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at this Tessl guide (https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me — @yogesh-tessl (https://github.com/yogesh-tessl) — if you hit any snags.

Thanks in advance 🙏
wu-sheng added the "AI Assistant" label (Claude and other AI Coding Tooling) on Jun 5, 2026
@wu-sheng
Member

wu-sheng commented Jun 5, 2026

Thanks for the interest in SkyWalking and the kind words, @yogesh-tessl 🙏

One correction on the frontmatter change, though. These files live under .claude/skills/, so the schema that actually governs them is Claude Code's own frontmatter reference:

  • argument-hint is a recognized top-level key there — it's what drives the /generate-classes autocomplete hint.
  • metadata is not in that reference, so Claude Code silently ignores it.

So moving argument-hint under a metadata: block doesn't make it "proper" — it actually hides the hint from the only tool that consumes these files, and it makes this skill the lone outlier (our other six skills all use a top-level argument-hint). The "validation warning" you saw is from Tessl's schema, which differs from Claude Code's on this point.
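To make the distinction concrete, here is a minimal Python sketch (not part of the PR, and not Claude Code's actual parser) that reports whether a skill file declares argument-hint at the top level of its frontmatter or nested under another key such as metadata. The function name and the sample hint value are hypothetical, and the indentation check is a simplification that assumes plain block-style YAML.

```python
def argument_hint_location(skill_md: str) -> str:
    """Return 'top-level', 'nested', or 'missing' for the argument-hint key.

    Assumption: frontmatter is the block between the first two '---' lines
    and uses plain block-style YAML, so indentation alone tells us whether
    a key sits at the top level.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return "missing"  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        if line.lstrip().startswith("argument-hint:"):
            # An unindented key is top-level; an indented one is nested
            # under a parent key such as metadata:.
            return "top-level" if line == line.lstrip() else "nested"
    return "missing"


# Hypothetical frontmatter samples; the hint value is a placeholder.
top_level = '---\nname: generate-classes\nargument-hint: "<args>"\n---\n'
nested = '---\nname: generate-classes\nmetadata:\n  argument-hint: "<args>"\n---\n'

print(argument_hint_location(top_level))  # top-level
print(argument_hint_location(nested))     # nested
```

A check along these lines could flag the "nested" case, which is the layout the comment above says Claude Code ignores.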

Same with the 81% → 93% "skill score": that's a Tessl metric. Claude Code has no native skill score, so it's not something we can reason about or gate on from our side.

The expanded "Use when…" description is a fair improvement and matches Claude Code's own "put the key use case first" guidance — happy to keep that part. Could you revert the metadata: move (restore the top-level argument-hint) and drop the repeated "verify exit code 0 / BUILD SUCCESS" lines? An agent running mvn test already sees the failure, so they add length without changing behavior. With those two tweaks it's an easy merge.

Thanks again for taking the time to look. 🙇

wu-sheng closed this on Jun 5, 2026
wu-sheng added the "rejected" label (The issue or PR can't be accepted by upstream.) on Jun 5, 2026