# nightshift: Error Message Improvement Analysis

## Summary

Jarspect is a Rust security scanner for Minecraft `.jar` mods with an axum HTTP frontend, static analysis engine, YARA scanning, MalwareBazaar hash lookup, and AI verdict system. The codebase has ~4,500 lines across 20+ source files. Error handling uses `anyhow` with `.context()` chains and `anyhow::bail!()` macros. This analysis identifies error messages that could be more actionable, specific, or helpful for operators debugging issues.

---

## Findings

### 🔴 High Priority — Vague or unactionable error messages

**1. `"Upload not found"` — no identifier context**

- **File:** `src/scan.rs:55`
- **Current:** `anyhow::bail!("Upload not found")`
- **Problem:** When a scan request references a non-existent upload, the error message doesn't include the upload ID. Operators cannot tell which upload was requested.
- **Suggested:** `anyhow::bail!("Upload not found: {}", request.upload_id)`

**2. `"AI rate limited for too long (429)"` — missing timing context**

- **File:** `src/verdict.rs:470`
- **Current:** `anyhow::bail!("AI rate limited for too long (429)")`
- **Problem:** No indication of how long the client waited or what the total elapsed time was. Operators can't tell if the limit is 30 seconds or 15 minutes.
- **Suggested:** `anyhow::bail!("AI rate limited for too long (429): waited {}s, {} API calls made", started.elapsed().as_secs(), api_calls)`
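
A minimal, std-only sketch of building that message from values the retry loop already tracks. The function name and the `max_wait` budget parameter are hypothetical, not taken from `verdict.rs`; the returned string is what would be passed to `anyhow::bail!`.

```rust
use std::time::Duration;

/// Builds the "gave up" message. `max_wait` is an assumed configured budget,
/// included so a reader can compare time waited against the limit.
fn rate_limit_message(elapsed: Duration, max_wait: Duration, api_calls: u32) -> String {
    format!(
        "AI rate limited for too long (429): waited {}s of a {}s budget, {} API calls made",
        elapsed.as_secs(),
        max_wait.as_secs(),
        api_calls
    )
}
```

Stating the budget next to the elapsed time lets an operator see at a glance whether the limit itself is too short.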

**3. `"missing AI message content"` — no response body context**

- **File:** `src/verdict.rs:492`
- **Current:** `anyhow::bail!("missing AI message content")`
- **Problem:** The error doesn't indicate what was actually received. Was it an empty response? A different JSON structure? A null field?
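- **Suggested:** Report which top-level keys the response did contain, which shows the shape without logging the whole body: `anyhow::bail!("missing AI message content in response (keys: {})", summarize_payload_keys(&payload))`. Here `summarize_payload_keys` is a small helper that would need to be added if it does not already exist.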

**4. `"failed to parse AI verdict content"` — no content preview**

- **File:** `src/verdict.rs:884`
- **Current:** `anyhow::bail!("failed to parse AI verdict content")`
- **Problem:** When the AI returns an unparseable verdict, operators have no idea what was returned. Is the JSON malformed? Is a required field missing?
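- **Suggested:** Include the parse error and a truncated content preview: `anyhow::bail!("failed to parse AI verdict content: {error} (content: {:?})", content.chars().take(200).collect::<String>())`. Truncate by characters rather than by byte slice (`&content[..200]`), because a byte slice panics when the cut falls inside a multi-byte UTF-8 character.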
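
The truncation itself can be sketched with the standard library alone. The helper names below (`preview`, `verdict_parse_message`) are illustrative assumptions, and `parse_error` stands in for whatever error the JSON parser produced.

```rust
/// Returns at most `max_chars` characters of `content`, appending "..." when
/// something was cut off. Iterating over `chars()` never splits a multi-byte
/// UTF-8 sequence, unlike slicing by byte index.
fn preview(content: &str, max_chars: usize) -> String {
    let mut out: String = content.chars().take(max_chars).collect();
    if content.chars().nth(max_chars).is_some() {
        out.push_str("...");
    }
    out
}

/// Hypothetical message builder for the verdict parse failure.
fn verdict_parse_message(parse_error: &str, content: &str) -> String {
    format!(
        "failed to parse AI verdict content: {parse_error} (content: {:?})",
        preview(content, 200)
    )
}
```

Formatting the preview with `{:?}` escapes newlines and control characters, so a multi-line AI response stays on one log line.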

**5. `"Invalid identifier format (expected 32 hex chars)"` — missing the actual input**

- **File:** `src/lib.rs:482`
- **Current:** `anyhow::bail!("Invalid identifier format (expected 32 hex chars)")`
- **Problem:** Doesn't show what was actually received.
- **Suggested:** `anyhow::bail!("Invalid identifier format: expected 32 hex chars, got {:?}", value)` (truncate long values)
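
One way to echo the input while keeping log lines bounded is sketched below. The function name, the 48-character cap, and the choice to accept upper-case hex are all assumptions for illustration; the actual validation in `src/lib.rs` may differ.

```rust
/// Validates a 32-character hex identifier. On failure, the message echoes a
/// bounded, debug-escaped copy of the input so logs stay readable.
fn validate_identifier(value: &str) -> Result<(), String> {
    let ok = value.len() == 32 && value.bytes().all(|b| b.is_ascii_hexdigit());
    if ok {
        return Ok(());
    }
    // Cap the echoed input at 48 characters (an arbitrary illustrative limit).
    let shown: String = value.chars().take(48).collect();
    let ellipsis = if value.chars().count() > 48 { "..." } else { "" };
    Err(format!(
        "Invalid identifier format: expected 32 hex chars, got {shown:?}{ellipsis} ({} bytes)",
        value.len()
    ))
}
```

Reporting the byte length alongside the preview helps distinguish a truncated ID from one that contains stray characters.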

### 🟡 Medium Priority — Error messages missing operational context

**6. `"Invalid .jar archive: {root_label}"` — no underlying cause**

- **File:** `src/analysis/archive.rs:36`
- **Current:** `.with_context(|| format!("Invalid .jar archive: {root_label}"))`
- **Problem:** Wraps the error but the root cause (corrupt ZIP, truncated file, wrong magic bytes) is buried in the anyhow chain. Operators see "Invalid .jar archive" but not why.
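- **Suggested:** Add a hint, e.g. `.with_context(|| format!("Invalid .jar archive: {root_label} (check that the file is a valid ZIP/JAR)"))`, and make sure the code that logs or returns the error renders the full chain rather than only the outermost message. With `anyhow`, formatting the error with `{:#}` prints the context followed by each underlying cause.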
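
To illustrate what "rendering the full chain" means, here is a std-only sketch that walks `std::error::Error::source()`. `ArchiveError` and `render_chain` are hypothetical stand-ins for an `anyhow` context layer, not types from the codebase.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper error, standing in for an `anyhow` context layer.
#[derive(Debug)]
struct ArchiveError {
    label: String,
    source: std::io::Error,
}

impl fmt::Display for ArchiveError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Invalid .jar archive: {}", self.label)
    }
}

impl Error for ArchiveError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Joins an error and all of its sources into one line, outermost first.
fn render_chain(err: &dyn Error) -> String {
    let mut parts = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        parts.push(cause.to_string());
        current = cause.source();
    }
    parts.join(": ")
}
```

If only `err.to_string()` reaches the operator, the cause is lost even though it is present in the chain.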

**7. `"failed to build MalwareBazaar HTTP client"` — no reason**

- **File:** `src/malwarebazaar.rs:43`
- **Current:** `.context("failed to build MalwareBazaar HTTP client")?`
- **Problem:** `reqwest::Client::builder().build()` rarely fails. When it does, it's usually a TLS backend issue. The generic message gives no debugging direction.
- **Suggested:** `.context("failed to build MalwareBazaar HTTP client (TLS backend issue?)")?`

**8. `"MalwareBazaar request failed"` — no status code or response body**

- **File:** `src/malwarebazaar.rs:51`
- **Current:** `.context("MalwareBazaar request failed")?`
- **Problem:** The `.send()` can fail for many reasons (DNS, timeout, connection refused, TLS). The generic "request failed" doesn't help diagnose whether it's a network issue, auth issue, or service outage.
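- **Suggested:** Include the failure class that `reqwest::Error` exposes, for example whether `is_timeout()` or `is_connect()` is true, plus `status()` when the error carries an HTTP status.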

**9. `"failed to decode MalwareBazaar response"` — no content type or length info**

- **File:** `src/malwarebazaar.rs:56`
- **Current:** `.context("failed to decode MalwareBazaar response")?`
- **Problem:** Response could be HTML (error page), non-JSON, or truncated JSON.
- **Suggested:** `.context("failed to decode MalwareBazaar response (expected JSON)")?`

**10. `"AI API returned status {status}: {payload}"` — good but inconsistent**

- **File:** `src/verdict.rs:480`
- **Current:** `anyhow::bail!("AI API returned status {status}: {payload}")`
- **Note:** This is actually one of the better error messages in the codebase — it includes both the status code and response body. However, for very large payloads, this could flood logs. Consider truncating `{payload}` to 500 chars.

### 🟡 Medium Priority — Silent failure modes

**11. Archive entry read failure silently skipped**

- **File:** `src/analysis/archive.rs:128`
- **Pattern:** `Err(error) => { /* logged at debug level, entry skipped */ }`
- **Problem:** If an archive entry fails to read (e.g., encrypted ZIP entry, corrupt data), it's logged at debug level and silently skipped. For a security scanner, missing entries could mean missing malware indicators.
- **Suggested:** Log at `warn!` level with the entry path and error. Consider surfacing a count of skipped entries in the scan response.
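
A rough sketch of counting skipped entries so the total can be surfaced in the scan response. `EntryReadStats` and `collect_entries` are hypothetical names, entry read errors are simplified to `String`, and `eprintln!` stands in for a `tracing::warn!` call.

```rust
/// Hypothetical per-scan bookkeeping; not a type from the codebase.
#[derive(Debug, Default)]
struct EntryReadStats {
    read: usize,
    skipped: Vec<(String, String)>, // (entry path, error text)
}

/// Consumes (path, read result) pairs, keeping successful payloads and
/// recording every failure instead of dropping it silently.
fn collect_entries<I>(entries: I) -> (Vec<(String, Vec<u8>)>, EntryReadStats)
where
    I: IntoIterator<Item = (String, Result<Vec<u8>, String>)>,
{
    let mut stats = EntryReadStats::default();
    let mut ok = Vec::new();
    for (path, result) in entries {
        match result {
            Ok(bytes) => {
                stats.read += 1;
                ok.push((path, bytes));
            }
            Err(error) => {
                // Placeholder for a warn-level log in the real service.
                eprintln!("warning: skipped archive entry {path:?}: {error}");
                stats.skipped.push((path, error));
            }
        }
    }
    (ok, stats)
}
```

Returning the skipped list alongside the readable entries lets the caller decide whether a non-zero count should lower confidence in a "clean" verdict.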

**12. Metadata parsing failures produce `MetadataFinding` but no actionable context**

- **File:** `src/analysis/metadata.rs:149, 284, 335`
- **Pattern:** `Err(error) => { /* creates a MetadataFinding with format! message */ }`
- **Problem:** The error messages like `"Failed to parse fabric.mod.json: {error}"` are good, but they're stored as metadata findings, not errors. They won't show up in error logs or trigger alerts.
- **Suggested:** Also log at `warn!` level when metadata parsing fails, since this could indicate a tampered or obfuscated mod.

### 🟢 Low Priority — Minor improvements

**13. `"invalid class magic"` — missing the actual magic bytes**

- **File:** `src/analysis/classfile_evidence.rs:155`
- **Current:** `return Err(anyhow!("invalid class magic"))`
- **Suggested:** `return Err(anyhow!("invalid class magic: expected 0xCAFEBABE, got {:#010X}", magic))`
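
A std-only sketch of the check with the observed value in the message. The function name and the extra "too short" branch are assumptions; the real parser may read the magic differently.

```rust
const CLASS_MAGIC: u32 = 0xCAFE_BABE;

/// Checks the first four bytes of a class file and reports the observed
/// value on mismatch.
fn check_class_magic(bytes: &[u8]) -> Result<(), String> {
    let Some(head) = bytes.get(..4) else {
        return Err(format!(
            "invalid class magic: file is only {} bytes, need at least 4",
            bytes.len()
        ));
    };
    // Class files store the magic number big-endian.
    let magic = u32::from_be_bytes([head[0], head[1], head[2], head[3]]);
    if magic != CLASS_MAGIC {
        return Err(format!(
            "invalid class magic: expected {CLASS_MAGIC:#010X}, got {magic:#010X}"
        ));
    }
    Ok(())
}
```

Seeing the actual bytes is often enough to identify the file type: `0x504B0304`, for instance, is the ZIP local file header signature, which would suggest a nested archive rather than a class file.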

**14. `"unsupported constant pool tag {tag}"` — missing offset context**

- **File:** `src/analysis/classfile_evidence.rs:203`
- **Current:** `return Err(anyhow!("unsupported constant pool tag {tag}"))`
- **Suggested:** `return Err(anyhow!("unsupported constant pool tag {tag} at index {cp_index} (offset {offset})"))`

**15. `"offset overflow while parsing class file"` — missing the specific offset and length**

- **File:** `src/analysis/classfile_evidence.rs:244`
- **Suggested:** Include the offset and requested length in the error message.

**16. `"unexpected end of class file while parsing constant pool"` — missing file size and offset**

- **File:** `src/analysis/classfile_evidence.rs:246-248`
- **Suggested:** Include the current offset and total file size: `"unexpected end of class file at offset {end} (file size: {} bytes)"`
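
Findings 15 and 16 can be covered by one bounds-checked read helper, sketched here under the assumption that the parser works on an in-memory byte slice. `read_slice` is a hypothetical name.

```rust
/// Returns `len` bytes starting at `offset`, or a message that names the
/// offset, the requested length, and the total file size.
fn read_slice(data: &[u8], offset: usize, len: usize) -> Result<&[u8], String> {
    // `checked_add` catches the overflow case before any indexing happens.
    let end = offset.checked_add(len).ok_or_else(|| {
        format!("offset overflow while parsing class file: offset {offset} + length {len}")
    })?;
    // `get` returns `None` instead of panicking when the range is out of bounds.
    data.get(offset..end).ok_or_else(|| {
        format!(
            "unexpected end of class file: need bytes {offset}..{end} (file size: {} bytes)",
            data.len()
        )
    })
}
```

With both numbers in the message, a truncated download (small file size) is easy to tell apart from a crafted length field (huge requested range).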

---

## Patterns Worth Noting

### Good patterns already in use:
- ✅ `.with_context(|| format!(...))` for adding path info to I/O errors
- ✅ Structured `tracing::warn!` with key-value pairs in the retry loop
- ✅ `AppError` with typed constructors (`bad_request`, `not_found`, `internal`, `payload_too_large`)
- ✅ Proper HTTP status code mapping in `main.rs`

### Anti-patterns to address:
- ❌ `anyhow::bail!()` without the relevant input that triggered the error
- ❌ Error messages that describe what happened but not what to do about it
- ❌ Debug-level logging for potentially security-relevant events (skipped archive entries)
- ❌ Missing truncation for potentially large payloads in error messages

---

## Recommendations Summary

| Priority | Finding | File | Action |
|----------|---------|------|--------|
| 🔴 High | Missing upload ID in error | `scan.rs:55` | Include `request.upload_id` |
| 🔴 High | Missing timing in rate limit error | `verdict.rs:470` | Include elapsed time and API call count |
| 🔴 High | Missing response context in AI errors | `verdict.rs:492,884` | Include truncated response preview |
| 🔴 High | Missing actual input in validation error | `lib.rs:482` | Show what was received |
| 🟡 Medium | Missing root cause for archive errors | `archive.rs:36` | Add hint about expected format |
| 🟡 Medium | Generic HTTP client build error | `malwarebazaar.rs:43` | Add TLS hint |
| 🟡 Medium | Silent entry skipping | `archive.rs:128` | Upgrade to `warn!` logging |
| 🟡 Medium | Metadata failures only in findings | `metadata.rs` | Also log at `warn!` level |
| 🟢 Low | Missing magic bytes in class parse error | `classfile_evidence.rs:155` | Show actual vs expected |
| 🟢 Low | Missing offset in constant pool error | `classfile_evidence.rs:203` | Add offset and index |

---

*Generated by [nightshift](https://github.com/Microck/hermes-nightshift-glm) — error-msg-improve task*

