diff --git a/CHANGELOG.md b/CHANGELOG.md index 92e31a7..d4b93c5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,8 +6,16 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and ## [Unreleased] +### Added + +- **API-key prefix and IPv4 scrubbing (KD-0887).** The Scrubber now redacts Stripe-style `sk_live_...` keys, AWS `AKIA...` access key IDs, and IPv4 addresses, in addition to the existing JWT / Bearer / BSN / email coverage. +- **Database DSN password scrubbing (KD-0887).** Any `scheme://user:pass@host` credential — not just a fixed list of DB scheme names — has its password redacted (`scheme://user:[REDACTED:dsn-password]@host`), closing the most severe gap named in the KD-0885 audit debrief. +- **`QueryException` / `PDOException` carrier-strip (KD-0887).** A `QueryException` message embeds the full SQL string with bound parameter values interpolated in — free-text data (a name, an address, a care-data note) that is not a regex-able secret shape and previously reached the payload unredacted on the most common database-error path (surfaced by the emmie annexation, which carries NEN 7510 / AVG care data). `ErrorTracker` now replaces any `PDOException`-family message with `class [SQLSTATE x] [driver code y]` before the Scrubber even runs, dropping the SQL and bindings entirely while preserving the fingerprint. +- **`PathNormalizer` username redaction fallback (KD-0887, M-3).** A stack frame whose path does not start with `base_path()` (vendor installed outside the app root, a globally-installed tool) previously leaked the absolute path verbatim, including the OS username. A secondary pass now redacts the username segment of any `/home//` or `/Users//` shape in those frames. + ### Fixed +- **BSN eleven-test (Dutch: elfproef) validation (KD-0887, M-1).** The KD-0885 fix widened the BSN pattern to any run of 9-or-more digits, which resolved the missed-detection leak but over-redacted legitimate 9+-digit IDs (order numbers, invoice IDs, timestamps) as `[REDACTED:bsn]`. BSN candidates — both grouped (`123.456.782`) and bare digit runs — are now validated against the eleven-test checksum before redaction, so only a number that is actually shaped like a real BSN redacts. A digit run is scanned for any contiguous 9-digit window that passes, so a real BSN embedded in a longer number is still caught without redacting the surrounding digits. - **Eager `Dispatcher` injection broke the never-throw invariant.** `ErrorTracker` took `Illuminate\Contracts\Bus\Dispatcher` as a constructor dependency, so resolving the service threw a `BindingResolutionException` *before* `report()`'s try/catch whenever the Bus deferred provider was unresolvable. Because `report()` runs inside the consumer's exception handler, that throw escaped the reportable callback and replaced the original error with `Target [Illuminate\Contracts\Bus\Dispatcher] is not instantiable` (observed in a Laravel 12 app). The bus is now resolved lazily from the container inside `report()`'s guard, so the failure is swallowed and the original error is preserved. ## [0.1.0] — 2026-06-08 diff --git a/CLAUDE.md b/CLAUDE.md index 765897e..ca5e311 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -26,8 +26,8 @@ The client targets the kendo error-events ingestion endpoint shipped by KD-0771: | Class | Responsibility | |---|---| -| `ErrorTracker` | Public surface. `report(\Throwable)`: builds the scrubbed/normalized payload synchronously, then sends inline (sync) or dispatches `ReportErrorJob` (async). `send(array)`: the HTTP POST, swallow-on-failure. Reads config live from the `Config` repository. | -| `Scrubber` | Redacts JWT / Bearer / BSN / email from a string. | +| `ErrorTracker` | Public surface. `report(\Throwable)`: builds the scrubbed/normalized payload synchronously, then sends inline (sync) or dispatches `ReportErrorJob` (async). Any `\PDOException` (including Laravel's `QueryException`, which extends it) gets a carrier-strip first — the message becomes `class [SQLSTATE x] [driver code y]`, dropping the SQL string and bound parameter values before the Scrubber even runs. `send(array)`: the HTTP POST, swallow-on-failure. Reads config live from the `Config` repository. | +| `Scrubber` | Redacts JWT / Bearer / DSN password / API-key prefix / IPv4 / BSN (eleven-test validated) / email from a string. | | `PathNormalizer` | Exact `base_path()` prefix strip of every frame (mirrors `laravel/nightwatch`'s `Location::normalizeFile()`). | | `Jobs\ReportErrorJob` | Async carrier for the already-scrubbed payload. `$tries = 1` (0 retries); `failed()` logs to `error_log`, never requeues. | | `ErrorTrackerServiceProvider` | Auto-discovered. Merges + publishes config; binds `ErrorTracker` + `PathNormalizer` (wired to the app's `base_path()`). | @@ -54,6 +54,6 @@ The client targets the kendo error-events ingestion endpoint shipped by KD-0771: SemVer. Pre-1.0 (`0.x`): minor bumps are treated as breaking (Composer's `^0.x` caret locks at minor). `main` is always release-ready; PRs update `CHANGELOG.md` under `[Unreleased]`; a release PR moves it to a versioned heading and tags the merge commit (`v0.x.y`). Packagist's webhook picks up the tag; `release.yml` re-runs CI and creates the GitHub release. -## Out of scope (v1) +## Out of scope -Client-side coalescing/debounce (server owns dedup + rate limit), per-project custom scrub rules (v1.5), a framework-agnostic core (Laravel-only), a JS/TS client (v1.5+), Sentry shim, context fields (schema-banned), API-key-prefix / IPv4 scrubbing (v1.5). +Client-side coalescing/debounce (server owns dedup + rate limit), per-project custom scrub rules (v1.5), a framework-agnostic core (Laravel-only), a JS/TS client (v1.5+), Sentry shim, context fields (schema-banned), phone numbers / session IDs as scrub patterns (no shape distinct enough to redact without a high false-positive rate), a `RequestException` message-body carrier-strip (raised as a "consider" in the KD-0887 discussion, referencing the war-room Nightwatch-egress recon — no concrete shape/threshold was specified, so it's deferred pending that follow-up rather than guessed at here). diff --git a/README.md b/README.md index bea0718..9836a5c 100644 --- a/README.md +++ b/README.md @@ -118,9 +118,16 @@ Before send, the message and stack trace are scrubbed of the following patterns |---|---| | JWT | `eyJhbGc...` (three base64url segments) | | Bearer token | `Bearer ` | -| BSN (Dutch citizen service number) | a 9-digit run | +| Database DSN password | `mysql://user:pass@host` — only the password is redacted | +| API-key prefix | `sk_live_...`, `AKIA...` | +| IPv4 address | `192.168.1.42` | +| BSN (Dutch citizen service number) | a 9-digit run passing the eleven-test (Dutch: elfproef) checksum | | Email address | `user@example.com` | +BSN candidates are checksum-validated (the eleven-test) before redaction, so an arbitrary 9+-digit ID (an order number, invoice ID, or timestamp) is not falsely redacted, and a real BSN embedded in a longer digit run (e.g. a phone number) is still caught. + +Free-text PII that isn't a fixed secret shape — a name, address, or care-data value embedded in a database error message — is not covered by pattern matching. `QueryException` and `PDOException` are instead handled by a per-exception-type carrier-strip: the message is replaced with just the exception class, SQLSTATE, and driver error code, dropping the SQL string and bound parameter values entirely. + ## Path normalization Each stack frame's absolute path has the app's own `base_path()` stripped (an **exact** prefix removal, mirroring `laravel/nightwatch`'s `Location::normalizeFile()`). The same exception thrown from `/var/www/html/app/Foo.php` and `/home/forge/app/Foo.php` normalizes to the identical `app/Foo.php`, so kendo fingerprints it once regardless of deploy root. diff --git a/src/ErrorTracker.php b/src/ErrorTracker.php index d4d2071..177cc8d 100644 --- a/src/ErrorTracker.php +++ b/src/ErrorTracker.php @@ -8,6 +8,7 @@ use Illuminate\Contracts\Config\Repository as Config; use Illuminate\Contracts\Container\Container; use Illuminate\Http\Client\Factory as HttpFactory; +use PDOException; use ScriptDevelopment\KendoErrorTracker\Jobs\ReportErrorJob; use Throwable; @@ -125,7 +126,7 @@ public function send(array $payload): void */ private function buildPayload(Throwable $throwable): array { - $message = $this->scrubber->scrub($throwable->getMessage()); + $message = $this->scrubber->scrub($this->safeMessage($throwable)); $stackTrace = $this->scrubber->scrub( $this->pathNormalizer->normalize($throwable->getTraceAsString()), ); @@ -145,6 +146,44 @@ private function buildPayload(Throwable $throwable): array return array_filter($payload, static fn(mixed $value): bool => $value !== null); } + /** + * Resolve the message to send: a database-carrier strip for any + * `PDOException` (including Laravel's `QueryException`, which extends + * it), otherwise the exception's own message unchanged (still passed + * through the Scrubber afterwards either way). + * + * A `QueryException` message embeds the full SQL string with bound + * parameter values interpolated in — free-text data (a name, an address, + * a care-data note) that is not a regex-able secret shape and would + * otherwise leak on the most common database-error path. The fingerprint + * (exception class + SQLSTATE + driver error code) survives; the bound + * values do not. + */ + private function safeMessage(Throwable $throwable): string + { + return $throwable instanceof PDOException + ? $this->databaseCarrierMessage($throwable) + : $throwable->getMessage(); + } + + /** + * Build the class + SQLSTATE + driver-error-code fingerprint that + * replaces a database exception's message. `errorInfo` is PDO's + * `[SQLSTATE, driver code, driver message]` triple; `QueryException` + * copies it from its wrapped PDOException. Either piece may be absent + * (mocked/manually-constructed exceptions, or a previous exception that + * was not itself a PDOException), so both fall back to "unknown". + */ + private function databaseCarrierMessage(PDOException $throwable): string + { + $errorInfo = $throwable->errorInfo; + + $sqlState = isset($errorInfo[0]) && is_scalar($errorInfo[0]) ? (string) $errorInfo[0] : 'unknown'; + $driverCode = isset($errorInfo[1]) && is_scalar($errorInfo[1]) ? (string) $errorInfo[1] : 'unknown'; + + return sprintf('%s [SQLSTATE %s] [driver code %s]', $throwable::class, $sqlState, $driverCode); + } + /** * Read a string config value, narrowing the repository's mixed return. * Non-scalar / null values collapse to an empty string. diff --git a/src/PathNormalizer.php b/src/PathNormalizer.php index 48068ef..b51e7f8 100644 --- a/src/PathNormalizer.php +++ b/src/PathNormalizer.php @@ -6,6 +6,7 @@ use const DIRECTORY_SEPARATOR; +use function preg_replace; use function str_replace; /** @@ -18,9 +19,20 @@ * mirroring `laravel/nightwatch`'s `Location::normalizeFile()` — not a guessed * list of common deploy-root prefixes (that is the server's best-effort fallback * for raw-HTTP callers, never the client's). + * + * A frame whose path does NOT start with `base_path()` (vendor installed + * outside the app root, a globally-installed tool) is left otherwise + * untouched by that strip, but still leaks the OS username via `/home//` + * or `/Users//` (M-3). A secondary redaction pass replaces just the + * username segment of those two shapes, everywhere in the trace, after the + * exact-prefix strip runs. */ final readonly class PathNormalizer { + private const string HOME_USERNAME = '#/home/[^/\s]+/#'; + + private const string MAC_USERNAME = '#/Users/[^/\s]+/#'; + private string $prefix; public function __construct(string $basePath) @@ -31,16 +43,22 @@ public function __construct(string $basePath) } /** - * Replace every occurrence of the base-path prefix in the trace string. + * Replace every occurrence of the base-path prefix in the trace string, + * then redact the username segment of any remaining `/home//` or + * `/Users//` path that did not start with the prefix. * * `getTraceAsString()` embeds absolute file paths inline (`#3 /abs/app/Foo.php(10): ...`), * so a global replace of the prefix normalizes every frame at once. Paths * that do not start with the prefix (vendor under a symlinked store, the - * trailing `{main}` marker) are left untouched — exactly nightwatch's - * "return unchanged when the prefix does not match" behavior. + * trailing `{main}` marker) are left otherwise untouched — exactly + * nightwatch's "return unchanged when the prefix does not match" behavior + * — but still pass through the username-redaction fallback below. */ public function normalize(string $trace): string { - return str_replace($this->prefix, '', $trace); + $trace = str_replace($this->prefix, '', $trace); + $trace = (string) preg_replace(self::HOME_USERNAME, '/home/[REDACTED:user]/', $trace); + + return (string) preg_replace(self::MAC_USERNAME, '/Users/[REDACTED:user]/', $trace); } } diff --git a/src/Scrubber.php b/src/Scrubber.php index 4ea7130..f61fc1c 100644 --- a/src/Scrubber.php +++ b/src/Scrubber.php @@ -4,6 +4,9 @@ namespace ScriptDevelopment\KendoErrorTracker; +use function mb_strlen; +use function mb_substr; +use function preg_match; use function preg_replace; use function preg_replace_callback; use function str_replace; @@ -12,20 +15,25 @@ * Redacts PII and secrets from outbound error payloads. * * Scrubbing happens source-side, before send, so the kendo server never - * receives the raw values. The v1 pattern set covers the leaks seen most - * often in exception messages and stack traces: + * receives the raw values. The pattern set covers the leaks seen most often + * in exception messages and stack traces: * * - JWTs (any `eyJ...` base64url triple) * - HTTP `Bearer ` authorization values (and bare reuse of the same * credential value elsewhere in the string) + * - Database DSN passwords (`scheme://user:pass@host`) + * - API-key prefixes (`sk_live_...`, `AKIA...`) + * - IPv4 addresses * - Dutch BSN (9-digit citizen service number, including grouped forms and - * runs embedded in a longer numeric string) + * runs embedded in a longer numeric string), validated with the + * eleven-test (Dutch: elfproef) checksum so an arbitrary 9+-digit ID is + * not falsely redacted * - email addresses (including IDN / unicode local parts) * * Each redaction collapses the match to a fixed `[REDACTED:]` marker so * the scrubbed string stays readable while carrying no recoverable value. * - * Scope boundaries (deferred to v1.5): + * Scope boundaries (deferred beyond v1.5): * - JWT detection requires the conventional `eyJ` base64url header (the encoded * `{"` JSON object start). Non-`eyJ` / truncated tokens are out of scope here; * widening to every JWT shape risks redacting innocuous dotted identifiers. @@ -33,9 +41,13 @@ * space are not fully covered — the unicode class deliberately excludes the * space to avoid swallowing surrounding prose. IDN domains and unicode (but * unquoted) local parts ARE covered. - * - BSN widening is regex-only: it accepts increased over-redaction of legit - * 9-digit IDs. The elfproef (11-proef) validator that resolves both the - * under- and over-detection directions together is deferred to v1.5. + * - Free-text PII (a name, address, or care-data value embedded in a message — + * e.g. a `QueryException`'s bound parameters) is not a regex-able secret + * shape and is out of scope for this scrubber; see the per-exception-type + * carrier-strip in `ErrorTracker::buildPayload()` for that class of leak. + * - Phone numbers and session IDs remain out of scope (no shape distinct + * enough from other numeric/opaque identifiers to redact without a high + * false-positive rate). */ final class Scrubber { @@ -54,19 +66,22 @@ final class Scrubber private const string BEARER = '/Bearer[ \t]+([A-Za-z0-9._~+-]+=*)/i'; /** - * BSN: a 9-digit citizen service number. Two shapes: - * - grouped with `.`, space, or `-` separators (`123.456.782`, - * `123 456 782`, `123-456-782`); - * - any run of 9-or-more consecutive digits, which also catches a 9-digit - * BSN embedded in a longer numeric string (`0612345678`). - * - * NOTE (tradeoff): widening to 9-or-more digits over-redacts legitimate - * 9+-digit IDs. This is intentional for now — the elfproef (11-proef) - * checksum validator that would distinguish a real BSN from an arbitrary - * digit run (resolving BOTH the missed-detection and over-redaction - * directions) is deferred to v1.5. + * DSN userinfo: `scheme://user:pass@`. Only the password (group 2) is + * redacted — the scheme and username carry little on their own, and + * keeping them intact keeps the scrubbed string legible. This is the + * general URI-userinfo shape, not a fixed list of DB scheme names, so it + * also catches non-DB DSNs that embed the same credential shape. */ - private const string BSN = '/\d{3}[.\s-]\d{3}[.\s-]\d{3}|\d{9,}/'; + private const string DSN_PASSWORD = '/([a-zA-Z][a-zA-Z0-9+.-]*:\/\/[^:\/?#\s@]+:)([^@\/?#\s]+)(@)/'; + + /** Stripe-style live secret key prefix. */ + private const string API_KEY_STRIPE = '/\bsk_live_[A-Za-z0-9]{10,}\b/'; + + /** AWS access key ID: `AKIA` followed by exactly 16 uppercase alnum chars. */ + private const string API_KEY_AWS = '/\bAKIA[0-9A-Z]{16}\b/'; + + /** IPv4 address, each octet bounded to 0-255. */ + private const string IPV4 = '/\b(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\b/'; /** * Email address. The `u` flag plus `\p{L}\p{N}` classes cover IDN domains @@ -75,16 +90,27 @@ final class Scrubber */ private const string EMAIL = '/[\p{L}\p{N}._%+-]+@[\p{L}\p{N}.-]+\.[\p{L}]{2,}/u'; + /** BSN candidate: 3-3-3 digits grouped with `.`, space, or `-` separators. */ + private const string BSN_GROUPED = '/\d{3}[.\s-]\d{3}[.\s-]\d{3}/'; + + /** BSN candidate: any run of 9-or-more consecutive digits. */ + private const string BSN_RUN = '/\d{9,}/'; + public function scrub(string $value): string { - // JWT and Bearer first: a JWT can contain an `@`, and a Bearer value - // can contain a 9-digit run — redacting the structured secrets before - // the looser email/BSN patterns avoids leaving a partial token behind. + // Structured secrets first, loosest patterns last: a JWT can contain + // an `@`, and a digit run can border a credential — redacting the + // narrower shapes first avoids leaving a partial secret behind or + // having a looser pattern consume part of a stricter match. $value = (string) preg_replace(self::JWT, '[REDACTED:jwt]', $value); $value = $this->scrubBearer($value); + $value = (string) preg_replace(self::DSN_PASSWORD, '$1[REDACTED:dsn-password]$3', $value); + $value = (string) preg_replace(self::API_KEY_STRIPE, '[REDACTED:api-key]', $value); + $value = (string) preg_replace(self::API_KEY_AWS, '[REDACTED:api-key]', $value); + $value = (string) preg_replace(self::IPV4, '[REDACTED:ip]', $value); $value = (string) preg_replace(self::EMAIL, '[REDACTED:email]', $value); - return (string) preg_replace(self::BSN, '[REDACTED:bsn]', $value); + return $this->scrubBsn($value); } /** @@ -115,4 +141,82 @@ static function(array $matches) use (&$credentials): string { return $value; } + + /** + * Redact BSN candidates that pass the eleven-test (Dutch: elfproef) + * checksum. + * + * Grouped forms are validated whole (all 9 digits must pass); bare digit + * runs are scanned for any contiguous 9-digit window that passes, so a + * real BSN embedded in a longer number (`0612345678`-shaped) is caught + * without redacting the surrounding digits that are not part of it. A + * candidate that fails the checksum is left untouched — resolving the + * KD-0885 M-1 over-redaction of legitimate 9+-digit IDs. + */ + private function scrubBsn(string $value): string + { + $value = (string) preg_replace_callback( + self::BSN_GROUPED, + fn(array $matches): string => $this->elevenTest((string) preg_replace('/\D/', '', $matches[0])) + ? '[REDACTED:bsn]' + : $matches[0], + $value, + ); + + return (string) preg_replace_callback( + self::BSN_RUN, + fn(array $matches): string => $this->redactValidBsnWindows($matches[0]), + $value, + ); + } + + /** + * Scan a run of consecutive digits left to right, redacting each + * non-overlapping 9-digit window that passes the eleven-test checksum and + * leaving every other digit untouched. + */ + private function redactValidBsnWindows(string $run): string + { + $length = mb_strlen($run); + $result = ''; + $i = 0; + + while ($i < $length) { + $window = $length - $i >= 9 ? mb_substr($run, $i, 9) : null; + + if ($window !== null && $this->elevenTest($window)) { + $result .= '[REDACTED:bsn]'; + $i += 9; + + continue; + } + + $result .= $run[$i]; + $i++; + } + + return $result; + } + + /** + * The eleven-test (Dutch: elfproef) checksum: weight digits 9..2 for + * positions 1-8 and -1 for position 9, sum, and require the total to be a + * non-zero multiple of 11. `000000000` satisfies the modulo trivially but + * is not a real BSN, hence the non-zero guard. + */ + private function elevenTest(string $digits): bool + { + if (preg_match('/^\d{9}$/', $digits) !== 1) { + return false; + } + + $sum = 0; + + for ($position = 0; $position < 9; $position++) { + $weight = $position === 8 ? -1 : 9 - $position; + $sum += ((int) $digits[$position]) * $weight; + } + + return $sum !== 0 && $sum % 11 === 0; + } } diff --git a/tests/Feature/DatabaseCarrierStripTest.php b/tests/Feature/DatabaseCarrierStripTest.php new file mode 100644 index 0000000..c8b5895 --- /dev/null +++ b/tests/Feature/DatabaseCarrierStripTest.php @@ -0,0 +1,99 @@ +set('error-tracker.kendo_url', 'https://kendo.test'); + config()->set('error-tracker.project', '7'); + config()->set('error-tracker.token', 'secret-token'); + config()->set('error-tracker.sync', true); +}); + +it('strips the SQL and bound values from a QueryException, keeping only the fingerprint', function(): void { + // The emmie annexation gap (KD-0887 comment): a QueryException message + // embeds the failing SQL WITH bound parameter values interpolated in, so + // free-text care data (a client name) reaches the message string. That is + // not a regex-able secret shape, so the Scrubber cannot catch it — the + // carrier-strip must replace the whole message before the Scrubber runs. + Http::fake(['kendo.test/*' => Http::response('', 202)]); + + $throwable = null; + + try { + DB::insert('insert into missing_table (name) values (?)', ['Jan de Vries']); + } catch (QueryException $exception) { + $throwable = $exception; + } + + expect($throwable)->toBeInstanceOf(QueryException::class); + + // Guard against a false-green: the care-data value and the SQL must be + // present in the RAW message before the carrier-strip runs. + expect($throwable->getMessage()) + ->toContain('Jan de Vries') + ->toContain('missing_table'); + + app(ErrorTracker::class)->report($throwable); + + Http::assertSent(function($request) use ($throwable): bool { + $body = $request->data(); + + expect($body['exception_class'])->toBe($throwable::class); + expect($body['message']) + ->toContain($throwable::class) + ->not->toContain('Jan de Vries') + ->not->toContain('missing_table') + ->not->toContain('insert into'); + + return true; + }); +}); + +it('strips a raw PDOException message the same way, not only Laravel\'s QueryException', function(): void { + Http::fake(['kendo.test/*' => Http::response('', 202)]); + + $throwable = new PDOException("Duplicate entry 'Jan de Vries' for key 'PRIMARY'"); + $throwable->errorInfo = ['23000', 1_062, "Duplicate entry 'Jan de Vries' for key 'PRIMARY'"]; + + app(ErrorTracker::class)->report($throwable); + + Http::assertSent(function($request): bool { + $body = $request->data(); + + expect($body['message']) + ->toBe(PDOException::class . ' [SQLSTATE 23000] [driver code 1062]') + ->not->toContain('Jan de Vries'); + + return true; + }); +}); + +it('falls back to "unknown" when a PDOException carries no errorInfo', function(): void { + Http::fake(['kendo.test/*' => Http::response('', 202)]); + + app(ErrorTracker::class)->report(new PDOException('some pdo failure')); + + Http::assertSent(function($request): bool { + expect($request->data()['message']) + ->toBe(PDOException::class . ' [SQLSTATE unknown] [driver code unknown]'); + + return true; + }); +}); + +it('does not carrier-strip a non-database exception', function(): void { + Http::fake(['kendo.test/*' => Http::response('', 202)]); + + app(ErrorTracker::class)->report(new RuntimeException('plain failure, not a database error')); + + Http::assertSent(function($request): bool { + expect($request->data()['message'])->toBe('plain failure, not a database error'); + + return true; + }); +}); diff --git a/tests/Feature/ReportTest.php b/tests/Feature/ReportTest.php index cb54a2a..471ef64 100644 --- a/tests/Feature/ReportTest.php +++ b/tests/Feature/ReportTest.php @@ -103,12 +103,13 @@ // Capture a real Throwable whose trace string genuinely carries a scrubable // token. getTraceAsString() elides argument values + the message, but embeds // each frame's defining FILE PATH — so the fixture lives at a path containing - // a BSN-shaped token (123456789). See the fixture file for the rationale. + // an eleven-test-valid BSN token (123456782). See the fixture file for the + // rationale. $throwable = captureThrowableFromTokenBearingPath(); // Guard against a false-green: the token must be present BEFORE scrubbing, // otherwise the redaction assertion below would pass trivially. - expect($throwable->getTraceAsString())->toContain('123456789'); + expect($throwable->getTraceAsString())->toContain('123456782'); app(ErrorTracker::class)->report($throwable); @@ -116,7 +117,7 @@ $body = $request->data(); expect($body['stack_trace']) ->toContain('[REDACTED:bsn]') - ->not->toContain('123456789'); + ->not->toContain('123456782'); return true; }); diff --git a/tests/Fixtures/trace-secret-123456789/throw_with_token_in_path.php b/tests/Fixtures/trace-secret-123456782/throw_with_token_in_path.php similarity index 76% rename from tests/Fixtures/trace-secret-123456789/throw_with_token_in_path.php rename to tests/Fixtures/trace-secret-123456782/throw_with_token_in_path.php index e6a8e72..a240063 100644 --- a/tests/Fixtures/trace-secret-123456789/throw_with_token_in_path.php +++ b/tests/Fixtures/trace-secret-123456782/throw_with_token_in_path.php @@ -9,8 +9,9 @@ * previous-exception messages — a token planted as an argument or in the message * never reaches the trace string (which would make a scrub assertion false-green). * It DOES embed the defining *file path* of each frame. This fixture therefore - * lives at a path carrying a scrubable BSN-shaped token (123456789) so the raw - * token genuinely appears in getTraceAsString() before scrubbing runs. + * lives at a path carrying a scrubable, eleven-test-valid BSN token (123456782) + * so the raw token genuinely appears in getTraceAsString() before scrubbing + * runs. */ return (static function(): never { diff --git a/tests/Pest.php b/tests/Pest.php index 96bc0f4..6d6afc0 100644 --- a/tests/Pest.php +++ b/tests/Pest.php @@ -9,13 +9,14 @@ pest()->extend(TestCase::class)->in('Feature'); /** - * Throw + catch a RuntimeException from a fixture file whose path carries a - * BSN-shaped token, so the captured Throwable's getTraceAsString() genuinely - * contains the token (in the frame's file path) before scrubbing runs. + * Throw + catch a RuntimeException from a fixture file whose path carries an + * eleven-test-valid BSN token, so the captured Throwable's getTraceAsString() + * genuinely contains the token (in the frame's file path) before scrubbing + * runs. */ function captureThrowableFromTokenBearingPath(): Throwable { - return captureThrowableFromFixture('trace-secret-123456789'); + return captureThrowableFromFixture('trace-secret-123456782'); } /** diff --git a/tests/Unit/PathNormalizerTest.php b/tests/Unit/PathNormalizerTest.php index 777996d..570f77f 100644 --- a/tests/Unit/PathNormalizerTest.php +++ b/tests/Unit/PathNormalizerTest.php @@ -43,10 +43,52 @@ ->toBe('#0 app/Foo.php(42): App\Services\Foo->bar()'); }); -it('leaves paths that do not start with the base path unchanged', function(): void { +it('leaves a non-matching path with no username shape unchanged', function(): void { $normalizer = new PathNormalizer('/var/www/html'); $trace = '#0 /opt/other/lib/Thing.php(7): unrelated()'; expect($normalizer->normalize($trace))->toBe($trace); }); + +it('redacts the username from a non-matching /home// path', function(): void { + // M-3: a frame whose path does not start with base_path() previously + // leaked the absolute path — including the OS username — verbatim. + $normalizer = new PathNormalizer('/var/www/html'); + + $trace = '#1 /home/forge/.composer/vendor/bin/tool.php(3): call()'; + + expect($normalizer->normalize($trace)) + ->toBe('#1 /home/[REDACTED:user]/.composer/vendor/bin/tool.php(3): call()') + ->not->toContain('forge'); +}); + +it('redacts the username from a non-matching /Users// path', function(): void { + // M-3: the macOS-style home directory shape leaks the same way. + $normalizer = new PathNormalizer('/var/www/html'); + + $trace = '#1 /Users/janedoe/.composer/vendor/bin/tool.php(3): call()'; + + expect($normalizer->normalize($trace)) + ->toBe('#1 /Users/[REDACTED:user]/.composer/vendor/bin/tool.php(3): call()') + ->not->toContain('janedoe'); +}); + +it('redacts usernames in every non-matching frame without double-processing a matching frame', function(): void { + // The app-root frame is stripped via the exact base_path() prefix (no + // /home/ segment survives to redact); the vendor frame under a different + // user's home directory falls through to the username-redaction pass. + $normalizer = new PathNormalizer('/home/forge/app'); + + $trace = implode("\n", [ + '#0 /home/forge/app/app/Services/Foo.php(10): doThing()', + '#1 /home/deploy/.composer/vendor/bin/tool.php(3): call()', + '#2 {main}', + ]); + + expect($normalizer->normalize($trace))->toBe(implode("\n", [ + '#0 app/Services/Foo.php(10): doThing()', + '#1 /home/[REDACTED:user]/.composer/vendor/bin/tool.php(3): call()', + '#2 {main}', + ])); +}); diff --git a/tests/Unit/ScrubberTest.php b/tests/Unit/ScrubberTest.php index 15f65f5..adbd5b8 100644 --- a/tests/Unit/ScrubberTest.php +++ b/tests/Unit/ScrubberTest.php @@ -26,16 +26,30 @@ ->not->toContain('abc123DEF456ghi'); }); -it('redacts a BSN', function(): void { - $out = $this->scrubber->scrub('citizen 123456789 reported'); +it('redacts a valid BSN', function(): void { + // 111222333 passes the eleven-test checksum (a real BSN shape), unlike an + // arbitrary 9-digit run. + $out = $this->scrubber->scrub('citizen 111222333 reported'); expect($out) ->toContain('[REDACTED:bsn]') - ->not->toContain('123456789'); + ->not->toContain('111222333'); +}); + +it('does not redact a 9-digit run that fails the eleven-test checksum', function(): void { + // M-1: the KD-0885 fix over-redacted any 9+-digit run. 123456789 is + // shaped like a BSN but fails the checksum, so it must now pass through + // untouched — this is a legitimate order/invoice/timestamp-shaped ID. + $clean = 'order 123456789 placed'; + + expect($this->scrubber->scrub($clean)) + ->toBe($clean) + ->not->toContain('[REDACTED:bsn]'); }); it('redacts a formatted BSN', function(string $formatted): void { // C-1: grouped forms with `.`, space, or `-` separators must redact. + // 123.456.782 / 123 456 782 / 123-456-782 all pass the eleven-test checksum. $out = $this->scrubber->scrub("citizen {$formatted} reported"); expect($out) @@ -47,14 +61,46 @@ 'dashed' => '123-456-782', ]); -it('redacts a 9-digit BSN embedded in a longer numeric run', function(): void { - // C-1: a BSN hidden inside a longer digit string (e.g. a phone number) - // must not escape because it borders other digits. - $out = $this->scrubber->scrub('phone 0612345678 logged'); +it('does not redact a formatted 9-digit group that fails the checksum', function(): void { + // M-1: the grouped form is validated too — a formatted-looking number that + // fails the eleven-test (123.456.789) must not redact. + $clean = 'ref 123.456.789 recorded'; + + expect($this->scrubber->scrub($clean)) + ->toBe($clean) + ->not->toContain('[REDACTED:bsn]'); +}); + +it('redacts a valid BSN embedded in a longer numeric run', function(): void { + // C-1: a real BSN hidden inside a longer digit string must not escape + // because it borders other digits. 123456782 (valid) is embedded starting + // at index 1 of the 10-digit run; the leading "9" is not part of the BSN + // and is left untouched. + $out = $this->scrubber->scrub('id 9123456782 here'); expect($out) - ->toContain('[REDACTED:bsn]') - ->not->toContain('0612345678'); + ->toBe('id 9[REDACTED:bsn] here') + ->not->toContain('123456782'); +}); + +it('does not redact a phone-number-shaped digit run with no valid BSN window', function(): void { + // M-1: 0612345678 is a realistic Dutch mobile number. Neither of its two + // possible 9-digit windows passes the eleven-test checksum, so — unlike + // the KD-0885 bare-regex behavior — it must now pass through untouched. + $clean = 'phone 0612345678 logged'; + + expect($this->scrubber->scrub($clean)) + ->toBe($clean) + ->not->toContain('[REDACTED:bsn]'); +}); + +it('does not redact a 10-digit run containing no valid BSN window', function(): void { + // M-1: neither 9-digit window of 1234567890 passes the eleven-test checksum. + $clean = 'id 1234567890 here'; + + expect($this->scrubber->scrub($clean)) + ->toBe($clean) + ->not->toContain('[REDACTED:bsn]'); }); it('redacts an email address', function(): void { @@ -83,8 +129,59 @@ ->not->toContain('José@example.com'); }); +it('redacts a database DSN password', function(): void { + $out = $this->scrubber->scrub('connection failed: mysql://root:s3cr3tPass@127.0.0.1:3306/app'); + + expect($out) + ->toContain('mysql://root:[REDACTED:dsn-password]@') + ->toContain('[REDACTED:ip]') + ->not->toContain('s3cr3tPass'); +}); + +it('redacts a DSN password regardless of scheme', function(): void { + $out = $this->scrubber->scrub('redis://default:hunter2@cache.internal:6379/0'); + + expect($out) + ->toContain('redis://default:[REDACTED:dsn-password]@') + ->not->toContain('hunter2'); +}); + +it('redacts a Stripe-style live API key', function(): void { + // Deliberately low-entropy placeholder (not a real key shape) so this + // fixture doesn't trip secret-scanning push protection on the repo. + $out = $this->scrubber->scrub('stripe call failed with key sk_live_FAKEFAKEFAKEFAKEFAKE01'); + + expect($out) + ->toContain('[REDACTED:api-key]') + ->not->toContain('sk_live_FAKEFAKEFAKEFAKEFAKE01'); +}); + +it('redacts an AWS access key id', function(): void { + $out = $this->scrubber->scrub('credentials AKIAIOSFODNN7EXAMPLE rejected'); + + expect($out) + ->toContain('[REDACTED:api-key]') + ->not->toContain('AKIAIOSFODNN7EXAMPLE'); +}); + +it('redacts an IPv4 address', function(): void { + $out = $this->scrubber->scrub('connection refused from 192.168.1.42'); + + expect($out) + ->toContain('[REDACTED:ip]') + ->not->toContain('192.168.1.42'); +}); + +it('does not redact an out-of-range octet as an IPv4 address', function(): void { + $clean = 'value 999.999.999.999 is not an ip'; + + expect($this->scrubber->scrub($clean)) + ->toBe($clean) + ->not->toContain('[REDACTED:ip]'); +}); + it('redacts multiple patterns in one string', function(): void { - $input = 'user alice@corp.nl bsn 987654321 token Bearer xyz789'; + $input = 'user alice@corp.nl bsn 456789017 token Bearer xyz789'; $out = $this->scrubber->scrub($input); @@ -93,7 +190,7 @@ ->toContain('[REDACTED:bsn]') ->toContain('[REDACTED:bearer]') ->not->toContain('alice@corp.nl') - ->not->toContain('987654321') + ->not->toContain('456789017') ->not->toContain('xyz789'); }); @@ -103,18 +200,6 @@ expect($this->scrubber->scrub($clean))->toBe($clean); }); -it('now redacts a 10-digit run that contains a 9-digit BSN', function(): void { - // C-1 behavior change: the widened BSN pattern matches any run of 9-or-more - // digits so a BSN cannot hide inside a longer digit string. This accepts - // increased over-redaction of legit 9+-digit IDs until the v1.5 elfproef - // validator distinguishes a real BSN from an arbitrary digit run. - $out = $this->scrubber->scrub('id 1234567890 here'); - - expect($out) - ->toContain('[REDACTED:bsn]') - ->not->toContain('1234567890'); -}); - it('does not redact a digit run shorter than 9', function(): void { $out = $this->scrubber->scrub('order 12345678 placed');