Skip to content

fix: cap payload depth and improve serialized arg readability for Hawk#74

Open
pavelzotikov wants to merge 10 commits intomasterfrom
fix/stack-trace-serialization-globals-and-readability
Open

fix: cap payload depth and improve serialized arg readability for Hawk#74
pavelzotikov wants to merge 10 commits intomasterfrom
fix/stack-trace-serialization-globals-and-readability

Conversation

@pavelzotikov
Copy link
Copy Markdown
Contributor

@pavelzotikov pavelzotikov commented Apr 23, 2026

Summary

Hardens Hawk event and stacktrace payloads when frames carry deeply nested, self-referential, or bulky data (e.g. $GLOBALS-shaped arrays in arguments or extra frame fields). Stack arguments stay a list of strings name = serializedValue: serialized values are full JSON from Serializer (pretty-printed, circular keys become [circular]) so they stay json_decode-friendly in the Hawk UI; only parameter names are shortened. Argument count per frame is capped; there is no byte cap on the value side by design (trade-off vs. transport/API limits elsewhere).

What changed

  • Serializer::prepare — max nesting depth; circular array detection ([circular]); output via JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT.
  • EventPayloadBuilder::normalizeBacktrace — allowlisted frame keys; maps raw args → Hawk arguments without copying args into additionalData; transformForJson for extra keys uses depth limit + reference stack (same circular idea as Serializer); arguments list length capped.
  • StacktraceFrameBuilder — caps arguments per frame; formats lines with name-only truncation; no length truncation of the serialized value / lines without = pass through unchanged; removed redundant try/catch around formatting.
  • Tests — normalization, argument count limit, long prebuilt lines (full value preserved), deep additionalData, circular additionalData, serializer depth / circular / long-string behavior.

Harden stack-trace and event payload building when frames include deeply nested or self-referential data (e.g. $GLOBALS): cap recursion depth in Serializer::prepare and EventPayloadBuilder::transformForJson, limit arguments per frame and per-line size, and keep only safe frame keys. Improve readability of serialized argument values for the Hawk UI by using JSON_PRETTY_PRINT and inserting zero-width break opportunities in long string scalars (valid JSON, smaller horizontal overflow). Tests updated to compare structured JSON and to cover long-string soft breaks.
@pavelzotikov pavelzotikov requested a review from neSpecc as a code owner April 23, 2026 14:09
Comment thread src/EventPayloadBuilder.php Outdated
Comment thread src/Serializer.php Outdated
Comment thread src/StacktraceFrameBuilder.php Outdated
@neSpecc neSpecc requested a review from Copilot April 23, 2026 14:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Hawk event/stacktrace payload building against deeply nested and self-referential data while improving how serialized argument values display in the Hawk UI.

Changes:

  • Add max-depth caps to Serializer::prepare() and EventPayloadBuilder::transformForJson() to prevent runaway recursion.
  • Limit stacktrace argument count/line length and prevent raw args from being copied into additionalData.
  • Switch serializer output to JSON_PRETTY_PRINT and insert soft wrap opportunities into long scalar strings; update/add unit tests accordingly.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/Serializer.php Adds max-depth protection and soft-break insertion for long strings; uses JSON_PRETTY_PRINT.
src/EventPayloadBuilder.php Normalizes frames more defensively, formats/limits arguments, and depth-limits additional data transformation.
src/StacktraceFrameBuilder.php Adds argument count limiting and truncates long “name = value” lines.
tests/Unit/SerializerTest.php Updates assertions to compare decoded JSON structures and adds coverage for soft-break insertion + depth truncation.
tests/Unit/EventPayloadBuilderTest.php Adds targeted tests for backtrace normalization behavior (args handling, limits, truncation, deep additional data).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/Serializer.php Outdated
return implode($zwsp, $parts);
}

return implode($zwsp, str_split($value, $chunk));
Comment thread src/StacktraceFrameBuilder.php Outdated
Comment on lines 295 to 297
return substr($line, 0, self::MAX_ARGUMENT_LINE_BYTES - 3) . '...';
}

Comment thread src/EventPayloadBuilder.php Outdated
Comment on lines +269 to +273
if (strlen($line) <= $maxBytes) {
return $line;
}

return substr($line, 0, $maxBytes - 3) . '...';
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trade-off: Serialized argument values are intentionally not byte-capped so lines stay full valid JSON for tooling; size is mitigated by argument count, depth, and circular markers; stricter limits belong at the API/transport layer or an optional SDK setting.

Comment thread src/EventPayloadBuilder.php Outdated
Comment on lines +269 to +273
if (strlen($line) <= $maxBytes) {
return $line;
}

return substr($line, 0, $maxBytes - 3) . '...';
- Stop truncating the value side of `name = serializedValue`; only cap
  parameter names (MAX_ARGUMENT_NAME_BYTES).
- Prebuilt argument lines without ` = ` pass through untouched.
- Remove MAX_ARGUMENT_LINE_BYTES / MAX_ARGUMENT_VALUE_BYTES; update tests.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Hawk event/stacktrace payload generation against deeply nested and cyclic data structures, while reshaping stack argument formatting to better match Hawk’s expected arguments string-list contract.

Changes:

  • Add depth/cycle protection to Serializer::prepare() and switch serializer output to pretty-printed JSON.
  • Update EventPayloadBuilder::normalizeBacktrace() to map raw args into Hawk arguments strings and keep unknown frame fields in additionalData with depth limiting.
  • Extend StacktraceFrameBuilder with argument count limits and name-side truncation helpers; add tests for these behaviors.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/Serializer.php Adds max-depth and circular detection in prepare(), and uses JSON_PRETTY_PRINT output.
src/EventPayloadBuilder.php Normalizes backtraces by mapping argsarguments, skipping raw args in additionalData, and depth-limiting JSON transforms.
src/StacktraceFrameBuilder.php Caps argument count per frame and introduces helpers to truncate argument names / normalize prebuilt lines.
tests/Unit/SerializerTest.php Updates assertions to compare decoded JSON and adds tests for depth, circular arrays, and long scalar handling.
tests/Unit/EventPayloadBuilderTest.php Adds coverage for normalizeBacktrace() argument mapping, count limit, prebuilt-line handling, and deep additional data.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 283 to 287
if (is_array($value)) {
$result = [];
foreach ($value as $k => $v) {
$result[$k] = $this->transformForJson($v);
$result[$k] = $this->transformForJson($v, $depth + 1);
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transformForJson() only enforces a max depth; it does not detect circular/self-referential arrays. For values like $GLOBALS (or any array with a self-reference), this will repeatedly re-walk the same structure until the depth limit and can still produce a very large nested payload and high CPU. Consider adding circular reference detection (similar to Serializer::prepare()), returning a sentinel like [circular] when an ancestor is encountered.

Copilot uses AI. Check for mistakes.
Comment on lines +234 to +238
/**
* Build Hawk `arguments`: string list like "name = serializedValue" (from raw `args` via Serializer).
* Limits the number of lines ({@see StacktraceFrameBuilder::MAX_FRAME_ARGUMENTS}). Serialized values are not
* length-truncated; only param names are capped ({@see StacktraceFrameBuilder::formatTruncatedArgumentLine});
* prebuilt strings are split on the first `" = "` with {@see StacktraceFrameBuilder::truncatePrebuiltArgumentLine}.
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions capping/truncating each stack argument line by byte length, but buildArgumentsList() explicitly keeps serialized values unbounded and the tests preserve very long prebuilt argument lines. This can still produce oversized events. Either implement a per-line byte cap (for both prebuilt arguments and formatted args) or update the PR description/contract expectations to match the current behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +292 to +297
public static function formatTruncatedArgumentLine(string $name, string $serializedValue): string
{
$namePart = self::truncateUtf8StringToMaxBytes($name, self::MAX_ARGUMENT_NAME_BYTES);

return $namePart . ' = ' . $serializedValue;
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatTruncatedArgumentLine() only truncates the argument name; the serialized value (and therefore the full "name = value" line) can still be arbitrarily large (e.g., very long strings/arrays), which can defeat the goal of keeping payloads small. Consider enforcing an explicit max byte length for the entire line and applying it consistently to both formatted and prebuilt argument lines.

Copilot uses AI. Check for mistakes.
Comment on lines +301 to +308
* Lines without `" = "` are returned as-is (no length limit).
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return $line;
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

truncatePrebuiltArgumentLine() returns lines without the " = " delimiter unchanged (no length cap). That allows a single prebuilt argument entry to be extremely large and bypass any payload-size protections. Consider applying the same max-byte truncation to the whole line even when the delimiter is missing.

Suggested change
* Lines without `" = "` are returned as-is (no length limit).
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return $line;
* Lines without `" = "` are truncated as whole lines to the configured byte cap.
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return self::truncateUtf8StringToMaxBytes($line, self::MAX_ARGUMENT_NAME_BYTES);

Copilot uses AI. Check for mistakes.
Comment thread src/StacktraceFrameBuilder.php Outdated
Comment on lines 279 to 283
try {
$newArguments[] = sprintf('%s = %s', $name, $value);
$newArguments[] = self::formatTruncatedArgumentLine((string) $name, $value);
} catch (\Exception $e) {
// Ignore unknown types
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/catch around formatTruncatedArgumentLine() appears unnecessary: neither serializeValue() nor formatTruncatedArgumentLine() throws, so the catch block should be unreachable. Removing it would simplify the control flow (or, if an exception is expected, catch a specific exception from the actual throwing call).

Copilot uses AI. Check for mistakes.
…ry/catch in stack args

- Mirror Serializer-style reference stack for additionalData arrays ([circular]).
- Remove unreachable try/catch around formatTruncatedArgumentLine.
- Add test for self-referential custom frame data.
@neSpecc neSpecc requested a review from Copilot May 5, 2026 08:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/StacktraceFrameBuilder.php Outdated
Comment on lines +330 to +333
return substr($string, 0, $cutLength) . '...';
}

/**
Replace raw substr fallback with a valid-UTF-8 byte prefix so json_encode cannot fail on split codepoints; add StacktraceFrameBuilder tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants