
fix: improve aws and azure benchmark cases#11

Merged
yusufozturk merged 7 commits into main from fix/improve-aws-azure-bench
Jun 14, 2026

Conversation

@erenaslandev

@erenaslandev erenaslandev commented Jun 12, 2026

Member

Summary by CodeRabbit

  • New Features

    • Added MinIO emulator support, new MinIO correctness/performance cases, and added Filebeat and CRIBL Stream as measured subjects.
  • Tests

    • Added benchmark/config coverage for CRIBL Stream, Filebeat, Fluent Bit, Vector, Logstash and vmetric across Azure Blob, S3/MinIO, TCP, Kinesis, and CloudWatch.
    • Removed Fluent Bit from some cloud cases due to TLS/emulator limitations; enabled emulator-friendly Azure Blob behavior.
  • Chores

    • Updated subject metadata/capabilities, bumped vmetric, improved runner cleanup, and added a one-shot MinIO init.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds CRIBL and Filebeat Azure Blob configs and vmetric tuning; expands subject capabilities and case subject lists; introduces MinIO support (config types, cases, receiver minio-init, bucket creation); and hardens Docker Compose generation and startup behavior.

Changes

Benchmark & runtime updates

| Layer | File(s) | Summary |
|---|---|---|
| Subject registry and case activation | `internal/config/subject.go`, various `cases/*/case.yaml` | Update Registry entries (capabilities for filebeat/cribl-stream, vmetric version/entrypoint) and adjust subjects lists across multiple case.yaml files. |
| Azure Blob → TCP (cribl + filebeat) | `cases/azure_blob_to_tcp_performance/*`, `cases/azure_blob_to_tcp_performance/configs/cribl-stream/*`, `cases/azure_blob_to_tcp_performance/configs/filebeat.yml`, `cases/azure_blob_to_tcp_performance/configs/vmetric.yml` | Add CRIBL azblob input, passthru routing, TCP JSON output and messages file; add Filebeat azblob polling config sending to receiver; tune vmetric azblob pollers/workers. |
| TCP → Azure Blob correctness & performance (cribl-stream) | `cases/tcp_to_azure_blob_correctness/*`, `cases/tcp_to_azure_blob_performance/*`, `cases/*/configs/fluent-bit.conf`, `cases/*/configs/vmetric.yml` | Introduce CRIBL TCP input and Azure Blob outputs with passthru routing, outputs/staging/size/time limits; enable Fluent Bit emulator_mode for Azurite; add azblob queue parallelism for vmetric. |
| Fluent Bit LocalStack endpoints & vmetric queueing | `cases/tcp_to_cloudwatch_performance/configs/fluent-bit.conf`, `cases/tcp_to_kinesis_performance/configs/fluent-bit.conf`, `cases/tcp_to_s3_performance/configs/vmetric.yml`, `cases/tcp_to_s3_performance/configs/vmetric.yml`, `cases/tcp_to_s3_correctness/configs/vmetric.yml` | Switch Fluent Bit LocalStack/Kinesis outputs to endpoint `<host>` + port; document TLS/certificate caveats; add pollers/workers and queue.parallelism; change S3/vmetric payload format and gzip/part settings. |
| MinIO cases & per-subject configs | `cases/tcp_to_minio_*`, `cases/tcp_to_minio_performance/*`, `cases/tcp_to_minio_performance_paced/*` | Add new MinIO-backed correctness and performance cases and per-subject configs (Fluent Bit, Logstash, Vector, vmetric) to run TCP→S3 via MinIO with appropriate formatting/compression/queueing and drain settings. |
| Receiver minio-init and internal MinIO support | `containers/receiver/*`, `internal/config/cloud.go`, `internal/config/case.go`, `internal/config/cloud_test.go` | Add minio TestCase field and MinioConfig type/constants, validate mutual exclusivity with `aws:`, implement receiver minio_init one-shot that creates buckets with retries, and add tests for MinioConfig defaults and updated validation errors. |
| Docker orchestrator reliability improvements | `internal/orchestrator/docker.go`, `internal/orchestrator/awsinit.go` | Render MinIO and minio-init compose services and wire depends_on gating, compute and merge MinIO env into compose templates, pin LocalStack env and ulimits, and remove stale bench-* containers before docker compose up. |
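
The mutual-exclusivity rule between `minio` and `aws:` mentioned in the table can be pictured with a minimal sketch. `TestCase` and `MinioConfig` are named in the walkthrough, but every field, the `AWSConfig` type, and the `Validate` method below are assumptions made for illustration, not the actual definitions in `internal/config`.

```go
package main

import (
	"errors"
	"fmt"
)

// AWSConfig and MinioConfig are hypothetical stand-ins for the real config types.
type AWSConfig struct{ Region string }

type MinioConfig struct{ Buckets []string }

// TestCase carries at most one object-store emulator block in this sketch.
type TestCase struct {
	Name  string
	AWS   *AWSConfig
	Minio *MinioConfig
}

var errBothEmulators = errors.New("aws and minio are mutually exclusive")

// Validate rejects a case that configures both emulator blocks at once.
func (tc TestCase) Validate() error {
	if tc.AWS != nil && tc.Minio != nil {
		return fmt.Errorf("case %q: %w", tc.Name, errBothEmulators)
	}
	return nil
}

func main() {
	bad := TestCase{Name: "demo", AWS: &AWSConfig{}, Minio: &MinioConfig{}}
	fmt.Println(bad.Validate())
}
```

Using pointer fields lets an absent block be told apart from an empty one, which is one plausible way to detect that both were set.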

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • VirtualMetric/PipeBench#8: Related work on emulator/cloud-environment plumbing and compose/cloud config handling that this PR extends for MinIO.

Suggested reviewers

  • namles
  • yusufozturk

Poem

🐰 I dug the configs, tuned the streams,
Cribl and Filebeat joined the teams,
MinIO wakes, the buckets hum,
Compose clears paths so tests can run,
A rabbit hops — the benchmarks drum.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 41.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix: improve aws and azure benchmark cases' accurately reflects the primary changes in the PR, which include adding MinIO/S3 support and Azure Blob enhancements across multiple benchmark cases. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cases/tcp_to_cloudwatch_performance/configs/fluent-bit.conf`:
- Around line 17-23: Fluent-bit 5.0's aws_client enforces TLS verification and
blocks LocalStack in the fluent-bit.conf used by the
tcp_to_cloudwatch_performance and tcp_to_kinesis_performance cases; update the
test/benchmark configuration to exclude or skip fluent-bit 5.0 for these cases
(e.g., add a conditional skip in the benchmark matrix or CI job that selects
clients), add an inline comment in fluent-bit.conf referencing the upstream
issue/PR URL and a short explanation, and/or remove the failing fluent-bit.conf
from the active test matrix until upstream provides an insecure-TLS option so
the cases no longer fail the CI.
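
One way to express the suggested per-case skip is a small lookup consulted when a case's subject list is built. This is only a sketch: the `skipped` table and the `activeSubjects` helper are hypothetical and do not reflect how the runner actually selects subjects.

```go
package main

import "fmt"

// skipped maps a case name to the subjects excluded from it.
// The entries mirror the two cases named in the comment above; the
// structure itself is an assumption for illustration.
var skipped = map[string]map[string]bool{
	"tcp_to_cloudwatch_performance": {"fluent-bit": true},
	"tcp_to_kinesis_performance":    {"fluent-bit": true},
}

// activeSubjects returns the subjects that should run for a case,
// preserving their original order.
func activeSubjects(caseName string, subjects []string) []string {
	out := make([]string, 0, len(subjects))
	for _, s := range subjects {
		if skipped[caseName][s] {
			continue // excluded for this case only
		}
		out = append(out, s)
	}
	return out
}

func main() {
	all := []string{"vmetric", "fluent-bit", "vector"}
	fmt.Println(activeSubjects("tcp_to_kinesis_performance", all))
	fmt.Println(activeSubjects("tcp_to_s3_performance", all))
}
```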

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d936d89-5782-4861-8a47-3a1b726418fd

📥 Commits

Reviewing files that changed from the base of the PR and between 746a987 and c716b77.

📒 Files selected for processing (32)
  • cases/azure_blob_to_tcp_performance/case.yaml
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/cribl.inited
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/cribl.yml
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/inputs.yml
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/messages.yml
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/outputs.yml
  • cases/azure_blob_to_tcp_performance/configs/cribl-stream/pipelines/route.yml
  • cases/azure_blob_to_tcp_performance/configs/filebeat.yml
  • cases/azure_blob_to_tcp_performance/configs/vmetric.yml
  • cases/s3_to_tcp_performance/configs/vmetric.yml
  • cases/tcp_to_azure_blob_correctness/case.yaml
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/cribl.inited
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/cribl.yml
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/inputs.yml
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/messages.yml
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/outputs.yml
  • cases/tcp_to_azure_blob_correctness/configs/cribl-stream/pipelines/route.yml
  • cases/tcp_to_azure_blob_correctness/configs/fluent-bit.conf
  • cases/tcp_to_azure_blob_performance/case.yaml
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/cribl.inited
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/cribl.yml
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/inputs.yml
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/messages.yml
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/outputs.yml
  • cases/tcp_to_azure_blob_performance/configs/cribl-stream/pipelines/route.yml
  • cases/tcp_to_azure_blob_performance/configs/fluent-bit.conf
  • cases/tcp_to_azure_blob_performance/configs/vmetric.yml
  • cases/tcp_to_cloudwatch_performance/configs/fluent-bit.conf
  • cases/tcp_to_kinesis_performance/configs/fluent-bit.conf
  • cases/tcp_to_s3_performance/configs/vmetric.yml
  • internal/config/subject.go
  • internal/orchestrator/docker.go

Comment thread cases/tcp_to_cloudwatch_performance/configs/fluent-bit.conf
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 12, 2026


Deploying pipebench with Cloudflare Pages

Latest commit: 9d95c0a
Status: ✅  Deploy successful!
Preview URL: https://53eda6dd.pipebench.pages.dev
Branch Preview URL: https://fix-improve-aws-azure-bench.pipebench.pages.dev


@erenaslandev

Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@erenaslandev erenaslandev changed the title from "fix: improve awsand azure benchmark cases" to "fix: improve aws and azure benchmark cases" Jun 12, 2026
@erenaslandev

Member Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/orchestrator/docker.go (1)

679-690: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not force-remove running bench-* containers during stale cleanup.

Line [680] collects all matching containers, and Line [690] force-removes them. That includes running containers, which can abruptly kill active runs/debug sessions.

Suggested patch
-	out, err := exec.Command("docker", "ps", "-aq", "--filter", "name=^bench-").Output()
+	out, err := exec.Command(
+		"docker", "ps", "-aq",
+		"--filter", "name=^bench-",
+		"--filter", "status=created",
+		"--filter", "status=exited",
+		"--filter", "status=dead",
+	).Output()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/docker.go` around lines 679 - 690, The cleanup
currently force-removes all containers named bench-* in
removeStaleBenchContainers (using docker ps -aq and docker rm -f), which can
kill running containers; change the docker ps invocation to only list
non-running containers (e.g., add --filter "status=exited" and optionally
--filter "status=created" and --filter "status=dead") and remove the force flag
so removal uses docker rm (replace args := append([]string{"rm", "-f"}, ids...)
with args := append([]string{"rm"}, ids...)); keep the existing early returns
and error handling but ensure you only target stopped/exited/created/dead
bench-* containers to avoid killing active runs.
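
Put together, the change described above might look like the following sketch. The function name `removeStaleBenchContainers` comes from the comment; its signature, the `staleBenchPsArgs` helper, and the error wrapping are assumptions, and the real function in `internal/orchestrator/docker.go` may be shaped differently.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// staleBenchPsArgs builds `docker ps` arguments that match only
// non-running bench-* containers. Repeated status filters are ORed by Docker.
func staleBenchPsArgs() []string {
	args := []string{"ps", "-aq", "--filter", "name=^bench-"}
	for _, status := range []string{"created", "exited", "dead"} {
		args = append(args, "--filter", "status="+status)
	}
	return args
}

// removeStaleBenchContainers removes stopped bench-* containers.
// It uses plain `docker rm` (no -f), so a running container is never killed.
func removeStaleBenchContainers() error {
	out, err := exec.Command("docker", staleBenchPsArgs()...).Output()
	if err != nil {
		return fmt.Errorf("list stale containers: %w", err)
	}
	ids := strings.Fields(string(out))
	if len(ids) == 0 {
		return nil // nothing to clean up
	}
	return exec.Command("docker", append([]string{"rm"}, ids...)...).Run()
}

func main() {
	// Print the listing command only; the demo does not touch Docker.
	fmt.Println("docker", strings.Join(staleBenchPsArgs(), " "))
}
```

Keeping the argument list in its own function makes the filter set checkable without a Docker daemon.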
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cases/tcp_to_minio_correctness/configs/vector.toml`:
- Around line 1-2: The top-of-file header comment currently reads
"tcp_to_minio_performance" but this file is for the tcp_to_minio_correctness
case; update the comment string in the first comment block to
"tcp_to_minio_correctness" so the file header accurately reflects the case name
(edit the header comment in configs/vector.toml accordingly).

In `@containers/receiver/minio.go`:
- Around line 25-27: The current check uses strings.TrimSpace(bucketsEnv) which
allows inputs like "," or " , " to pass but results in zero usable bucket names
later; change the validation to split bucketsEnv by ',' then build a filtered
slice (e.g., iterate over strings.Split(bucketsEnv, ",") and for each entry run
strings.TrimSpace and append only non-empty names to validBuckets), then if
len(validBuckets) == 0 log an error (same message "minio-init:
MINIO_INIT_BUCKETS is required" or similar) and os.Exit(1); update the later
code that iterates over buckets to use this filtered validBuckets slice so the
code never continues successfully when there are no usable bucket names.
- Around line 42-57: The retry budget and request timeouts are wrong: move the
per-bucket deadline inside the loop that iterates buckets so each bucket gets
its own 2-minute retry window (i.e., compute deadline :=
time.Now().Add(2*time.Minute) for each name in the strings.SplitSeq(bucketsEnv,
",") loop), and add a per-request context timeout when calling
client.CreateBucket by creating a child context (e.g., reqCtx, cancel :=
context.WithTimeout(ctx, <shortDuration>) and defer cancel() for each attempt)
and pass reqCtx to client.CreateBucket; keep using bucketExists to decide
success and retain the same retry/sleep logic but compare time.Now() to the
per-bucket deadline.
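
Both `minio.go` findings can be sketched together. Everything below is illustrative: `parseBuckets`, `createFunc`, `createWithRetry`, and `demo` are hypothetical names, the function type stands in for whichever client call the receiver really makes, and the `bucketExists` check from the original code is left out for brevity. The sketch calls `cancel()` right after each attempt instead of deferring it, since a `defer` inside the retry loop would not run until the function returns.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"
)

// parseBuckets splits a comma-separated list and drops empty entries,
// so inputs such as "," or " , " yield an empty slice.
func parseBuckets(env string) []string {
	var valid []string
	for _, name := range strings.Split(env, ",") {
		if name = strings.TrimSpace(name); name != "" {
			valid = append(valid, name)
		}
	}
	return valid
}

// createFunc is a hypothetical stand-in for the client's CreateBucket call.
type createFunc func(ctx context.Context, bucket string) error

// createWithRetry gives one bucket its own retry window and bounds every
// attempt with a per-request timeout.
func createWithRetry(ctx context.Context, create createFunc, bucket string, window, perTry, pause time.Duration) error {
	deadline := time.Now().Add(window) // computed per bucket, not once for all
	for {
		reqCtx, cancel := context.WithTimeout(ctx, perTry)
		err := create(reqCtx, bucket)
		cancel() // release the timer now rather than deferring in a loop
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("create bucket %q: %w", bucket, err)
		}
		time.Sleep(pause)
	}
}

// demo runs createWithRetry against a fake that succeeds on the third attempt.
func demo() (attempts int, err error) {
	flaky := func(ctx context.Context, bucket string) error {
		attempts++
		if attempts < 3 {
			return errors.New("not ready")
		}
		return nil
	}
	err = createWithRetry(context.Background(), flaky, "logs", 2*time.Second, time.Second, 10*time.Millisecond)
	return attempts, err
}

func main() {
	fmt.Println(parseBuckets(" logs, ,archive "))
	fmt.Println(demo())
}
```

A caller would exit with an error when `parseBuckets` returns an empty slice, then invoke `createWithRetry` once per remaining name.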

In `@internal/config/case.go`:
- Around line 331-336: The reserved endpoint-name set in the reserved variable
(in internal/config/case.go) must include Kafka service names to prevent
endpoint/composer collisions; update the reserved map initializer to add
"redpanda" and "redpanda-init" (and optionally other Kafka synonyms like
"kafka"/"kafka-init" if you want broader protection) so endpoint validation will
reject those names the same way it rejects "minio"/"minio-init" and cloud
emulator names.
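
A reserved-name set of this kind is commonly modeled as a `map[string]struct{}`. The sketch below is an assumption about shape only: `validateEndpointName` is a hypothetical helper, and the `localstack` and `azurite` entries are guessed service names standing in for the "cloud emulator names" mentioned above.

```go
package main

import "fmt"

// reserved lists compose service names that an endpoint may not reuse.
// minio, minio-init, redpanda, and redpanda-init come from the review
// comment; localstack and azurite are assumed names for illustration.
var reserved = map[string]struct{}{
	"localstack":    {},
	"azurite":       {},
	"minio":         {},
	"minio-init":    {},
	"redpanda":      {},
	"redpanda-init": {},
}

// validateEndpointName rejects names that would collide with a reserved service.
func validateEndpointName(name string) error {
	if _, taken := reserved[name]; taken {
		return fmt.Errorf("endpoint name %q is reserved", name)
	}
	return nil
}

func main() {
	fmt.Println(validateEndpointName("redpanda"))
	fmt.Println(validateEndpointName("receiver-2"))
}
```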


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7504c5d-84cb-43ef-8072-746e84a00882

📥 Commits

Reviewing files that changed from the base of the PR and between c8043c5 and 8f0583f.

📒 Files selected for processing (22)
  • cases/tcp_to_minio_correctness/case.yaml
  • cases/tcp_to_minio_correctness/configs/fluent-bit.conf
  • cases/tcp_to_minio_correctness/configs/logstash.conf
  • cases/tcp_to_minio_correctness/configs/vector.toml
  • cases/tcp_to_minio_correctness/configs/vmetric.yml
  • cases/tcp_to_minio_performance/case.yaml
  • cases/tcp_to_minio_performance/configs/fluent-bit.conf
  • cases/tcp_to_minio_performance/configs/logstash.conf
  • cases/tcp_to_minio_performance/configs/vector.toml
  • cases/tcp_to_minio_performance/configs/vmetric.yml
  • cases/tcp_to_minio_performance_paced/case.yaml
  • cases/tcp_to_minio_performance_paced/configs/fluent-bit.conf
  • cases/tcp_to_minio_performance_paced/configs/logstash.conf
  • cases/tcp_to_minio_performance_paced/configs/vector.toml
  • cases/tcp_to_minio_performance_paced/configs/vmetric.yml
  • containers/receiver/main.go
  • containers/receiver/minio.go
  • internal/config/case.go
  • internal/config/cloud.go
  • internal/config/cloud_test.go
  • internal/orchestrator/awsinit.go
  • internal/orchestrator/docker.go
✅ Files skipped from review due to trivial changes (9)
  • cases/tcp_to_minio_performance/configs/vector.toml
  • cases/tcp_to_minio_performance_paced/configs/vector.toml
  • cases/tcp_to_minio_performance/configs/logstash.conf
  • cases/tcp_to_minio_performance_paced/configs/fluent-bit.conf
  • cases/tcp_to_minio_correctness/case.yaml
  • cases/tcp_to_minio_performance/case.yaml
  • cases/tcp_to_minio_performance_paced/case.yaml
  • cases/tcp_to_minio_performance_paced/configs/vmetric.yml
  • cases/tcp_to_minio_correctness/configs/logstash.conf

Comment thread cases/tcp_to_minio_correctness/configs/vector.toml Outdated
Comment thread containers/receiver/minio.go Outdated
Comment thread containers/receiver/minio.go Outdated
Comment thread internal/config/case.go
@yusufozturk yusufozturk merged commit 16b8738 into main Jun 14, 2026
5 checks passed
@yusufozturk yusufozturk deleted the fix/improve-aws-azure-bench branch June 14, 2026 23:24