A tenant-aware quota and rate-limiting backend for SaaS APIs. Three independently deployable .NET services enforce plan limits in real time, persist usage events for billing and audit, and support multi-level tenant hierarchies (root, reseller, customer).
V1: Backend services with HTTP and Kafka APIs. OpenAPI/Scalar is the primary surface. V2 (in development): ASP.NET Core BFF + Angular/TypeScript admin frontend — see Scope V2 (Deutsch) and the V2 issues & milestones.
- Real-time enforcement: a sub-10 ms decision API that answers "is this API key allowed to make this call?"
- Multi-strategy counters — fixed window for monthly quotas, token bucket (Redis Lua) for short-term rate limits
- Tenant hierarchy — resellers manage sub-tenants; plan inheritance and constraint validation across the tree
- Resource-based authorization — every operation scoped to the caller's tenant subtree
- Cache invalidation under cascade — hierarchy changes invalidate plan resolutions, hierarchy caches, and authorization decisions across an entire sub-tree (tag-based via Redis sets)
- Event-driven persistence — usage events flow through Kafka for at-least-once delivery to the Usage service
- Plan lifecycle operations — mid-period plan migration, API key rotation with grace period, immediate revocation with cache eviction
- Idempotent usage ingest — event-ID dedup window in the Usage service tolerates Kafka redelivery
- Hot-path latency benchmarks — BenchmarkDotNet measures p50/p95/p100 for cache hit/miss and sustained throughput; measured p95 on cache hit: 109 µs (91× below 10 ms target); per-request latency at 50 concurrent: 4 µs (details)
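The token-bucket strategy named in the list above can be sketched in a few lines. This is an in-memory illustration only: the project keeps bucket state in Redis and mutates it with a Lua script, which is not reproduced here. The class name, field names, and the capacity and refill values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """In-memory token bucket; all parameter values are illustrative."""
    capacity: float          # maximum burst size
    refill_per_sec: float    # sustained rate
    tokens: float            # current fill level
    last_refill: float       # timestamp in seconds, supplied by the caller

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True      # request allowed
        return False         # bucket empty: request denied


# Two tokens of burst, one token per second of refill.
bucket = TokenBucket(capacity=2, refill_per_sec=1, tokens=2, last_refill=0.0)
decisions = [bucket.try_consume(now=t) for t in (0.0, 0.0, 0.0, 1.0)]
# decisions == [True, True, False, True]
```

Passing the clock in as an argument keeps the refill arithmetic deterministic and easy to test; a Redis-side script would need the same read, refill, and decrement to happen atomically.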
Three services, three PostgreSQL databases, one Redis, one Redpanda cluster (Kafka API), one Keycloak realm. Orchestrated locally via Docker Compose.
            ┌──────────────────────────┐
            │   API Consumer (ext.)    │
            └────────────┬─────────────┘
                         │ API Key
                         ▼
            ┌──────────────────────────┐
            │ Enforcement Service      │ ◄── Redis (counters, plan cache, sessions)
            │ - Hot-path check API     │
            │ - Token bucket / win.    │
            └─────┬──────────────┬─────┘
                  │              │
             HTTP │              │ Kafka: usage.events
      (cache miss)│              │
                  ▼              ▼
        ┌──────────────┐  ┌──────────────────────────┐
        │ Plans        │  │ Usage Service            │
        │ Service      │  │ - Event persistence      │
        │              │  │ - Aggregation worker     │
        └─────┬────────┘  │ - Reports API (admin)    │
              │           └──────────────────────────┘
              │
        Kafka: plans.changes (broadcast)
              │
              └─► Enforcement (cache invalidation)
              └─► Usage (denormalized lookups)
| Service | Responsibility | Profile |
|---|---|---|
| Plans | Tenant hierarchy, plans, plan assignments, API keys, lifecycle operations | Low traffic, write-rare, configuration-heavy |
| Enforcement | Hot-path check API, counter and rate-limit state in Redis | Latency-critical, read-heavy, horizontally scalable |
| Usage | Event persistence, aggregation, reports API | Write-heavy on ingest, query-heavy on reports, append-only |
Each service owns its database. Synchronous communication only on the cache-miss path (Enforcement → Plans). Everything else flows asynchronously via Kafka. Each service follows Clean Architecture internally (Domain, Application, Infrastructure, API) — the domain logic in each service (hierarchy invariants, counter semantics, aggregation rules) justifies layered architecture over a slice-based approach.
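The single synchronous hop described above (Enforcement calling Plans only on a cache miss) follows a read-through pattern. The sketch below shows that pattern in memory, assuming a plain TTL; `PlanCache`, `fetch_plan`, and the returned dictionary shape are hypothetical names for illustration, and the real cache lives in Redis.

```python
import time
from typing import Callable, Dict, Tuple


class PlanCache:
    """Read-through cache; fetch_plan stands in for the HTTP call to Plans."""

    def __init__(self, fetch_plan: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_plan = fetch_plan
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def resolve(self, api_key_hash: str) -> dict:
        now = self._clock()
        entry = self._entries.get(api_key_hash)
        if entry is not None and entry[0] > now:
            return entry[1]                       # cache hit: no synchronous call
        plan = self._fetch_plan(api_key_hash)     # cache miss: the only sync hop
        self._entries[api_key_hash] = (now + self._ttl, plan)
        return plan

    def invalidate(self, api_key_hash: str) -> None:
        # In the described design this would be driven by a plans.changes event.
        self._entries.pop(api_key_hash, None)


calls = []

def fake_plans_service(key_hash: str) -> dict:
    calls.append(key_hash)                        # record each simulated HTTP call
    return {"plan": "demo", "monthly_quota": 1000}

cache = PlanCache(fake_plans_service, ttl_seconds=60, clock=lambda: 0.0)
cache.resolve("k1")
cache.resolve("k1")        # served from cache
cache.invalidate("k1")
cache.resolve("k1")        # fetched again after invalidation
```

Only the first and third `resolve` calls reach the stand-in Plans service, which is the property that keeps the hot path free of synchronous dependencies in the common case.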
| Caller | Mechanism | Surface |
|---|---|---|
| External API consumer | API key in header | Enforcement check API |
| Tenant admin | OIDC (Keycloak) → JWT with refresh | Plans admin API, Usage reports API |
| Service-to-service | Internal JWT (or mTLS — see ADR-009) | Enforcement → Plans on cache miss |
Policy-based: role gates (key rotation, reseller creation, plan assignment).
Resource-based: custom IAuthorizationHandler resolves the caller's tenant subtree at request time and validates resource ownership.
Rate limiting per tenant: the same machinery that enforces customer-facing limits also protects the admin APIs.
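The resource-based check above reduces to one question: does the resource's tenant sit inside the caller's subtree? A minimal sketch, assuming the adjacency-list model from ADR-001 and a bounded depth; the tenant IDs, the `PARENT_OF` map, and the depth limit of 16 are invented for the example, and the actual handler is a C# IAuthorizationHandler that is not shown here.

```python
from typing import Dict, Optional

# child tenant id -> parent tenant id (None for the root); ids are made up
PARENT_OF: Dict[str, Optional[str]] = {
    "root": None,
    "acme": "root",
    "beta": "acme",
    "other": "root",
}


def is_in_subtree(caller_tenant: str, resource_tenant: str,
                  parent_of: Dict[str, Optional[str]], max_depth: int = 16) -> bool:
    """True if resource_tenant is the caller's tenant or one of its descendants."""
    current: Optional[str] = resource_tenant
    for _ in range(max_depth):           # bounded walk also guards against cycles
        if current is None:
            return False                 # walked past the root without a match
        if current == caller_tenant:
            return True
        current = parent_of.get(current)
    return False


assert is_in_subtree("acme", "beta", PARENT_OF)        # reseller may act on its customer
assert not is_in_subtree("acme", "other", PARENT_OF)   # sibling branch is out of scope
```

Walking upward from the resource keeps the check proportional to tree depth rather than subtree size, which is one reason a bounded-depth hierarchy is convenient for this kind of authorization.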
| Cache | Purpose | Invalidation |
|---|---|---|
| Plan resolution (Enforcement) | API key → tenant + effective plan + limits | TTL + Pub/Sub on plans.changes |
| Quota counters | Fixed-window INCR per tenant per period | TTL aligned to window expiry |
| Rate-limit buckets | Token bucket state per tenant | Atomic mutation via Lua script |
| Tenant hierarchy | Resolved parent chains | Tag-based cascade on TenantHierarchyChanged |
| Session store | Admin authentication | Standard session TTL |
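For the quota-counter row in the table above, "TTL aligned to window expiry" means the counter key names its period and expires when that period ends. The sketch below computes both, assuming calendar-month windows in UTC; the `quota:{tenant}:{period}` key layout is a made-up convention, not the project's actual naming.

```python
from datetime import datetime, timezone
from typing import Tuple


def monthly_window(now: datetime) -> Tuple[str, int]:
    """Return (period label, seconds until the window ends) for a UTC calendar month."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        next_start = datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        next_start = datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)
    ttl = int((next_start - now).total_seconds())
    return f"{now.year:04d}-{now.month:02d}", ttl


def counter_key(tenant_id: str, now: datetime) -> Tuple[str, int]:
    """Key to increment and the TTL to set when the key is first created."""
    period, ttl = monthly_window(now)
    return f"quota:{tenant_id}:{period}", ttl   # hypothetical key layout


key, ttl = counter_key("beta", datetime(2026, 5, 31, 23, 59, 0, tzinfo=timezone.utc))
# key == "quota:beta:2026-05", ttl == 60
```

Because the period is part of the key, a new window starts with a fresh counter and the old one disappears on its own, so no reset job is needed.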
The non-trivial case is hierarchy cascade: a single TenantHierarchyChanged event invalidates plan resolutions, hierarchy caches, and authorization decisions across an entire sub-tree. See ADR-003 for the tag-based eviction design.
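The cascade can be illustrated with a small in-memory model of tag-based eviction. It assumes each cached entry is tagged with every tenant on its ancestor chain, so a change at one node evicts everything beneath it; whether ADR-003 tags entries this way is not stated here, and the class, key, and tag names are hypothetical. In the project the tag sets are Redis sets.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


class TaggedCache:
    """In-memory stand-in for tag-based eviction (tags play the role of Redis sets)."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, key: str, value: Any, tags: Iterable[str]) -> None:
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Any:
        return self._values.get(key)

    def invalidate_tag(self, tag: str) -> int:
        """Evict every key registered under the tag; returns the number evicted."""
        evicted = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._values:
                del self._values[key]
                evicted += 1
        # Other tags may still list the evicted keys; they are skipped on later passes.
        return evicted


cache = TaggedCache()
chain = ["tenant:root", "tenant:acme", "tenant:beta"]     # ancestor chain of "beta"
cache.put("plan:beta", {"plan": "demo"}, tags=chain)
cache.put("authz:beta:user1", True, tags=chain)
cache.put("plan:other", {"plan": "basic"}, tags=["tenant:root", "tenant:other"])

evicted = cache.invalidate_tag("tenant:acme")             # hierarchy change at "acme"
```

One invalidation call removes both the plan resolution and the authorization decision under the changed node, while entries in unrelated branches stay cached.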
| Concern | Technology |
|---|---|
| Runtime | .NET 10 |
| API | ASP.NET Core Minimal API |
| ORM | EF Core 10, Npgsql |
| Background jobs | .NET Worker Host (BackgroundService) |
| Message broker | Redpanda (Kafka API) |
| Cache / counters / sessions | Redis 8 |
| Auth provider | Keycloak (OIDC) |
| Request dispatch | No mediator — application services injected directly (ADR-005) |
| Logging | Microsoft.Extensions.Logging (structured) |
| Tracing | OpenTelemetry → Jaeger |
| Testing | xUnit, Shouldly, NSubstitute, Testcontainers, BenchmarkDotNet |
| API docs | Microsoft.AspNetCore.OpenApi + Scalar UI (/scalar) |
| Containers | Docker, Docker Compose |
| # | Topic |
|---|---|
| ADR-001 | Tenant hierarchy: adjacency list with bounded depth |
| ADR-002 | Plan inheritance and constraint validation |
| ADR-003 | Cache invalidation: TTL vs. Pub/Sub eviction vs. tag-based cascades |
| ADR-004 | Counter strategy: fixed window plus token bucket |
| ADR-005 | In-house mediator abstraction over MediatR |
| ADR-006 | Service communication: sync HTTP+cache vs. event-driven vs. shared database |
| ADR-007 | Kafka topic design and partitioning |
| ADR-008 | Idempotency for usage events (at-least-once delivery) |
| ADR-009 | Service-to-service authentication: mTLS vs. internal JWT |
All ADRs are in docs/adrs/.
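As a companion to ADR-008, the idea behind an event-ID dedup window can be shown in a few lines. This is a sketch under assumptions: it bounds the window by entry count and keeps it in memory, whereas the Usage service's actual window (size, time bound, storage) is defined in the ADR and not reproduced here. `DedupWindow` and `first_time` are illustrative names.

```python
from collections import OrderedDict


class DedupWindow:
    """Remembers the most recent event IDs; max_size is an illustrative bound."""

    def __init__(self, max_size: int) -> None:
        self._max_size = max_size
        self._seen: "OrderedDict[str, None]" = OrderedDict()

    def first_time(self, event_id: str) -> bool:
        if event_id in self._seen:
            return False                       # redelivery: skip side effects
        self._seen[event_id] = None
        if len(self._seen) > self._max_size:
            self._seen.popitem(last=False)     # forget the oldest remembered ID
        return True


window = DedupWindow(max_size=1000)
delivered = ["e1", "e2", "e1", "e3", "e2"]     # e1 and e2 arrive twice
persisted = [e for e in delivered if window.first_time(e)]
# persisted == ["e1", "e2", "e3"]
```

A bounded window only tolerates redelivery that arrives while the ID is still remembered, so the bound has to exceed the longest realistic redelivery delay.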
- API gateway / reverse proxy behavior — Enforcement is a check service, not a proxy
- Real billing integration — overage events are produced and persisted; no payment provider wired up
- Multi-region Redis or Kafka replication
- Schema registry for Kafka events — JSON, versioning by convention
- Webhook notifications — reports are pulled
- Full observability stack — OpenTelemetry to Jaeger only (Prometheus/Grafana already demonstrated in Ingestor)
- Admin frontend — moved to V2, now in development (BFF + Angular/TypeScript); see Scope V2
Prerequisites: .NET 10 SDK and Docker
cp .env.example .env # required before the first up
docker compose up -d # builds and starts the full stack (infra + all 3 services)

docker compose up -d builds each service image from its Dockerfile and starts the whole stack. Host ports are the .env defaults below. To run a single service from source against the Compose infrastructure instead, use e.g. dotnet run --project src/Plans/Plans.Api (it listens on :5000, matching the BFF's PlansClient:BaseUrl).
In Development, Plans.Api applies EF migrations and seeds a small realm-aligned tenant
hierarchy (Root → Acme → Beta, with the fixed tenant_ids the seeded reseller-acme /
customer-beta users carry) on startup — so a fresh stack is demonstrable without a manual
dotnet ef database update or SQL seed. Production applies migrations as an explicit, reviewed
deploy step and never seeds.
| Surface | URL |
|---|---|
| Plans API docs (Scalar) | http://localhost:5000/scalar |
| Enforcement API docs (Scalar) | http://localhost:5001/scalar |
| Usage API docs (Scalar) | http://localhost:5002/scalar |
| Keycloak admin | http://localhost:8080 |
| Jaeger UI | http://localhost:16686 |
A seeded demo realm provides one root admin, one reseller, and two customer tenants for end-to-end manual testing.
In Development, the Plans seed also provisions a demo plan assigned to the Beta customer tenant and
a demo API key for it, so the Enforcement check path works out of the box (it lives in the Plans DB,
not the Keycloak realm — API keys are a Plans domain concept). Only the key's SHA-256 hash is stored, as
for any real key. Send it in the X-Api-Key header:
curl -X POST http://localhost:5001/check -H "X-Api-Key: demo-metricgate-key"
# → {"decision":"allow", ... ,"plan":{"name":"Demo Plan", ...}}

Dev-only. demo-metricgate-key is a known, seeded credential and is never created outside Development; production neither seeds nor ships it.
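Storing only the SHA-256 hash of a key means lookups hash the presented value and compare digests. A minimal sketch of that idea follows; the hex encoding, the absence of a salt or pepper, and the function names are assumptions for illustration, not the Plans service's actual storage format.

```python
import hashlib
import hmac


def hash_api_key(raw_key: str) -> str:
    """Hex-encoded SHA-256 of the raw key (encoding is an assumption)."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_hash: str) -> bool:
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


stored = hash_api_key("demo-metricgate-key")   # what the database would hold
assert key_matches("demo-metricgate-key", stored)
assert not key_matches("some-other-key", stored)
```

The raw key never needs to be persisted: a leaked table of hashes does not directly yield usable credentials, provided keys are long and random.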
The V1 stack above runs without any of this. The steps here are only needed to exercise the V2 admin frontend and its login flow — the BFF, Keycloak OIDC, and the Angular client. They are one-time local setup for a reviewer; after the first run they are not repeated.
The BFF holds the OIDC tokens server-side and issues the browser an httpOnly session cookie (ADR-010). Two consequences of that design require local setup: the session cookies are Secure, so the BFF must be reached over HTTPS even in development; and the OIDC issuer must be identical as seen by the browser and by the BFF container, or token validation fails on an iss mismatch.
One-time setup:
| # | Step | Why |
|---|---|---|
| 1 | Export and trust a dev cert the container can load (commands below) | The BFF serves HTTPS on :7003; without a trusted cert the browser rejects it and the Secure session cookies never persist |
| 2 | Add 127.0.0.1 keycloak to your hosts file (/etc/hosts, or C:\Windows\System32\drivers\etc\hosts) | The browser and the BFF container must resolve Keycloak under the same name (keycloak:8080) so the token's iss matches what the BFF validates |
Step 1 — export the dev cert to the mounted path, trust it, and make it readable by the container's non-root user:
dotnet dev-certs https -ep ~/.aspnet/https/aspnetapp.pfx -p devcert
dotnet dev-certs https --trust
chmod 644 ~/.aspnet/https/aspnetapp.pfx

In your .env, set BFF_HTTPS_CERT_PASSWORD=devcert (matching the password above) and BFF_PORT=7003 (it must match the realm's redirect origin; see .env.example).
Then:
docker compose --profile v2 up -d # starts the V1 stack plus the BFF

| Surface | URL |
|---|---|
| Admin frontend (via BFF) | https://localhost:7003 |
| BFF health | https://localhost:7003/health |
Verify the login round-trip: open https://localhost:7003, log in as the seeded reseller against Keycloak, and confirm you land back in the admin shell authenticated. If the browser returns to the login page instead, the session cookie did not survive — check that step 1's cert is trusted and that you reached the BFF over https, not http.
Two address worlds, kept separate. Host ports (:7003 BFF, :5000–:5002 services, :8080 Keycloak) are for the browser and for running a service from source via dotnet run. Service names (plans-api:8080, keycloak:8080) are for container-to-container traffic inside Compose. The BFF's external origin (:7003) governs cookie scope and the Keycloak redirect URIs; the Keycloak issuer (keycloak:8080) governs token validation. These are different addresses for different purposes; do not collapse them.
| Project | Scope | Approach |
|---|---|---|
| Tests.{Plans,Enforcement,Usage}.Unit | Counter logic, token bucket, hierarchy resolution, authorization handlers | Pure unit tests, no I/O |
| Tests.{Plans,Enforcement,Usage}.Integration | Full enforcement path with cache hit/miss, plan change propagation, hierarchy cascade invalidation | Testcontainers (PostgreSQL, Redis, Redpanda) |
| Tests.Architecture | Service boundary rules, no cross-service code dependencies, Clean Architecture layer rules | NetArchTest |
| Benchmarks.Enforcement | Hot-path latency (cache hit / cache miss) and sustained throughput (1/10/50 concurrent) | BenchmarkDotNet (see docs/benchmarks.md) |
dotnet test MetricGate.slnx -c Release

MetricGate uses a two-stage pipeline split across GitHub Actions and Azure DevOps.
GitHub Actions runs on every push and pull request to main, as two parallel jobs that
each build the solution once (no separate build job); a concurrency group cancels superseded
in-flight runs:
- Unit & architecture tests — Plans, Enforcement, Usage unit suites + architecture rules
- Integration tests (Testcontainers) — Plans, Enforcement, Usage integration suites
Azure DevOps is triggered manually via workflow_dispatch after the GitHub Actions jobs pass — never on automatic pushes:
- Docker build (Buildx, multi-stage SDK → ASP.NET runtime) for Plans and Enforcement, with a registry-backed layer cache
- Push to GitHub Container Registry (ghcr.io/goldbarth/metricgate)
- Deploy to Azure Container Apps (rg-goldbarth-dev), behind a manual approval gate
    Push to main
      └─► GitHub Actions: unit & architecture tests ┐ (parallel, each builds once)
                          integration tests         ┘
    Manual dispatch (when ready to deploy)
      └─► GitHub Actions: trigger Azure DevOps
            └─► Azure DevOps: buildx (cache) → ghcr.io push → approval gate → container apps deploy
Images are tagged with both the Git SHA (Build.BuildId) for traceability and latest for convenience. Container Apps scale to zero replicas when idle (min-replicas: 0).
The scope and issue docs no longer carry calendar time estimates. AI-assisted development made them meaningless — a planned "two-week phase" routinely landed in a single focused session. This note records measured effort instead.
| | Estimate | Basis |
|---|---|---|
| Without AI — mid-level dev (~2 yr), unaided | ~350–550 h | ~60 issues at a mid-level solo pace, plus learning overhead on the distributed-systems patterns (outbox, idempotency, invalidation cascades, recursive-CTE hierarchy, trace propagation) a dev ~2 years in researches rather than knows cold |
| With AI assistance (measured) | ~38 h focused | commit-timestamp deltas across 10 working sessions |
| Speedup | ~9–14× | raw build time and knowledge access — see note |
How the measured number is derived: ~38 focused hours is the sum of first-to-last commit spans per working day (breaks included) — a proxy for active time, not tracked hours. Calendar span was ~4 weeks (2026-05-05 → 06-02), part-time: development was interleaved with job applications, company research, and coursework, so wall-clock weeks say nothing about effort.
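The derivation above (sum of first-to-last commit spans per working day) is simple enough to express directly. The sketch below applies it to a handful of invented timestamps; these are not the project's commits, and `focused_hours` is a name chosen for the example.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, List


def focused_hours(commit_times: Iterable[datetime]) -> float:
    """Sum of (last commit - first commit) per calendar day, in hours."""
    by_day: Dict[date, List[datetime]] = defaultdict(list)
    for ts in commit_times:
        by_day[ts.date()].append(ts)
    total_seconds = sum(
        (max(stamps) - min(stamps)).total_seconds() for stamps in by_day.values()
    )
    return total_seconds / 3600


# Invented sample, for illustration only.
sample = [
    datetime(2026, 5, 5, 9, 0), datetime(2026, 5, 5, 12, 30),   # 3.5 h span
    datetime(2026, 5, 6, 14, 0), datetime(2026, 5, 6, 16, 0),   # 2.0 h span
    datetime(2026, 5, 7, 10, 0),                                # single commit: 0 h
]
hours = focused_hours(sample)   # 5.5
```

The sample makes the proxy's limits visible: a day with a single commit contributes zero hours, and breaks between the first and last commit of a day are counted as work.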
The trade-off. The speedup is raw build time, not learning time. Generating low-level implementations quickly means less hands-on repetition of the wiring-level technical depth — the muscle you build by typing it yourself. The compensating effort shifts upward: more time goes into understanding and validating design patterns and high-level architecture — reading generated code critically, writing the ADRs, and owning the system-design decisions rather than the keystroke-level ones. Part of the gain is also knowledge access: the assistant surfaced idiomatic patterns at the point of need, so the honest reading is that some of the speedup is compressed learning time, not just compressed keystrokes. AI compresses the typing, not the thinking; the engineering value moved from low-level implementation toward architecture and review.
Full evaluation, per-session measurement, and trade-off rationale: Engineering Note — AI-Assisted Development.
- Scope V1 (English)
- Scope V1 (Deutsch)
- Scope V2 (English) — in development
- Scope V2 (Deutsch) — in Entwicklung
- V2 Issues & Milestones
- Architecture Decision Records
- Benchmark results
- Operational runbook
- Engineering Note — AI-Assisted Development