Replace grand-central Ingress with HTTPRoute and Traefik Middlewares #837
3ab31e1 to bf3789f
bf3789f to 6946339
Caught some errors on dev on /auth and /health endpoints: According to the W3C and MDN web specs, the Access-Control-Allow-Origin header can only contain a single origin, the wildcard *, or null. It cannot accept a comma-separated list of multiple origins. When you pass a list, browsers reject it as an invalid value, causing the CORS block.
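For illustration, a minimal sketch of the usual workaround: keep the allowed origins as a list on the server side and send back only the one that matches the request's `Origin` header. The function names and example origins are hypothetical and are not taken from this PR.

```python
from typing import Dict, Iterable, Optional


def resolve_allow_origin(
    request_origin: Optional[str], allowed: Iterable[str]
) -> Optional[str]:
    """Return the single value to send in Access-Control-Allow-Origin, or None."""
    if request_origin is None:
        return None
    allowed_set = set(allowed)
    if "*" in allowed_set:
        return "*"
    # Echo the caller's origin only if it is explicitly allowed.
    return request_origin if request_origin in allowed_set else None


def cors_headers(request_origin: Optional[str], allowed: Iterable[str]) -> Dict[str, str]:
    """Build CORS response headers containing at most one origin value."""
    origin = resolve_allow_origin(request_origin, allowed)
    if origin is None:
        return {}
    headers = {"Access-Control-Allow-Origin": origin}
    if origin != "*":
        # The response now depends on the request's Origin, so caches must know.
        headers["Vary"] = "Origin"
    return headers
```

The header value is always a single origin or `*`, never a comma-separated list, which is what browsers expect.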
juanpardo
left a comment
Added a few comments. It looks good
_HTTPROUTE_PLURAL = "httproutes"

_GC_GATEWAY_NAME: str = "traefik"
_GC_GATEWAY_NAMESPACE: str = "traefik"
Scale the grand-central Deployment to 0 (suspend) or 1 (start) and
manage its routing resources accordingly.

On suspend, the Deployment is scaled to 0 and the active routing resource
This would still not delete the DNS entry, right? I hope not because that way we can avoid the DNS propagation time when resuming. CC @goat-ssh
| Record | Resource | How DNS is created |
|---|---|---|
| *.aks1.eastus.azure.cratedb-dev.net → CNAME | IngressRouteTCP (CrateDB cluster) | CNAME to the regional Traefik LB hostname |
| *.gc.aks1.eastus.azure.cratedb-dev.net → A | HTTPRoute (grand-central) | external-dns creates an A record directly from the Gateway's resolved IP |
Note: both records resolve to the same Traefik load balancer IP: 51.8.42.241.
So yes, the tenant grand-central DNS record is deleted on suspend.
I think it's better to keep the current behaviour (delete routing resources on suspend) for consistency with how the nginx Ingress is handled on suspend already - both are deleted and recreated on resume. In practice our clusters are up and healthy in under 2 minutes on resume, so the DNS wait is not a significant concern.
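A rough sketch of the behaviour discussed in this thread, written as a pure planning function so it can be read without a cluster. The resource labels, the function name, and the order of steps inside a plan are assumptions for illustration; the real `suspend_or_start_grand_central` in `operations.py` may be structured differently.

```python
from typing import Dict, List

# Hypothetical mapping of exposure mode to the routing resource kinds it owns.
ROUTING_RESOURCES: Dict[str, List[str]] = {
    "loadbalancer": ["Ingress"],
    "traefik": ["HTTPRoute", "Middleware"],
}


def plan_grand_central_actions(suspend: bool, exposure: str) -> List[str]:
    """Return the ordered steps for suspending or resuming grand-central."""
    if exposure not in ROUTING_RESOURCES:
        raise ValueError(f"unknown exposure mode: {exposure!r}")
    kinds = ROUTING_RESOURCES[exposure]
    if suspend:
        # Suspend: stop the workload, then remove whatever routes to it.
        return ["scale Deployment to 0"] + [f"delete {kind}" for kind in kinds]
    # Resume: start the workload, then recreate routing for the active mode.
    return ["scale Deployment to 1"] + [f"create {kind}" for kind in kinds]
```

Both exposure modes go through the same delete-on-suspend, recreate-on-resume path, which is the consistency argument made above.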
2c867a7 to b4041f1
6f7be82 to d9c98a6
@tomach another angle about proper ordering of the Traefik Middlewares:

    filters:
      # 1. GUARD: Drop unauthorized traffic immediately to save resources
      - extensionRef:
          group: traefik.io
          kind: Middleware
          name: grand-central-ip-allowlist
        type: ExtensionRef
      # 2. SANITIZE/TRANSFORM: Apply security headers and buffering
      - responseHeaderModifier:
          set:
            - name: X-Frame-Options
              value: DENY
            # ... (rest of your headers)
        type: ResponseHeaderModifier
      - extensionRef:
          group: traefik.io
          kind: Middleware
          name: grand-central-cors
        type: ExtensionRef
      - extensionRef:
          group: traefik.io
          kind: Middleware
          name: grand-central-buffering
        type: ExtensionRef
      # 3. OPTIMIZE: Compression should be last to act on the finalized response payload
      - extensionRef:
          group: traefik.io
          kind: Middleware
          name: grand-central-compress-js
        type: ExtensionRef
@goat-ssh Good point, reordered the filters so the IP allowlist runs first as a guard and compression runs last - fc61577
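As a sketch of how a builder could encode that ordering, the snippet below assembles the `filters` list in guard, transform, optimize order. The helper names are hypothetical and the code in fc61577 may look different. The Middleware names are the ones from the suggestion above.

```python
from typing import Any, Dict, List


def middleware_filter(name: str) -> Dict[str, Any]:
    """Build an HTTPRoute ExtensionRef filter pointing at a Traefik Middleware."""
    return {
        "type": "ExtensionRef",
        "extensionRef": {"group": "traefik.io", "kind": "Middleware", "name": name},
    }


def build_filters(security_headers: Dict[str, str]) -> List[Dict[str, Any]]:
    """Return filters ordered as guard, then transform, then compression."""
    header_filter = {
        "type": "ResponseHeaderModifier",
        "responseHeaderModifier": {
            "set": [{"name": n, "value": v} for n, v in security_headers.items()]
        },
    }
    return [
        middleware_filter("grand-central-ip-allowlist"),  # 1. guard
        header_filter,                                     # 2. security headers
        middleware_filter("grand-central-cors"),
        middleware_filter("grand-central-buffering"),
        middleware_filter("grand-central-compress-js"),   # 3. compression last
    ]
```

Keeping the order in one list literal makes a later reordering an obvious one-line diff.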
fc61577 to fdff0a9
@tomach all looking good on dev! let's ship it
fdff0a9 to b7d642f
Summary of changes
Extends the existing `exposure` field support to grand-central. When `spec.cluster.exposure: traefik`, grand-central is now exposed through the Gateway API (`HTTPRoute`) and three Traefik Middlewares instead of an nginx `Ingress`. The default `loadbalancer` path is unchanged.

- `grand-central.py` adds builders for `HTTPRoute`, `compress-js`, `buffering`, and `ip-allowlist` Middlewares; adds `create_grand_central_exposure` (routing resources only, no deployment/service) and delete helpers for both paths
- `exposure.py` - `ChangeExposureSubHandler` now also switches grand-central resources when the exposure field changes
- `operations.py` - `suspend_or_start_grand_central` deletes routing resources on suspend and recreates them on resume, respecting the active exposure mode
- `handle_update_allowed_cidrs.py` patches the `ip-allowlist` Middleware instead of the Ingress annotation when `exposure=traefik`
- `gateway.networking.k8s.io/httproutes` and `traefik.io/middlewares`

Checklist
CHANGES.rst
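
For readers unfamiliar with the `ip-allowlist` Middleware mentioned in the summary, here is a minimal sketch of what such a resource body and its patch could look like. The function names are hypothetical, and the sketch assumes the Traefik `traefik.io/v1alpha1` `Middleware` CRD with its `ipAllowList.sourceRange` field; the builders in this PR may differ.

```python
import ipaddress
from typing import Any, Dict, Iterable, List


def normalize_cidrs(cidrs: Iterable[str]) -> List[str]:
    """Validate and normalize CIDRs; raises ValueError on malformed input."""
    return [str(ipaddress.ip_network(c, strict=False)) for c in cidrs]


def build_ip_allowlist_middleware(
    name: str, namespace: str, cidrs: Iterable[str]
) -> Dict[str, Any]:
    """Build a Traefik Middleware body that restricts access by source IP."""
    return {
        "apiVersion": "traefik.io/v1alpha1",
        "kind": "Middleware",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"ipAllowList": {"sourceRange": normalize_cidrs(cidrs)}},
    }


def allowlist_patch(cidrs: Iterable[str]) -> Dict[str, Any]:
    """Build a merge-patch body that replaces the allowed source ranges."""
    return {"spec": {"ipAllowList": {"sourceRange": normalize_cidrs(cidrs)}}}
```

Validating the CIDRs before patching keeps a malformed entry from reaching the cluster, where it would only surface as a Traefik configuration error.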