Replace grand-central Ingress with HTTPRoute and Traefik Middlewares #837

Merged: tomach merged 1 commit into master from ta/gc-ingress-traefik on Jun 15, 2026

@tomach tomach commented May 13, 2026

Contributor

Summary of changes

Extends the existing exposure field support to grand-central. When spec.cluster.exposure is set to traefik, grand-central is now exposed through a Gateway API HTTPRoute and three Traefik Middlewares instead of an nginx Ingress. The default loadbalancer path is unchanged.

  • grand_central.py: adds builders for the HTTPRoute and for the compress-js, buffering, and ip-allowlist Middlewares; adds create_grand_central_exposure (routing resources only, no Deployment/Service) and delete helpers for both paths
  • exposure.py: ChangeExposureSubHandler now also switches grand-central resources when the exposure field changes
  • operations.py: suspend_or_start_grand_central deletes routing resources on suspend and recreates them on resume, respecting the active exposure mode
  • handle_update_allowed_cidrs.py: patches the ip-allowlist Middleware instead of the Ingress annotation when exposure=traefik
  • RBAC: adds permissions for gateway.networking.k8s.io/httproutes and traefik.io/middlewares
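
For illustration, the three Middlewares and the CIDR update listed above could be built roughly as follows. This is a minimal sketch, not the code from this PR: the function names, the content types, and the body size limit are assumptions. The resource names match the ones used in the filter list later in this conversation, and the spec fields (`ipAllowList.sourceRange`, `compress.includedContentTypes`, `buffering.maxRequestBodyBytes`) are those of the Traefik v3 `traefik.io/v1alpha1` Middleware CRD.

```python
from typing import Any, Dict, List

TRAEFIK_API_VERSION = "traefik.io/v1alpha1"


def _middleware(name: str, namespace: str, spec: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a Middleware spec in the common manifest envelope."""
    return {
        "apiVersion": TRAEFIK_API_VERSION,
        "kind": "Middleware",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def build_ip_allowlist(namespace: str, cidrs: List[str]) -> Dict[str, Any]:
    """Only requests from the given CIDR ranges pass this Middleware."""
    return _middleware(
        "grand-central-ip-allowlist",
        namespace,
        {"ipAllowList": {"sourceRange": list(cidrs)}},
    )


def build_compress_js(namespace: str) -> Dict[str, Any]:
    """Compress JavaScript responses (content types are an assumption)."""
    return _middleware(
        "grand-central-compress-js",
        namespace,
        {
            "compress": {
                "includedContentTypes": [
                    "application/javascript",
                    "text/javascript",
                ]
            }
        },
    )


def build_buffering(namespace: str, max_request_body_bytes: int = 1024 * 1024) -> Dict[str, Any]:
    """Buffer request bodies up to a limit (the 1 MiB default is an assumption)."""
    return _middleware(
        "grand-central-buffering",
        namespace,
        {"buffering": {"maxRequestBodyBytes": max_request_body_bytes}},
    )


def allowlist_patch(cidrs: List[str]) -> Dict[str, Any]:
    """Merge-patch body that replaces only the allowed source ranges."""
    return {"spec": {"ipAllowList": {"sourceRange": list(cidrs)}}}
```

With a builder like this, updating the allowed CIDRs only needs the small patch body from `allowlist_patch` rather than a full replacement of the Middleware.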

Checklist

  • Link to issue this PR refers to: https://github.com/crate/cloud/issues/2905
  • Relevant changes are reflected in CHANGES.rst
  • Added or changed code is covered by tests
  • Documentation has been updated if necessary
  • Changed code does not contain any breaking changes (or this is a major version change)

tomach force-pushed the ta/gc-ingress-traefik branch from 3ab31e1 to bf3789f on May 19, 2026 13:00
tomach marked this pull request as ready for review on May 19, 2026 15:34
tomach requested review from juanpardo and plaharanne on May 19, 2026 15:34
tomach force-pushed the ta/gc-ingress-traefik branch from bf3789f to 6946339 on May 20, 2026 06:50
@goat-ssh

Contributor

Caught some errors on dev on the /auth and /health endpoints:

The 'Access-Control-Allow-Origin' header contains multiple values 'https://console.cratedb-dev.cloud,http://localhost:8000', but only one is allowed.

According to the CORS specification (now part of the WHATWG Fetch standard) and MDN, the Access-Control-Allow-Origin header may contain only a single origin, the wildcard *, or null. It cannot carry a comma-separated list of origins. When a list is sent, browsers reject it as an invalid value, which causes the CORS block.
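
The usual way to support several origins under this constraint is to echo back only the origin of the current request when it is on an allow-list. The sketch below shows that selection logic in plain Python. The function name `cors_headers` is hypothetical, and this is not how grand-central or the Middleware in this PR implements it. Setting `Vary: Origin` tells caches that the response depends on the request's `Origin` header.

```python
from typing import Dict, Iterable, Optional


def cors_headers(
    request_origin: Optional[str], allowed_origins: Iterable[str]
) -> Dict[str, str]:
    """Return CORS response headers carrying at most one origin.

    An empty dict means the header is omitted, so the browser blocks
    the cross-origin read.
    """
    allowed = set(allowed_origins)
    if request_origin is None or request_origin not in allowed:
        return {}
    return {
        # Exactly one origin, never a comma-separated list.
        "Access-Control-Allow-Origin": request_origin,
        "Vary": "Origin",
    }


ALLOWED = ["https://console.cratedb-dev.cloud", "http://localhost:8000"]
print(cors_headers("http://localhost:8000", ALLOWED))
```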

@juanpardo juanpardo left a comment

Contributor

Added a few comments. It looks good.

Comment thread crate/operator/exposure.py
_HTTPROUTE_PLURAL = "httproutes"

_GC_GATEWAY_NAME: str = "traefik"
_GC_GATEWAY_NAMESPACE: str = "traefik"

Contributor

I think @WalBeh questioned the use of this namespace in the meeting. It would also be nice if we could use the project namespace instead, though I'm not sure that's possible. CC @goat-ssh
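
To make the namespace question concrete, here is a sketch of how the HTTPRoute's `parentRefs` entry depends on where the Gateway lives. The builder names and parameters are hypothetical and not taken from the PR. In the Gateway API, `namespace` in a parent reference defaults to the route's own namespace when omitted, and attaching to a Gateway in another namespace only works if that Gateway's listeners permit it through `allowedRoutes`.

```python
from typing import Any, Dict


def build_parent_ref(
    gateway_name: str, gateway_namespace: str, route_namespace: str
) -> Dict[str, Any]:
    """Reference the Gateway that the HTTPRoute attaches to."""
    ref: Dict[str, Any] = {
        "group": "gateway.networking.k8s.io",
        "kind": "Gateway",
        "name": gateway_name,
    }
    # The namespace is only needed for a cross-namespace reference.
    if gateway_namespace != route_namespace:
        ref["namespace"] = gateway_namespace
    return ref


def build_http_route(
    name: str,
    namespace: str,
    hostname: str,
    service_name: str,
    service_port: int,
    gateway_name: str = "traefik",
    gateway_namespace: str = "traefik",
) -> Dict[str, Any]:
    """Minimal HTTPRoute manifest that forwards one hostname to one Service."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "parentRefs": [
                build_parent_ref(gateway_name, gateway_namespace, namespace)
            ],
            "hostnames": [hostname],
            "rules": [
                {"backendRefs": [{"name": service_name, "port": service_port}]}
            ],
        },
    }
```

With the Gateway in the shared `traefik` namespace, every route carries an explicit cross-namespace reference. If a Gateway existed in the project namespace, the `namespace` key would simply drop out.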

Comment thread crate/operator/grand_central.py Outdated
Scale the grand-central Deployment to 0 (suspend) or 1 (start) and
manage its routing resources accordingly.

On suspend, the Deployment is scaled to 0 and the active routing resource

Contributor

This would still not delete the DNS entry, right? I hope not, because that way we can avoid the DNS propagation time when resuming. CC @goat-ssh

Contributor

| Record | Resource | How DNS is created |
|---|---|---|
| `*.aks1.eastus.azure.cratedb-dev.net` (CNAME) | IngressRouteTCP (CrateDB cluster) | CNAME to the regional Traefik LB hostname |
| `*.gc.aks1.eastus.azure.cratedb-dev.net` (A) | HTTPRoute (grand-central) | external-dns creates an A record directly from the Gateway's resolved IP |

Note: Both records resolve to the same Traefik load balancer IP: 51.8.42.241

So yes, the tenant's grand-central DNS record is deleted on suspend.

Contributor Author

I think it's better to keep the current behaviour (delete routing resources on suspend) for consistency with how the nginx Ingress is handled on suspend already - both are deleted and recreated on resume. In practice our clusters are up and healthy in under 2 minutes on resume, so the DNS wait is not a significant concern.
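
The behaviour described here, where both exposure modes delete their routing resources on suspend and recreate them on resume, can be modelled in a few lines. This is a simplified in-memory sketch with hypothetical names (`ROUTING_RESOURCES`, `suspend_or_start`). It does not call the Kubernetes API, and treating the Middlewares as part of the deleted set is an assumption.

```python
from typing import Dict, List, Set, Tuple

Resource = Tuple[str, str]  # (kind, name)

# Routing resources per exposure mode (resource names are assumptions).
ROUTING_RESOURCES: Dict[str, List[Resource]] = {
    "loadbalancer": [("Ingress", "grand-central")],
    "traefik": [
        ("HTTPRoute", "grand-central"),
        ("Middleware", "grand-central-ip-allowlist"),
        ("Middleware", "grand-central-buffering"),
        ("Middleware", "grand-central-compress-js"),
    ],
}


def suspend_or_start(
    existing: Set[Resource], exposure: str, suspend: bool
) -> Tuple[int, Set[Resource]]:
    """Return the desired replica count and the resulting resource set."""
    routing = set(ROUTING_RESOURCES[exposure])
    if suspend:
        # Scale to 0 and drop the routing resources of the active mode.
        return 0, existing - routing
    # Scale to 1 and (re)create the routing resources of the active mode.
    return 1, existing | routing
```

Because the HTTPRoute is among the deleted resources, the A record that external-dns derives from it goes away on suspend and is created again on resume.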

tomach force-pushed the ta/gc-ingress-traefik branch 2 times, most recently from 2c867a7 to b4041f1 on May 22, 2026 14:01
goat-ssh requested a review from juanpardo on June 1, 2026 07:59
tomach force-pushed the ta/gc-ingress-traefik branch from 6f7be82 to d9c98a6 on June 3, 2026 08:08
@goat-ssh

goat-ssh commented Jun 7, 2026

Contributor

@tomach another angle about proper ordering of the Traefik Middlewares:

filters:
  # 1. GUARD: Drop unauthorized traffic immediately to save resources
  - extensionRef:
      group: traefik.io
      kind: Middleware
      name: grand-central-ip-allowlist
    type: ExtensionRef

  # 2. SANITIZE/TRANSFORM: Apply security headers and buffering
  - responseHeaderModifier:
      set:
      - name: X-Frame-Options
        value: DENY
      # ... (rest of your headers)
    type: ResponseHeaderModifier

  - extensionRef:
      group: traefik.io
      kind: Middleware
      name: grand-central-cors
    type: ExtensionRef

  - extensionRef:
      group: traefik.io
      kind: Middleware
      name: grand-central-buffering
    type: ExtensionRef

  # 3. OPTIMIZE: Compression should be last to act on the finalized response payload
  - extensionRef:
      group: traefik.io
      kind: Middleware
      name: grand-central-compress-js
    type: ExtensionRef

@tomach

tomach commented Jun 11, 2026

Contributor Author

@goat-ssh Good point, reordered the filters so the IP allowlist runs first as a guard and compression runs last - fc61577
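
A builder that emits the HTTPRoute filters in the agreed order might look like the sketch below. The function names are hypothetical and this is not the code from fc61577. The Middleware names are the ones used in the proposed filter list, and the header set passed in is up to the caller.

```python
from typing import Any, Dict, List


def extension_ref(name: str) -> Dict[str, Any]:
    """HTTPRoute filter that references a Traefik Middleware by name."""
    return {
        "type": "ExtensionRef",
        "extensionRef": {
            "group": "traefik.io",
            "kind": "Middleware",
            "name": name,
        },
    }


def build_filters(security_headers: Dict[str, str]) -> List[Dict[str, Any]]:
    """Return the filters in order: guard, transform, then compression."""
    return [
        # 1. Guard: the IP allowlist comes first.
        extension_ref("grand-central-ip-allowlist"),
        # 2. Transform: security headers, CORS, and buffering.
        {
            "type": "ResponseHeaderModifier",
            "responseHeaderModifier": {
                "set": [
                    {"name": key, "value": value}
                    for key, value in security_headers.items()
                ]
            },
        },
        extension_ref("grand-central-cors"),
        extension_ref("grand-central-buffering"),
        # 3. Optimize: compression comes last.
        extension_ref("grand-central-compress-js"),
    ]
```

Keeping the order in one list makes it easy to assert in a unit test that the allowlist stays first and compression stays last.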

tomach force-pushed the ta/gc-ingress-traefik branch from fc61577 to fdff0a9 on June 15, 2026 08:29
@goat-ssh

Contributor

@tomach all looking good on dev! let's ship it

tomach force-pushed the ta/gc-ingress-traefik branch from fdff0a9 to b7d642f on June 15, 2026 13:01
tomach merged commit a5916ed into master on Jun 15, 2026
16 checks passed
tomach deleted the ta/gc-ingress-traefik branch on June 15, 2026 13:19
3 participants