76 changes: 20 additions & 56 deletions src/pages/docs/argo-cd/instances/index.md
@@ -1,7 +1,7 @@
---
layout: src/layouts/Default.astro
pubDate: 2025-09-15
modDate: 2025-12-08
modDate: 2026-06-11
navSection: Argo CD Instances
navTitle: Overview
title: Overview
@@ -29,6 +29,24 @@ The [Kubernetes agent](/docs/kubernetes/targets/kubernetes-agent) and the [Kuber

:::

## Prerequisites

The gateway makes three outgoing connections. Before installing, make sure all of them are reachable from inside your cluster:

| Destination | Protocol | Port |
| --- | --- | --- |
| Octopus Server REST API | HTTPS | `443` |
| Octopus Server gRPC endpoint | gRPC (HTTP/2) | `8443` by default |
| Argo CD API server | gRPC (in-cluster) | Argo CD service port |

:::div{.warning}
If your Octopus Server sits behind a load balancer, proxy, or firewall, make sure the gRPC port (`8443` by default) is forwarded to Octopus Server and the proxy supports HTTP/2. Forwarding only HTTPS (`443`) is a common cause of installation failure, where the gateway registers successfully but never connects. See [Troubleshooting](/docs/argo-cd/troubleshooting#failed-to-connect-to-octopus) for details.
:::

:::div{.hint}
The gateway holds long-lived gRPC streams and sends a keep-alive every 30 seconds by default. If a load balancer between the cluster and Octopus Server closes idle connections, set its idle timeout to comfortably exceed the keep-alive interval (`gateway.octopus.keepAlive.intervalSeconds`).
:::
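
As a rough illustration of that comparison, the helper below checks whether an idle timeout leaves headroom over the keep-alive interval. The function names and the 2x margin are assumptions made for this sketch, not values taken from the chart or from Octopus guidance.

```bash
# Succeeds when the load balancer idle timeout leaves headroom over the
# gateway keep-alive interval. The 2x margin is an illustrative assumption.
idle_timeout_ok() {
  local keepalive_seconds="$1"
  local idle_timeout_seconds="$2"
  [ "$idle_timeout_seconds" -ge $((keepalive_seconds * 2)) ]
}

# Prints OK or TOO_SHORT for a keep-alive interval and idle timeout (seconds).
check_idle_timeout() {
  if idle_timeout_ok "$1" "$2"; then
    echo "OK"
  else
    echo "TOO_SHORT"
  fi
}
```

With the default 30 second keep-alive, `check_idle_timeout 30 60` passes under this margin, while `check_idle_timeout 30 45` does not.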

## Installing the Octopus Argo CD Gateway

The gateway is installed using [Helm](https://helm.sh) via the [octopusdeploy/octopus-argocd-gateway-chart](https://hub.docker.com/r/octopusdeploy/octopus-argocd-gateway-chart) chart.
@@ -206,61 +224,7 @@ The Octopus Argo CD gateway Helm chart follows [Semantic Versioning](https://sem

## Troubleshooting

### Argo CD TLS Errors

If your gateway is unable to connect to your Argo CD instance due to TLS errors it is likely due to the certificate that Argo CD is serving traffic with.

#### Self Signed Certificate

If you are getting an error that looks like this:

```text
tls: failed to verify certificate: x509: certificate signed by unknown authority
```

It is most likely due to Argo CD using a self-signed certificate, if it is intended that your certificate is self-signed you can disable certificate verification by doing the following:

Using Helm for existing installation:

```bash
helm upgrade --atomic \
--version "1.0.0" \
--namespace "{{GATEWAY_NAMESPACE}}" \
--reset-then-reuse-values \
--set gateway.argocd.insecure="true" \
--set gateway.argocd.plaintext="false" \
{{EXISTING_HELM_RELEASE_NAME}} \
oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```

:::div{.warning}
By setting `gateway.argocd.insecure="true"`, TLS certificate verification will no longer be performed between the gateway and the Argo CD instance. Make sure this configuration is necessary to avoid potential security issues.
:::

#### No Certificate

If you are running your Argo CD instance without a certificate due to terminating SSL at a load balancer level the gateway will likely fail to connect with the following error:

```text
transport: authentication handshake failed: EOF
```

This is because the gateway is configured by default to require encrypted traffic, if it is intended that you don't have a certificate you can disable encryption between the gateway and Argo CD by doing the following:

```bash
helm upgrade --atomic \
--version "1.0.0" \
--namespace "{{GATEWAY_NAMESPACE}}" \
--reset-then-reuse-values \
--set gateway.argocd.insecure="false" \
--set gateway.argocd.plaintext="true" \
{{EXISTING_HELM_RELEASE_NAME}} \
oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```

:::div{.warning}
By setting `gateway.argocd.plaintext="true"`, all traffic between the gateway and Argo CD will be unencrypted. Make sure this configuration is necessary to avoid potential security issues.
:::
If your gateway is unable to connect to your Argo CD instance or Octopus Server (e.g. due to TLS errors), see [Troubleshooting Argo CD in Octopus](/docs/argo-cd/troubleshooting) for common issues and resolutions.

## Deleting an Octopus Argo CD Gateway

150 changes: 144 additions & 6 deletions src/pages/docs/argo-cd/troubleshooting.md
@@ -1,7 +1,7 @@
---
layout: src/layouts/Default.astro
pubDate: 2025-09-15
modDate: 2025-09-15
modDate: 2026-06-11
title: Troubleshooting Argo CD in Octopus
navTitle: Troubleshooting
description: How to resolve configuration issues
@@ -38,14 +38,45 @@ Resolution:

### Argo CD Gateway install fails initial health check

#### Failed to connect to Octopus

Behavior:

- Install Argo CD Gateway dialog states:
- "Gateway registered with Octopus" was successful
- "Failed to connect to Octopus" and "Failed to connect to Argo CD" both show as failed
- The gateway pod is in a `CrashLoopBackOff` state
- In a Kubernetes viewer (e.g. K9s), the gateway pod logs state "*Gateway failed to connect to Octopus*"
- If installed with `helm install --atomic`, the install fails and rolls back, removing the gateway from the cluster. The registered gateway still appears under **Infrastructure ➜ Argo CD Instances** but will never become healthy

Cause:

- The gateway cannot establish a gRPC connection to Octopus Server. Both "Failed to connect" rows in the dialog are caused by this single problem, not two separate ones
- Registration uses the REST API URL (`registration.octopus.serverApiUrl`), while the running gateway connects to a separate gRPC endpoint (`gateway.octopus.serverGrpcUrl`, port `8443` by default), so a successful registration does not mean the gRPC endpoint is reachable

Resolution:

- Confirm port `8443` is open and routed through to Octopus Server. A load balancer, proxy, or firewall that only forwards HTTPS (`443`) is a common cause. Probe it from inside the cluster:

```bash
kubectl run port-check --image=busybox --restart=Never --rm -it -- \
sh -c 'nc -z -w 5 your-octopus-url 8443 && echo REACHABLE || echo UNREACHABLE'
```

- Confirm `gateway.octopus.serverGrpcUrl` points at your Octopus Server's gRPC endpoint, including the port (not the web URL)
- If the gateway logs a certificate thumbprint mismatch, confirm `gateway.octopus.serverThumbprint` matches your Octopus Server's certificate thumbprint
- Inspect the gateway pod logs for connection details: `kubectl logs deploy/octopus-argocd-gateway -n <namespace>`
- If the install was rolled back (e.g. `helm install --atomic` failed and cleaned up the cluster), delete the orphaned Argo CD Gateway in Octopus, resolve the connection issue, and re-run the installation
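
To probe the same host and port that the gateway is configured with, the value of `gateway.octopus.serverGrpcUrl` can be split into its parts. This is a minimal sketch: it assumes the value looks like `scheme://host:port` (the exact format the chart accepts is not confirmed here), uses a hypothetical helper name, and does not handle IPv6 literals.

```bash
# Split a URL-like value into "host port".
# Falls back to 8443 (the default gRPC port) when no port is present.
grpc_host_port() {
  local url="$1"
  local default_port="${2:-8443}"
  local authority="${url#*://}"   # drop the scheme, if any
  authority="${authority%%/*}"    # drop any trailing path
  local host="${authority%%:*}"
  local port="$default_port"
  case "$authority" in
    *:*) port="${authority##*:}" ;;
  esac
  echo "$host $port"
}
```

For example, `grpc_host_port "grpc://your-octopus-url:8443"` prints `your-octopus-url 8443`, which can be passed to the `nc` probe above. The `grpc://` scheme in this example is itself an assumption.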

#### Failed to connect to Argo CD

Behavior:

- Install Argo CD Gateway dialog states:
- "established a connection" was successful
- Health check failed
- The Gateway pod is in a CrashLoopBackoff
- "Gateway registered with Octopus" was successful
- "Failed to connect to Argo CD" shows as failed
- In a Kubernetes viewer (e.g. K9s), the gateway pod logs state "*error validating connection to Argo CD*"
- In Octopus, the healthcheck task log contains: "The Argo CD Gateway has not established a gRPC connection to Octopus Server"
- In Octopus, the "Gateway connectivity" tab of the newly added Argo CD instance shows an "Argo CD Connectivity Issues" warning

Cause:

@@ -54,10 +85,117 @@ Cause:
Resolution:

- Confirm the URL specified for `gateway.argocd.serverGrpcUrl` matches the expected gRPC endpoint of your Argo CD instance (`<servicename>.<namespace>.svc.cluster.local`)
- If your Argo CD instance is using a self-signed certificate ensure `gateway.argocd.insecure` is set to `true`
- If your Argo CD instance is using a self-signed certificate, ensure `gateway.argocd.insecure` is set to `true` (see [TLS errors](#argo-cd-gateway-cannot-connect-to-argo-cd-due-to-tls-errors) below)
- If your Argo CD instance is running in "insecure" mode, ensure `gateway.argocd.plaintext` is set to `true` (false otherwise)
- In Octopus, delete the registered Argo CD Gateway, follow all required helm deletion commands, and reinstall
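
The in-cluster host name in the first bullet follows the standard Kubernetes service DNS pattern. Below is a small sketch of composing it. The helper name is hypothetical, and `argocd-server` in the `argocd` namespace is only the common default for Argo CD installs, so substitute your own service and namespace. The port, and any scheme the chart expects in `gateway.argocd.serverGrpcUrl`, are not covered here.

```bash
# Compose the in-cluster DNS name for a Kubernetes service:
# <servicename>.<namespace>.svc.cluster.local
argocd_service_host() {
  local service="$1"
  local namespace="$2"
  echo "${service}.${namespace}.svc.cluster.local"
}
```

Comparing the result of `argocd_service_host argocd-server argocd` with the host in your configured value is a quick way to spot a typo in the service or namespace name.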

### Argo CD Gateway cannot connect to Argo CD due to TLS errors

If your gateway cannot connect to your Argo CD instance because of TLS errors, the cause is most likely the certificate that Argo CD serves traffic with.

#### Self Signed Certificate

Behavior:

- The gateway is unable to connect to your Argo CD instance
- The gateway pod logs contain:

```text
tls: failed to verify certificate: x509: certificate signed by unknown authority
```

Cause:

- Argo CD is using a self-signed certificate

Resolution:

- Configure the gateway to trust your certificate, as described in [Trusting Certificates](/docs/argo-cd/instances#trusting-certificates)
- Alternatively, if the self-signed certificate is intentional, you can disable certificate verification as follows:

Using Helm for existing installation:

```bash
helm upgrade --atomic \
--version "1.0.0" \
--namespace "{{GATEWAY_NAMESPACE}}" \
--reset-then-reuse-values \
--set gateway.argocd.insecure="true" \
--set gateway.argocd.plaintext="false" \
{{EXISTING_HELM_RELEASE_NAME}} \
oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```

:::div{.warning}
By setting `gateway.argocd.insecure="true"`, TLS certificate verification will no longer be performed between the gateway and the Argo CD instance. Make sure this configuration is necessary to avoid potential security issues.
:::

#### No Certificate

Behavior:

- The gateway fails to connect to your Argo CD instance
- The gateway pod logs contain:

```text
transport: authentication handshake failed: EOF
```

Cause:

- Your Argo CD instance is running without a certificate (e.g. SSL is terminated at a load balancer), while the gateway is configured by default to require encrypted traffic

Resolution:

- If running Argo CD without a certificate is intentional, you can disable encryption between the gateway and Argo CD as follows:

```bash
helm upgrade --atomic \
--version "1.0.0" \
--namespace "{{GATEWAY_NAMESPACE}}" \
--reset-then-reuse-values \
--set gateway.argocd.insecure="false" \
--set gateway.argocd.plaintext="true" \
{{EXISTING_HELM_RELEASE_NAME}} \
oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```

:::div{.warning}
By setting `gateway.argocd.plaintext="true"`, all traffic between the gateway and Argo CD will be unencrypted. Make sure this configuration is necessary to avoid potential security issues.
:::
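
The two cases above can be summarized as a lookup from the log message to the pair of settings used in the corresponding `helm upgrade` command. This is only a sketch with a hypothetical function name; it matches the two error strings quoted in this section and nothing else. For the self-signed case, trusting the certificate remains the preferred fix over disabling verification.

```bash
# Map a gateway log line to the Helm settings described in this section.
# Prints "unknown" for any message not covered above.
suggest_argocd_tls_settings() {
  case "$1" in
    *"certificate signed by unknown authority"*)
      echo "gateway.argocd.insecure=true gateway.argocd.plaintext=false" ;;
    *"authentication handshake failed: EOF"*)
      echo "gateway.argocd.insecure=false gateway.argocd.plaintext=true" ;;
    *)
      echo "unknown" ;;
  esac
}
```

Passing a line from the gateway pod logs to `suggest_argocd_tls_settings` prints the two `--set` values for the matching case.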

## Gateway Connectivity

### Gateway connection drops at regular intervals (load balancer idle timeout)

Behavior:

- The gateway installs and connects successfully, but loses its connection to Octopus Server after every quiet period of the same length (e.g. 60 seconds without activity)
- Deployments with Argo CD steps fail intermittently with gRPC connection errors, and succeed when retried
- The "Gateway connectivity" tab of the Argo CD instance intermittently shows "Unavailable", depending on when the last health check ran
- The gateway pod logs show stream errors followed by an immediate reconnection
- If the load balancer drops connections silently instead of closing them, the logs show failing keep-alives (`keep alive check failed - cancelling subscribers` with `DeadlineExceeded` errors) and the gateway pod restart count climbs at a regular cadence

Cause:

- A load balancer or proxy between the gateway and Octopus Server closes connections it considers idle
- The gateway sends a keep-alive to Octopus Server every 30 seconds by default to hold the connection open. If the load balancer's idle timeout is shorter than the keep-alive interval (or keep-alives are disabled), the connection is terminated before the next keep-alive is sent

Resolution:

- Increase the idle timeout on your load balancer so it comfortably exceeds the keep-alive interval (`gateway.octopus.keepAlive.intervalSeconds`, default 30 seconds)
- Alternatively, reduce the keep-alive interval below the load balancer's idle timeout:

```bash
helm upgrade --atomic \
--version "1.0.0" \
--namespace "{{GATEWAY_NAMESPACE}}" \
--reset-then-reuse-values \
--set gateway.octopus.keepAlive.intervalSeconds="15" \
{{EXISTING_HELM_RELEASE_NAME}} \
oci://registry-1.docker.io/octopusdeploy/octopus-argocd-gateway-chart
```
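
If you take the second route, the interval has to be derived from the load balancer's idle timeout. A minimal sketch follows, assuming that halving the timeout gives enough headroom. Both the halving rule and the helper name are illustrative choices, not a documented recommendation.

```bash
# Suggest a keep-alive interval (seconds) for a given idle timeout (seconds).
# Uses half the timeout, with a floor of 1 second.
suggest_keepalive_interval() {
  local idle_timeout_seconds="$1"
  local interval=$((idle_timeout_seconds / 2))
  if [ "$interval" -lt 1 ]; then
    interval=1
  fi
  echo "$interval"
}
```

The printed value is what would be passed to `--set gateway.octopus.keepAlive.intervalSeconds` in the command above.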

## Application/Project mapping

### No applications are listed on the **Argo CD Instance ➜ Applications** page