ESO:418 updated the enhancement proposal to include the network policy for proxy#1998
ESO:418 updated the enhancement proposal to include the network policy for proxy#1998siddhibhor-56 wants to merge 3 commits into
Conversation
bharath-b-rh
left a comment
There was a problem hiding this comment.
Also, please bring in the changes suggested in analysis doc to here.
| tracking-link: | ||
| - https://issues.redhat.com/browse/ESO-165 | ||
| - https://issues.redhat.com/browse/ESO-70 | ||
| - https://redhat.atlassian.net/browse/RFE-8516 |
There was a problem hiding this comment.
nit: Why not https://issues.redhat.com/browse/ESO-418?
| - As an administrator, I want to configure and manage egress rules for `external-secrets` operands via the operator API or CRDs, so I can control which external services they are allowed to access. | ||
| - As an administrator on a proxy-configured cluster, I want the operator to automatically allow ESO pods to reach the proxy server, so I do not have to manually create egress NetworkPolicies after every install or upgrade. | ||
| - As an administrator who manages proxy egress manually, I want to set `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll: Unmanaged` to disable the operator's automatic proxy egress policy, so I retain full control over proxy traffic rules. | ||
| - As a cluster administrator upgrading from an older release, I want the operator to clean up legacy unprefixed NetworkPolicy objects after it has applied the new prefixed ones, so there are no stale or duplicate policies left behind. |
There was a problem hiding this comment.
| - As a cluster administrator upgrading from an older release, I want the operator to clean up legacy unprefixed NetworkPolicy objects after it has applied the new prefixed ones, so there are no stale or duplicate policies left behind. | |
| - As a developer after upgrading from an older release, I want the operator to clean up legacy unprefixed NetworkPolicy objects after it has applied the new prefixed ones, so there are no stale or redundant policies left behind. |
| - Ensure that metrics collection for all components remains functional. | ||
| - Ensure the API server can communicate with the `webhook` for admission control. | ||
| - Automatically create a proxy-egress allow policy when an effective proxy is configured, controlled by `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll` (default `Managed`). | ||
| - Introduce a stable `eso-sys-` / `eso-user-` naming scheme for all operator-managed NetworkPolicy objects to enable unambiguous ownership and safe pruning. |
There was a problem hiding this comment.
| - Introduce a stable `eso-sys-` / `eso-user-` naming scheme for all operator-managed NetworkPolicy objects to enable unambiguous ownership and safe pruning. | |
| - Introduce a stable naming scheme (eso-sys- / eso-user-) for all operator-managed NetworkPolicy objects. This ensures unambiguous ownership and prevents naming overlaps with default policies, which currently lead to reconciliation conflicts |
| - Ensure the API server can communicate with the `webhook` for admission control. | ||
| - Automatically create a proxy-egress allow policy when an effective proxy is configured, controlled by `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll` (default `Managed`). | ||
| - Introduce a stable `eso-sys-` / `eso-user-` naming scheme for all operator-managed NetworkPolicy objects to enable unambiguous ownership and safe pruning. | ||
| - Migrate legacy unprefixed NetworkPolicy names to the new scheme on upgrade and clean up stale objects without leaving gaps in coverage. |
There was a problem hiding this comment.
| - Migrate legacy unprefixed NetworkPolicy names to the new scheme on upgrade and clean up stale objects without leaving gaps in coverage. | |
| - Handle the migration of legacy, unprefixed NetworkPolicies to the new naming scheme during upgrades. This includes cleaning up stale objects while ensuring no gaps in traffic coverage during the transition. |
|
|
||
| - This enhancement does not propose creating a generic, cluster-wide policy management solution. The policies are specific to `external-secrets`. | ||
| - We are not introducing AdminNetworkPolicy at this stage, as standard NetworkPolicy objects are sufficient for this scope and can be managed directly by the operator. | ||
| - Creating a user-facing option to disable these policies is not in scope, as they represent a baseline security posture. |
There was a problem hiding this comment.
Any reason why this was removed?
There was a problem hiding this comment.
because we can disable the proxy Np through the CR , hence removed it.
There was a problem hiding this comment.
We are not disabling the proxy NP, we are giving a choice whether user want to configure it or operator should. This is done because, the traffic is to an external server, and other default NPs we create are internal components, where we are able to define a better policy, if not stringent.
There was a problem hiding this comment.
Please include this on-goal again, if no other objection.
| - This enhancement does not propose creating a generic, cluster-wide policy management solution. The policies are specific to `external-secrets`. | ||
| - We are not introducing AdminNetworkPolicy at this stage, as standard NetworkPolicy objects are sufficient for this scope and can be managed directly by the operator. | ||
| - Creating a user-facing option to disable these policies is not in scope, as they represent a baseline security posture. | ||
| - `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll` does not affect non-proxy clusters — `getProxyConfiguration()` returns nil there and no proxy policy is ever created regardless of the field value. |
There was a problem hiding this comment.
Why is this a non-goal?
There was a problem hiding this comment.
This can be removed from non-goals
| 4. **Proxy Egress Policy (conditional):** After applying the static operand policies, the reconciler evaluates whether an automatic proxy egress allow policy should be created in the `external-secrets` namespace. The policy is created only when **both** conditions hold: | ||
|
|
||
| * `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll` is `Managed` (the default). | ||
| * `getProxyConfiguration()` returns a non-nil result - i.e., an effective proxy is actually configured on the cluster. |
There was a problem hiding this comment.
Please rephrase accordingly
| * `getProxyConfiguration()` returns a non-nil result - i.e., an effective proxy is actually configured on the cluster. | |
| * Proxy is configured in ExternalSecretsManager or ExternalSecretsConfig or in the OpenShift cluster. |
| egress: | ||
| - ports: | ||
| - protocol: TCP | ||
| port: <proxy-port> # set at reconcile time from getProxyConfiguration() |
There was a problem hiding this comment.
The EP should also talk about, how the port will be arrived at like when an URL is provided without port, and arriving at 443 as default for https and 80 as default for http.
There was a problem hiding this comment.
Added a section for this
Adds a new 'Proxy Egress NetworkPolicy' context to the E2E suite covering EP #1998 controller behavior: - No proxy configured → eso-sys-proxy-egress-core absent - Proxy + Managed → eso-sys-proxy-egress-core created with correct port - Managed → Unmanaged transition → NP deleted - User-defined NP names get eso-user- prefix in K8s object name - Migration complete annotation set after first reconcile Ref: openshift/enhancements#1998 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…etworkPolicyAllowProxyEgressAll - Move 5 ESC proxy egress tests from onUpdate to onCreate (they lacked required 'updated' field, causing 'Object Kind is missing' errors) - Add logLevel: 1 to expected in ESC onUpdate proxy egress test (API server defaults logLevel when appConfig is present) - Add networkPolicyAllowProxyEgressAll: Managed to 5 ESM test expected blocks where proxy is configured (field defaulted by kubebuilder marker) Fixes 11 CI unit test failures in PR openshift#144. Ref: openshift/enhancements#1998 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
… response The networkPolicyAllowProxyEgressAll invalid-value test triggers two simultaneous validation errors (enum rejection + CEL null check), causing the API server to wrap them in brackets: `is invalid: [err1, err2]`. The previous expectedError included the `is invalid: ` prefix which no longer matches the bracketed form `is invalid: [`. Drop the object-level prefix and match only the field-level error substring, which is present in both single and multi-error responses. Ref: openshift/enhancements#1998 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ale and proxy egress NPs The EP-1998 controller implementation calls r.Delete on NetworkPolicies in two places: cleanupMigratedNetworkPolicies (pruning stale entries) and createOrApplyProxyEgressNetworkPolicy (removing the policy when proxy is not configured). Without delete permission, these calls fail with 403, causing reconcile to abort before reaching createOrApplyDeployments. This blocked the env var test from seeing LOG_LEVEL applied to the deployment. Ref: openshift/enhancements#1998 Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
| - As a `external-secrets` user, I need assurance that applying security policies will not break core functionalities like secret management or webhook validation. | ||
| - As an administrator, I want to configure and manage egress rules for `external-secrets` operands via the operator API or CRDs, so I can control which external services they are allowed to access. | ||
| - As an administrator on a proxy-configured cluster, I want the operator to automatically allow ESO pods to reach the proxy server, so I do not have to manually create egress NetworkPolicies after every install or upgrade. | ||
| - As an administrator who manages proxy egress manually, I want to set `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll: Unmanaged` to disable the operator's automatic proxy egress policy, so I retain full control over proxy traffic rules. |
There was a problem hiding this comment.
Any reason why we want name field with the desired effect? The desired effect should be the value instead.
| - As an administrator who manages proxy egress manually, I want to set `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll: Unmanaged` to disable the operator's automatic proxy egress policy, so I retain full control over proxy traffic rules. | |
| - As an administrator who manages proxy egress manually, I want to set `spec.appConfig.proxy.networkPolicy: Unmanaged` to disable the operator's automatic proxy egress policy, so I retain full control over proxy traffic rules. |
There was a problem hiding this comment.
I thought we can have some different name to avoid the confusion,as we already have a field called NetworkPolicies []NetworkPolicy in the API section.
If you suggest to change , i am fine with that also.
There was a problem hiding this comment.
Since it's already listed under Proxy, I feel it does give out that meaning that NP is for proxy and how it should be handled. If you feel it could cause confusion, how about NetworkPolicyProvisioning: [Managed, Unmanaged]?
There was a problem hiding this comment.
okay as its under the proxy section have updated it with networkPolicy
There was a problem hiding this comment.
On second thought, NetworkPolicyProvisioning seems more intuitive.
|
|
||
| - This enhancement does not propose creating a generic, cluster-wide policy management solution. The policies are specific to `external-secrets`. | ||
| - We are not introducing AdminNetworkPolicy at this stage, as standard NetworkPolicy objects are sufficient for this scope and can be managed directly by the operator. | ||
| - Creating a user-facing option to disable these policies is not in scope, as they represent a baseline security posture. |
There was a problem hiding this comment.
We are not disabling the proxy NP, we are giving a choice whether user want to configure it or operator should. This is done because, the traffic is to an external server, and other default NPs we create are internal components, where we are able to define a better policy, if not stringent.
| 4. **Proxy Egress Policy (conditional):** After applying the static operand policies, the reconciler evaluates whether an automatic proxy egress allow policy should be created in the `external-secrets` namespace. The policy is created only when **both** conditions hold: | ||
|
|
||
| * `spec.appConfig.proxy.networkPolicyAllowProxyEgressAll` is `Managed` (the default). | ||
| * * Proxy is configured in `ExternalSecretsManager` or `ExternalSecretsConfig` or in the OpenShift cluster. |
There was a problem hiding this comment.
nit
| * * Proxy is configured in `ExternalSecretsManager` or `ExternalSecretsConfig` or in the OpenShift cluster. | |
| * Proxy is configured in `ExternalSecretsManager` or `ExternalSecretsConfig` or in the OpenShift cluster. |
| * Programmatically-built system policies : also use the `eso-sys-<stable-name>` prefix (e.g., `eso-sys-proxy-egress-core`). These are assembled in Go at reconcile time because they embed runtime data (e.g., the proxy port from `getProxyConfiguration()`). | ||
| * User-defined policies from `spec.networkPolicies[]`: the operator prepends `eso-user-` to the logical name in the CR. The user writes `name: allow-external-secrets-egress`; the resulting Kubernetes object is named `eso-user-allow-external-secrets-egress`. | ||
|
|
||
| 6. **Migration and prune:** `cleanupMigratedNetworkPolicies()` runs only after the apply step fully succeeds. On the first reconcile after upgrade (no `migration-complete` annotation on the CR), it lists all NetworkPolicies in the namespace by label (`app.kubernetes.io/managed-by` + `app.kubernetes.io/part-of`) and by name, diffs against the already-applied desired set, and deletes any whose name is not in the desired set — this covers legacy unprefixed names as well as any stale user NPs. Once all deletions succeed, a `migration-complete` annotation is written to the `ExternalSecretsConfig` CR using the `fieldOwner` option. On subsequent reconciles the full deletion loop is skipped, but the label-based list and diff still runs every reconcile to catch newly stale entries (e.g. a user removes an NP from the CR). This function will be removed after 3 releases, by which point every cluster will have been reconciled at least once under the new naming scheme. |
There was a problem hiding this comment.
Migration is broader and I think it could mislead users when the annotation is noticed.
There was a problem hiding this comment.
updated with externalsecretsconfig.operator.openshift.io/skip-np-cleanup-check: true
4d597fa to
8d4727f
Compare
| 6. **Migration and prune:** `cleanupMigratedNetworkPolicies()` runs only after the apply step fully succeeds. On each reconcile it operates as follows: | ||
| 1. Check whether the annotation `externalsecretsconfig.operator.openshift.io/skip-np-cleanup-check: "true"` is present on the `ExternalSecretsConfig` CR. If the annotation is present, skip the full deletion loop entirely. | ||
| 2. If the annotation is absent, list all NetworkPolicy resources in the namespace by label (`app.kubernetes.io/managed-by: external-secrets-operator` and `app.kubernetes.io/part-of: external-secrets-operator`) and by name, and delete any whose name is not in the desired set. The legacy unprefixed policies — `deny-all-traffic`, `allow-to-dns`, `allow-api-server-egress-for-main-controller`, etc. — carry the operator labels but are no longer in the desired set, so they get cleaned up. Stale user NPs are also removed by the same diff to avoid duplicate NPs. | ||
| 3. After all deletions succeed, write the `externalsecretsconfig.operator.openshift.io/skip-np-cleanup-check: "true"` annotation onto the `ExternalSecretsConfig` CR using the `fieldOwner` option so subsequent reconciles skip the deletion loop and avoid unnecessary GETs.On subsequent reconciles (annotation present), the full deletion loop is skipped, but the label-based list and diff still runs every reconcile to catch newly stale entries (e.g. a user removes an NP from the CR). |
There was a problem hiding this comment.
This is not true correct? Currently we don't allow removing policies, so do we really do perform this?
On subsequent reconciles (annotation present), the full deletion loop is skipped, but the label-based list and diff still runs every reconcile to catch newly stale entries (e.g. a user removes an NP from the CR).
There was a problem hiding this comment.
True, updated
| // NetworkPolicyEgressManagement controls whether the operator manages the proxy egress NetworkPolicy. | ||
| type NetworkPolicyEgressManagement string |
There was a problem hiding this comment.
| // NetworkPolicyEgressManagement controls whether the operator manages the proxy egress NetworkPolicy. | |
| type NetworkPolicyEgressManagement string | |
| // ManagementState controls whether the operator manages the resource lifecycle. | |
| type ManagementState string |
| const ( | ||
| // NetworkPolicyEgressManaged means the operator automatically creates and reconciles | ||
| // the eso-sys-proxy-egress-core NetworkPolicy when a proxy is configured. | ||
| NetworkPolicyEgressManaged NetworkPolicyEgressManagement = "Managed" | ||
|
|
||
| // NetworkPolicyEgressUnmanaged means the user is responsible for proxy egress; | ||
| // the operator does not create, update, or delete the proxy egress NetworkPolicy. | ||
| NetworkPolicyEgressUnmanaged NetworkPolicyEgressManagement = "Unmanaged" | ||
| ) |
There was a problem hiding this comment.
| const ( | |
| // NetworkPolicyEgressManaged means the operator automatically creates and reconciles | |
| // the eso-sys-proxy-egress-core NetworkPolicy when a proxy is configured. | |
| NetworkPolicyEgressManaged NetworkPolicyEgressManagement = "Managed" | |
| // NetworkPolicyEgressUnmanaged means the user is responsible for proxy egress; | |
| // the operator does not create, update, or delete the proxy egress NetworkPolicy. | |
| NetworkPolicyEgressUnmanaged NetworkPolicyEgressManagement = "Unmanaged" | |
| ) | |
| const ( | |
| // Managed indicates the Operator is responsible for the resource lifecycle. | |
| ManagementStateManaged ManagementState = "Managed" | |
| // Unmanaged indicates the User is responsible for the resource lifecycle. | |
| ManagementStateUnmanaged ManagementState = "Unmanaged" | |
| ) |
| NetworkPolicyEgressUnmanaged NetworkPolicyEgressManagement = "Unmanaged" | ||
| ) | ||
|
|
||
| // ProxyConfig is extended to include the proxy egress NetworkPolicy management field. |
There was a problem hiding this comment.
| // ProxyConfig is extended to include the proxy egress NetworkPolicy management field. |
| // networkPolicy controls whether the operator automatically | ||
| // creates and manages the "eso-sys-proxy-egress-core" NetworkPolicy that allows | ||
| // all ESO pods to reach the cluster proxy server. | ||
| // On clusters without a proxy configured, this field has no effect regardless of value. | ||
| // +kubebuilder:validation:Enum=Managed;Unmanaged | ||
| // +kubebuilder:default=Managed | ||
| // +optional | ||
| NetworkPolicy NetworkPolicyEgressManagement `json:"networkPolicy,omitempty"` |
There was a problem hiding this comment.
| // networkPolicy controls whether the operator automatically | |
| // creates and manages the "eso-sys-proxy-egress-core" NetworkPolicy that allows | |
| // all ESO pods to reach the cluster proxy server. | |
| // On clusters without a proxy configured, this field has no effect regardless of value. | |
| // +kubebuilder:validation:Enum=Managed;Unmanaged | |
| // +kubebuilder:default=Managed | |
| // +optional | |
| NetworkPolicy NetworkPolicyEgressManagement `json:"networkPolicy,omitempty"` | |
| // NetworkPolicy defines the management strategy for the proxy egress rule. | |
| // When set to Managed, the operator automatically provisions and maintains | |
| // a NetworkPolicy allowing traffic to the configured proxy. | |
| // If no proxy is configured, no NetworkPolicy will be created | |
| // regardless of this setting. | |
| // +kubebuilder:validation:Enum=Managed;Unmanaged | |
| // +kubebuilder:default=Managed | |
| // +optional | |
| NetworkPolicy ManagementState `json:"networkPolicy,omitempty"` |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bharath-b-rh The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@siddhibhor-56: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
external-secrets-network-policyenhancement to add proxy egressNetworkPolicysupport;spec.appConfig.proxy.networkPolicyAllowProxyEgressAllcontrols automatic creation ofeso-sys-proxy-egress-core, built at runtime using the port fromgetProxyConfiguration().spec.appConfig.proxy.networkPolicyAllowProxyEgressAllfield to control whether the operator auto-creates the proxy egress policy withmanagedandunmanagedoptions default tomanaged.cleanupMigratedNetworkPolicies()to prune legacy unprefixedNetworkPolicynames on upgrade, driven by label and name-based listing and a one-time migration-complete annotation on the CR.