Problem
When OpenMetadata is configured against AWS OpenSearch Serverless (AOSS), several cluster-level API calls fail with 404 because AOSS is a managed serverless service — it has no concept of cluster nodes, shards, or JVM heap. Three methods in OpenSearchGenericManager call endpoints that AOSS does not implement:
| Method | Endpoint | Used for |
| --- | --- | --- |
| clusterStats() | /_cluster/stats | Reindexing auto-tune |
| nodesStats() | /_nodes/stats | JVM/CPU metrics for auto-tune |
| getSearchHealthStatus() | /_cluster/health | Service health status panel |
The following ERROR is logged repeatedly at runtime:
ERROR [o.o.s.s.o.OpenSearchGenericManager] - Failed to fetch cluster stats
os.org.opensearch.client.transport.TransportException: Request failed with status code '404'
The error is misleading: the cluster is healthy and authentication is working. The 404 is structural — AOSS will never implement these endpoints — not a transient failure.
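The distinction between a structural 404 and a transient failure can be made explicit in code. The following is only a sketch: `UnsupportedEndpointCheck` and `isStructuralNotFound` are hypothetical names that do not exist in OpenMetadata, and the host-suffix test assumes AOSS endpoints end in `.aoss.amazonaws.com`.

```java
import java.util.Set;

/** Hypothetical helper, not part of the OpenMetadata codebase. */
final class UnsupportedEndpointCheck {
    // The three endpoints this report identifies as unavailable on AOSS.
    private static final Set<String> AOSS_UNSUPPORTED =
        Set.of("/_cluster/stats", "/_nodes/stats", "/_cluster/health");

    private UnsupportedEndpointCheck() {}

    /**
     * True when a 404 is expected by design: the backend is AOSS and the
     * path is one of the known-unsupported endpoints. Anything else
     * (other status codes, other hosts, other paths) is treated as a
     * real failure by the caller.
     */
    static boolean isStructuralNotFound(String host, String path, int statusCode) {
        boolean aoss = host != null && host.endsWith(".aoss.amazonaws.com");
        return aoss && statusCode == 404 && AOSS_UNSUPPORTED.contains(path);
    }
}
```

A caller could use such a check to log structural 404s at DEBUG while keeping ERROR for everything else.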
Additional Impact: Search Service Reported as Unhealthy
getSearchHealthStatus() calls /_cluster/health and returns HEALTHY_STATUS or UNHEALTHY_STATUS via ServicesStatusJobHandler. This result is surfaced in the OpenMetadata UI under Settings → Health Check, and is also available via:
GET /api/v1/system/status
Because /_cluster/health returns 404 on AOSS, the UI reports the search backend as unhealthy even when AOSS is fully functional and serving requests. This false alarm leads operators to believe, incorrectly, that their search backend has a problem.
Detection
AOSS endpoints always follow the pattern <collection-id>.<region>.aoss.amazonaws.com. This makes detection unambiguous without any new configuration:
private boolean isAwsOpenSearchServerless(String host) {
    return host != null && host.endsWith(".aoss.amazonaws.com");
}
Alternatively, the existing SEARCH_AWS_SERVICE_NAME=aoss environment variable (already set in the Helm chart when using AOSS) can serve as a secondary signal.
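One way to combine the two signals is sketched below. `AossDetection` and its method names are hypothetical and not taken from the OpenMetadata codebase. The sketch assumes the host is a bare hostname (no scheme or port) and that the service name is read by the caller, for example from the SEARCH_AWS_SERVICE_NAME environment variable.

```java
import java.util.Locale;

/** Hypothetical helper, not part of the OpenMetadata codebase. */
final class AossDetection {
    private static final String AOSS_HOST_SUFFIX = ".aoss.amazonaws.com";
    private static final String AOSS_SERVICE_NAME = "aoss";

    private AossDetection() {}

    /** Primary signal: the host ends with the AOSS domain suffix. */
    static boolean hostLooksLikeAoss(String host) {
        if (host == null) {
            return false;
        }
        return host.trim().toLowerCase(Locale.ROOT).endsWith(AOSS_HOST_SUFFIX);
    }

    /** Secondary signal: the configured service name is "aoss". */
    static boolean serviceNameIsAoss(String serviceName) {
        return serviceName != null
            && AOSS_SERVICE_NAME.equalsIgnoreCase(serviceName.trim());
    }

    /** Either signal is enough to treat the backend as AOSS. */
    static boolean isAwsOpenSearchServerless(String host, String serviceName) {
        return hostLooksLikeAoss(host) || serviceNameIsAoss(serviceName);
    }
}
```

A call site might look like `AossDetection.isAwsOpenSearchServerless(host, System.getenv("SEARCH_AWS_SERVICE_NAME"))`. Treating either signal as sufficient is a design choice of this sketch; requiring both would be stricter.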
Proposed Fix
1. Gate cluster/node stats calls behind an AOSS check
Skip calls to unsupported endpoints when running against Serverless, falling back to configured defaults which the auto-tune path already supports:
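// AOSS has no /_cluster/stats endpoint, so only query regular clusters.
if (!isAwsOpenSearchServerless(searchConfiguration.getHost())) {
    fetchClusterStats();
} else {
    // Auto-tune falls back to its configured defaults.
    LOG.debug("Skipping cluster stats fetch: AWS OpenSearch Serverless does not support /_cluster/stats");
}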
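The gate-with-fallback pattern above can be written as a small, testable helper. This is a sketch under assumptions: `ClusterStatsGate` and `statsOrDefaults` are hypothetical names, the stats type is left generic, and the "configured defaults" are passed in by the caller rather than read from OpenMetadata's real configuration.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of the gate; the names are not from OpenMetadata. */
final class ClusterStatsGate {
    private ClusterStatsGate() {}

    static boolean isAwsOpenSearchServerless(String host) {
        return host != null && host.endsWith(".aoss.amazonaws.com");
    }

    /**
     * Returns live stats from the fetcher, or the supplied defaults when
     * the backend is AOSS. The fetcher is never invoked for AOSS hosts,
     * so no request is sent to an unsupported endpoint.
     */
    static <T> T statsOrDefaults(String host, Supplier<T> fetcher, T defaults) {
        if (isAwsOpenSearchServerless(host)) {
            return defaults;
        }
        return fetcher.get();
    }
}
```

Passing the fetcher as a `Supplier` keeps the AOSS decision in one place, so the same helper could wrap both the cluster stats and the node stats calls.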
2. Use GET / as the AOSS health check
/_cluster/health is not available on AOSS. client.info() (GET /) is supported, correctly reflects both connectivity and auth status, and is the appropriate substitute:
private SearchHealthStatus getAossHealthStatus() {
    try {
        client.info(); // GET /, which AOSS supports
        return new SearchHealthStatus(HEALTHY_STATUS);
    } catch (Exception e) {
        // Any failure (connectivity, auth, timeout) is reported as unhealthy.
        return new SearchHealthStatus(UNHEALTHY_STATUS);
    }
}
getSearchHealthStatus() would dispatch to this method when AOSS is detected.
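The dispatch could look roughly like the sketch below. Everything in it is a stand-in: `SearchHealthDispatch` and `Probe` are hypothetical, the status strings are placeholder values rather than OpenMetadata's actual constants, and the non-AOSS branch is reduced to "the call did not throw", whereas the real method presumably also inspects the cluster health response. It also takes at face value the statement above that GET / succeeds on AOSS.

```java
/** Hypothetical sketch; Probe and the status strings are stand-ins. */
final class SearchHealthDispatch {
    static final String HEALTHY_STATUS = "healthy";     // placeholder value
    static final String UNHEALTHY_STATUS = "unhealthy"; // placeholder value

    /** A health probe that throws when the backend cannot be reached. */
    @FunctionalInterface
    interface Probe {
        void run() throws Exception;
    }

    private SearchHealthDispatch() {}

    /**
     * Picks the probe that the backend supports: the root info call for
     * AOSS, the cluster health call otherwise. Only the selected probe runs.
     */
    static String getSearchHealthStatus(boolean aoss, Probe clusterHealth, Probe rootInfo) {
        Probe probe = aoss ? rootInfo : clusterHealth;
        try {
            probe.run();
            return HEALTHY_STATUS;
        } catch (Exception e) {
            return UNHEALTHY_STATUS;
        }
    }
}
```

With this shape, an AOSS deployment never touches /_cluster/health, so the 404 can no longer turn into an unhealthy status.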
Impact Summary
- Every OpenMetadata deployment using AOSS sees continuous spurious ERROR log entries, making it harder to identify real errors
- The Settings → Health Check panel incorrectly shows the search backend as unhealthy
- Monitoring/alerting systems that treat any ERROR log or the health status API as a signal produce false positives
- The reindexing auto-tune feature is silently non-functional on AOSS deployments
Environment
- OpenMetadata: 1.12.4 / 1.12.5
- Search backend: AWS OpenSearch Serverless (AOSS)
- Deployment: EKS with IRSA authentication (SEARCH_AWS_IAM_AUTH_ENABLED=true, SEARCH_AWS_SERVICE_NAME=aoss)