`OpenSearchGenericManager`: Unsupported `/_cluster/stats` call causes spurious ERROR logs on AWS OpenSearch Serverless (AOSS) #27599

---

## Problem

When OpenMetadata is configured against AWS OpenSearch Serverless (AOSS), several cluster-level API calls fail with HTTP 404. AOSS is a managed serverless service that does not expose cluster nodes, shards, or JVM heap, so it does not implement the corresponding endpoints. Three methods in `OpenSearchGenericManager` call such endpoints:

| Method | Endpoint | Used for |
|--------|----------|---------|
| `clusterStats()` | `/_cluster/stats` | Reindexing auto-tune |
| `nodesStats()` | `/_nodes/stats` | JVM/CPU metrics for auto-tune |
| `getSearchHealthStatus()` | `/_cluster/health` | **Service health status panel** |

The following ERROR is logged repeatedly at runtime:

```
ERROR [o.o.s.s.o.OpenSearchGenericManager] - Failed to fetch cluster stats
os.org.opensearch.client.transport.TransportException: Request failed with status code '404'
```

The error is misleading: the cluster is healthy and authentication is working. The 404 is structural — AOSS will never implement these endpoints — not a transient failure.

## Additional Impact: Search Service Reported as Unhealthy

`getSearchHealthStatus()` calls `/_cluster/health` and returns `HEALTHY_STATUS` or `UNHEALTHY_STATUS` via `ServicesStatusJobHandler`. This result is surfaced in the OpenMetadata UI under **Settings → Health Check**, and is also available via:

```
GET /api/v1/system/status
```

Because `/_cluster/health` returns 404 on AOSS, the UI reports the search backend as **unhealthy** even when AOSS is fully functional and serving requests. This false alarm can lead operators to believe their search backend has a problem when it does not.

## Detection

AOSS endpoints always follow the pattern `<collection-id>.<region>.aoss.amazonaws.com`. This makes detection unambiguous without any new configuration:

```java
private boolean isAwsOpenSearchServerless(String host) {
    return host != null && host.endsWith(".aoss.amazonaws.com");
}
```

Alternatively, the existing `SEARCH_AWS_SERVICE_NAME=aoss` environment variable (already set in the Helm chart when using AOSS) can serve as a secondary signal.
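A minimal sketch of how the two signals could be combined is shown below. The class name `AossDetector` and its methods are hypothetical and not part of the OpenMetadata codebase. The host normalization (lowercasing, stripping scheme and port) is an assumption about how the configured host value might be written.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of OpenMetadata: combines the host-suffix
// check with the SEARCH_AWS_SERVICE_NAME signal described above.
public final class AossDetector {
    private static final String AOSS_HOST_SUFFIX = ".aoss.amazonaws.com";
    private static final String AOSS_SERVICE_NAME = "aoss";

    private AossDetector() {}

    // Lowercases the value and strips scheme, port, and path if present.
    // Falls back to the trimmed, lowercased input when it cannot be parsed.
    static String normalizeHost(String host) {
        if (host == null || host.isBlank()) {
            return "";
        }
        String value = host.trim().toLowerCase(Locale.ROOT);
        String candidate = value.contains("://") ? value : "https://" + value;
        try {
            String parsed = URI.create(candidate).getHost();
            return parsed != null ? parsed : value;
        } catch (IllegalArgumentException e) {
            return value;
        }
    }

    // True when either the host suffix or the configured service name
    // indicates AWS OpenSearch Serverless.
    static boolean isAoss(String host, String awsServiceName) {
        boolean hostMatches = normalizeHost(host).endsWith(AOSS_HOST_SUFFIX);
        boolean serviceMatches = awsServiceName != null
                && AOSS_SERVICE_NAME.equalsIgnoreCase(awsServiceName.trim());
        return hostMatches || serviceMatches;
    }
}
```

Treating either signal as sufficient keeps detection working if the collection is reached through a hostname that does not end in `.aoss.amazonaws.com`, provided the service name is set.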

## Proposed Fix

### 1. Gate cluster/node stats calls behind an AOSS check

Skip calls to unsupported endpoints when running against AOSS and fall back to the configured defaults, which the auto-tune path already supports:

```java
if (!isAwsOpenSearchServerless(searchConfiguration.getHost())) {
    fetchClusterStats();
} else {
    LOG.debug("Skipping cluster stats fetch — AWS OpenSearch Serverless does not support /_cluster/stats");
}
```
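The fallback to defaults can be made explicit in the gate itself. The sketch below is a self-contained illustration under stated assumptions: `ClusterStatsGate` and `resolve` are hypothetical names, and the `Supplier` stands in for whatever call fetches cluster or node stats in `OpenSearchGenericManager`.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of OpenMetadata: returns configured defaults
// on Serverless without ever invoking the unsupported endpoint.
public final class ClusterStatsGate {
    private ClusterStatsGate() {}

    // serverless: result of the AOSS detection step
    // fetcher:    stands in for the real stats call (e.g. /_cluster/stats)
    // defaults:   configured values the auto-tune path falls back to
    static <T> T resolve(boolean serverless, Supplier<T> fetcher, T defaults) {
        if (serverless) {
            // The endpoint does not exist on AOSS, so no request is sent
            // and no ERROR is logged.
            return defaults;
        }
        return fetcher.get();
    }
}
```

Because the fetcher is never invoked on Serverless, no 404 is produced in the first place, which is what removes the ERROR log rather than merely hiding it.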

### 2. Use `GET /` as the AOSS health check

`/_cluster/health` is not available on AOSS. `client.info()` (`GET /`) is supported, correctly reflects both connectivity and auth status, and is the appropriate substitute:

```java
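private SearchHealthStatus getAossHealthStatus() {
    try {
        client.info(); // GET / is supported by AOSS and exercises connectivity and auth
        return new SearchHealthStatus(HEALTHY_STATUS);
    } catch (Exception e) {
        // Keep the cause visible instead of swallowing it
        LOG.debug("AOSS health check (GET /) failed", e);
        return new SearchHealthStatus(UNHEALTHY_STATUS);
    }
}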
```

`getSearchHealthStatus()` would dispatch to this method when AOSS is detected.
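The dispatch could look roughly like the following self-contained sketch. `SearchHealthDispatch` is a hypothetical name, the two `Callable` probes stand in for the `/_cluster/health` and `GET /` client calls, and the status strings are placeholders rather than the values of the real `HEALTHY_STATUS` and `UNHEALTHY_STATUS` constants. As a simplification, any probe that completes without throwing is treated as healthy; the existing non-AOSS path may inspect the response further.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch, not part of OpenMetadata: picks the health probe
// based on whether the backend was detected as AOSS.
public final class SearchHealthDispatch {
    // Placeholder values; the real constants live in OpenMetadata.
    static final String HEALTHY_STATUS = "healthy";
    static final String UNHEALTHY_STATUS = "unhealthy";

    private SearchHealthDispatch() {}

    // clusterHealthProbe stands in for the /_cluster/health call,
    // infoProbe for client.info() (GET /).
    static String check(boolean serverless,
                        Callable<?> clusterHealthProbe,
                        Callable<?> infoProbe) {
        Callable<?> probe = serverless ? infoProbe : clusterHealthProbe;
        try {
            probe.call();
            return HEALTHY_STATUS;
        } catch (Exception e) {
            return UNHEALTHY_STATUS;
        }
    }
}
```

With this shape, a 404 from `/_cluster/health` can no longer mark an AOSS backend as unhealthy, because that probe is never selected when Serverless is detected.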

## Impact Summary

- Every OpenMetadata deployment using AOSS sees continuous spurious `ERROR` log entries, making it harder to identify real errors
- The **Settings → Health Check** panel incorrectly shows the search backend as unhealthy
- Monitoring/alerting systems that treat any `ERROR` log or the health status API as a signal produce false positives
- The reindexing auto-tune feature is silently non-functional on AOSS deployments

## Environment

- OpenMetadata: 1.12.4 / 1.12.5
- Search backend: AWS OpenSearch Serverless (AOSS)
- Deployment: EKS with IRSA authentication (`SEARCH_AWS_IAM_AUTH_ENABLED=true`, `SEARCH_AWS_SERVICE_NAME=aoss`)
