Commit 8370b3a

Updates
1 parent 106a24f commit 8370b3a

2 files changed

Lines changed: 14 additions & 11 deletions

support/azure/service-fabric/cluster/troubleshoot-service-fabric-repair-jobs.md

Lines changed: 12 additions & 11 deletions
@@ -7,6 +7,7 @@ ms.reviewer: ashukumar, v-ryanberg
 ms.editor: v-gsitser
 ms.service: azure-service-fabric
 services: service-fabric
+ms.custom: sap:Cluster related issues
 ms.date: 01/20/2026
 # Customer intent: As a Service Fabric customer, I want to analyze the reason why a repair job is stuck using Service Fabric Explorer.
 ---
@@ -75,36 +76,36 @@ Finally, in the Completed state, the task is finished and no further state chang
 
 ### Infrastructure Jobs view
 
-To view jobs that Service Fabric receives for approval, go to the **Infrastructure Jobs** tab in the cluster view. Each entry includes a **Job ID** which stays the same across and outside of Service Fabric. The **Acknowledgement Status** shows whether Service Fabric approves the job with one of the following states:
+To view jobs that Service Fabric receives for approval, select **Infrastructure Jobs** in the cluster view. Each entry includes a **Job ID** which stays the same across and outside of Service Fabric. The **Acknowledgement Status** shows whether Service Fabric approves the job with one of the following states:
 
 - **WaitingForAcknowledgement** - The job is still waiting for approval.
 - **Acknowledged** - Service Fabric approves the job.
 
 Jobs only appear here when they're present in the received document. In addition to the **Job ID** and **Acknowledgement Status**, the **Impact Types** section displays the nature of the job’s impact. The **Current Repair Task** section shows which repair task is actively running for job approval on the Service Fabric side. By selecting **All Repair Tasks**, you can view the status of every repair task associated with the current job.
 
-:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/cluster-infrastructure-job-view.png" alt-text="Screenshot of the Infrastructure Jobs tab in Service Fabric Explorer showing job ID, acknowledgement status, and impact types." lightbox="media/troubleshoot-service-fabric-repair-jobs/cluster-infrastructure-job-view.png":::
+:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/cluster-infrastructure-job-view.png" alt-text="Screenshot of the Infrastructure Jobs view in Service Fabric Explorer showing job ID, acknowledgement status, and impact types." lightbox="media/troubleshoot-service-fabric-repair-jobs/cluster-infrastructure-job-view.png":::
 
-### Repair Jobs and Health Check view
+### Repair Jobs and Health Checks view
 
-To view individual and all repair tasks associated with a cluster, go to the **Repair Jobs** tab. This displays pending repair tasks, completed repair tasks, or cancelled repair tasks. You can also see the state for any pending task.
+To view individual and all repair tasks associated with a cluster, select **Repair Jobs**. This displays pending repair tasks, completed repair tasks, or cancelled repair tasks. You can also see the state for any pending task.
 
 If a repair task state is Created, Claimed, or Preparing, it's not yet approved by Service Fabric. Once a repair task transitions to the Approved state, it's considered approved and is then forwarded to the Repair Executor for the corresponding job.
 
-:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/repair-task-view.png" alt-text="Screenshot of the Repair Jobs tab in Service Fabric Explorer showing repair task states." lightbox="media/troubleshoot-service-fabric-repair-jobs/repair-task-view.png":::
+:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/repair-task-view.png" alt-text="Screenshot of the Repair Jobs view in Service Fabric Explorer showing repair task states." lightbox="media/troubleshoot-service-fabric-repair-jobs/repair-task-view.png":::
 
-If a repair task gets stuck in the Preparing state, it's either stuck in a health check or a safety check. An unhealthy entity in the cluster (including customer applications as well as system applications) can cause the health check to fail. To determine if the task is stuck in a health check, first verify whether **Preparing** or **Restoring Health Check** is enabled based on the state where the task is stuck. In the **Repair Task** view, expanding the task shows the health check status, indicating if it's enabled.
+If a repair task gets stuck in the Preparing state, it's either stuck in a health check or a safety check. An unhealthy entity in the cluster (including customer applications as well as system applications) can cause the health check to fail. To determine if the task is stuck in a health check, first verify whether **Preparing Health Check** or **Restoring Health Check** is enabled based on the state where the task is stuck. In the **Repair Task** view, expanding the task shows the health check status, indicating if it's enabled.
 
 :::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/cluster-health-check.png" alt-text="Screenshot of an expanded repair task showing health check status and preparing health check details." lightbox="media/troubleshoot-service-fabric-repair-jobs/cluster-health-check.png":::
 
 If enabled, **Repair Task History** shows that the health check started but didn't complete, confirming that the task is stuck in the Health Check phase.
 
-### Safety Check view
+### Safety Checks view
 
-A repair task can get stuck in the Safety Check phase only if it has an impact on any node. This can be verified by checking the **Impact** section in the **Repair Task** view. If a node impact is present, you can identify which Safety Check is causing the delay by inspecting each impacted node individually. Select the node from the **Node List**. In the **Safety Check** section, you’ll see the specific check where the task is stuck. The **Repair Task ID** is also displayed here, indicating which repair task is responsible for the node deactivation and safety check.
+A repair task can get stuck in the Safety Check phase only if it has an impact on any node. This can be verified by checking the **Impact** section in the **Repair Task** view. If a node impact is present, you can identify which Safety Check is causing the delay by inspecting each impacted node individually. Select the node from the **Node List**. In the **Safety Checks** section, you’ll see the specific check where the task is stuck. The **Repair Task ID** is also displayed here, indicating which repair task is responsible for the node deactivation and safety check.
 
 For example, in the following screenshot, the repair task is stuck in the **EnsureSeedNodeQuorum** safety check.
 
-:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/safety-check-view.png" alt-text="Screenshot of the Safety Check view in Service Fabric Explorer showing the specific check where the task is stuck." lightbox="media/troubleshoot-service-fabric-repair-jobs/safety-check-view.png":::
+:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/safety-check-view.png" alt-text="Screenshot of the Safety Checks view in Service Fabric Explorer showing the specific check where the task is stuck." lightbox="media/troubleshoot-service-fabric-repair-jobs/safety-check-view.png":::
 
 If there are no errors in **Infrastructure Service** related to a repair task and the task has entered the Executing state, it means the job’s acknowledgment status is Acknowledged for Impact Start. Similarly, if the repair task transitions to the Completed state, it indicates that the job’s acknowledgment status is Acknowledged for Impact End.
 

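The article text in the hunk above ties each repair task state to an approval outcome. As a rough illustration only, that mapping could be sketched in Python as below. The function name `describe_approval`, the `infrastructure_errors` flag, and the returned strings are hypothetical and are not part of Service Fabric or of this commit. Applying the "no errors" condition to the Completed state as well is an interpretation of the word "Similarly" in the text.

```python
# Hypothetical helper; names and return strings are illustrative assumptions.

# States in which, per the article text, the task is not yet approved.
NOT_YET_APPROVED = {"Created", "Claimed", "Preparing"}


def describe_approval(state: str, infrastructure_errors: bool = False) -> str:
    """Summarize what a repair task state implies for job approval."""
    if state in NOT_YET_APPROVED:
        return "not yet approved by Service Fabric"
    if state == "Approved":
        return "approved; forwarded to the Repair Executor"
    # The article only draws conclusions for Executing/Completed when the
    # Infrastructure Service reports no errors for the repair task.
    if infrastructure_errors:
        return "check Infrastructure Service errors first"
    if state == "Executing":
        return "job acknowledged for Impact Start"
    if state == "Completed":
        return "job acknowledged for Impact End"
    # Any other state is outside what the article text describes.
    return "state not covered by this sketch"


if __name__ == "__main__":
    for s in ("Created", "Approved", "Executing", "Completed"):
        print(f"{s}: {describe_approval(s)}")
```

The state names are passed as plain strings so the sketch can be fed values copied directly from the **Repair Jobs** view.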
@@ -124,6 +125,6 @@ To check the health of the Infrastructure Service or Repair Manager Service, sel
 
 ### Job throttling status for Infrastructure Service
 
-To check if any job is being throttled for a specific Infrastructure Service, select the service > **Health Evaluation** > **All**. Look for health events related to job throttling. If a job is throttled, the job ID along with the reason for throttling is displayed.
+To check if any job is being throttled for a specific Infrastructure Service, select the service > **Health Evaluations** > **All**. Look for health events related to job throttling. If a job is throttled, the job ID along with the reason for throttling is displayed.
 
-:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/cluster-job-throttling-status.png" alt-text="Screenshot of the Job throttling view in Service Fabric Explorer." lightbox="media/troubleshoot-service-fabric-repair-jobs/cluster-job-throttling-status.png":::
+:::image type="content" source="media/troubleshoot-service-fabric-repair-jobs/cluster-job-throttling-status.png" alt-text="Screenshot of the Health Evaluations view in Service Fabric Explorer." lightbox="media/troubleshoot-service-fabric-repair-jobs/cluster-job-throttling-status.png":::

support/azure/service-fabric/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -2,4 +2,6 @@
 href: welcome-service-fabric.yml
 - name: Azure Service Fabric troubleshooter
 href: cluster/fabric-logs.md
+- name: Troubleshoot unapproved repair jobs using Service Fabric Explorer
+href: cluster/troubleshoot-service-fabric-repair-jobs.md
 