articles/batch/jobs-and-tasks.md
Lines changed: 21 additions & 15 deletions
Original file line number
Diff line number
Diff line change
@@ -1,13 +1,13 @@
1
1
---
2
2
title: Jobs and tasks in Azure Batch
3
-
description: Learn about jobs and tasks and how they are used in an Azure Batch workflow from a development standpoint.
3
+
description: Learn about jobs and tasks and how they're used in an Azure Batch workflow from a development standpoint.
4
4
ms.topic: concept-article
5
5
ms.date: 03/21/2025
6
6
# Customer intent: "As a developer working with cloud-based batch processing, I want to understand how jobs and tasks are structured in a batch workflow, so that I can efficiently manage computational workloads and optimize task execution."
7
7
---
8
8
# Jobs and tasks in Azure Batch
9
9
10
-
In Azure Batch, a *task* represents a unit of computation. A *job* is a collection of these tasks. More about jobs and tasks, and how they are used in an Azure Batch workflow, is described below.
10
+
In Azure Batch, a *task* represents a unit of computation. A *job* is a collection of these tasks. More about jobs and tasks, and how they're used in an Azure Batch workflow, is described below.
11
11
12
12
## Jobs
13
13
@@ -21,7 +21,7 @@ You can assign an optional job priority to jobs that you create. The Batch servi
21
21
22
22
To update the priority of a job, call the [Update the properties of a job](/rest/api/batchservice/job/update) operation (Batch REST), or modify the [CloudJob.Priority](/dotnet/api/microsoft.azure.batch.cloudjob.priority) (Batch .NET). Priority values range from -1000 (lowest priority) to +1000 (highest priority).
23
23
24
-
Within the same pool, higher-priority jobs have scheduling precedence over lower-priority jobs. Tasks in lower-priority jobs that are already running won't be preempted by tasks in a higher-priority job. Jobs with the same priority level have an equal chance of being scheduled, and ordering of task execution is not defined.
24
+
Within the same pool, higher-priority jobs have scheduling precedence over lower-priority jobs. Tasks in lower-priority jobs that are already running won't be preempted by tasks in a higher-priority job. Jobs with the same priority level have an equal chance of being scheduled, and ordering of task execution isn't defined.
25
25
26
26
A job with a high-priority value running in one pool won't impact scheduling of jobs running in a separate pool or in a different Batch account. Job priority doesn't apply to [autopools](nodes-and-pools.md#autopools), which are created when the job is submitted.
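The priority rules above can be modeled with a small Python sketch. It is only a toy model of the documented behavior, not Batch SDK code: `JobSketch`, `validate_priority`, and `scheduling_order` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List

# Range stated in the article: -1000 (lowest) to +1000 (highest).
MIN_PRIORITY = -1000
MAX_PRIORITY = 1000


@dataclass(frozen=True)
class JobSketch:
    """Hypothetical stand-in for a job; not a Batch SDK type."""
    job_id: str
    priority: int


def validate_priority(priority: int) -> int:
    """Reject values outside the documented priority range."""
    if not MIN_PRIORITY <= priority <= MAX_PRIORITY:
        raise ValueError(
            f"priority {priority} is outside {MIN_PRIORITY}..{MAX_PRIORITY}"
        )
    return priority


def scheduling_order(jobs: List[JobSketch]) -> List[JobSketch]:
    """Order jobs that share one pool by descending priority.

    Jobs with equal priority keep their input order here only because
    Python's sort is stable; the Batch service doesn't define an order
    for them.
    """
    for job in jobs:
        validate_priority(job.priority)
    return sorted(jobs, key=lambda job: job.priority, reverse=True)
```

The sketch only orders jobs that are waiting to be scheduled. It doesn't model preemption, because tasks that are already running aren't preempted by higher-priority jobs.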
27
27
@@ -30,7 +30,7 @@ A job with a high-priority value running in one pool won't impact scheduling of
30
30
You can use job constraints to specify certain limits for your jobs:
31
31
32
32
- You can set a **maximum wallclock time**, so that if a job runs for longer than the maximum wallclock time that is specified, the job and all of its tasks are terminated.
33
-
- You can specify the **maximum number of task retries** as a constraint, including whether a task is always retried or never retried. Retrying a task means that if the task fails, it will be requeued to run again.
33
+
- You can specify the **maximum number of task retries** as a constraint, including whether a task is always retried or never retried. Retrying a task means that if the task fails, it's requeued to run again.
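As a rough illustration of the retry constraint, the following Python sketch decides whether a failed task would be requeued. It assumes the convention that a maximum retry count of `0` means "never retry" and `-1` means "always retry"; confirm this against the Batch API reference before relying on it. The function name `should_requeue` is hypothetical.

```python
def should_requeue(retries_used: int, max_task_retry_count: int) -> bool:
    """Decide whether a failed task gets another attempt.

    Assumed convention (verify against the Batch API reference):
      -1 -> retry without limit ("always retried")
       0 -> no retries ("never retried")
       n -> at most n retries after the first attempt
    """
    if max_task_retry_count < -1:
        raise ValueError("max_task_retry_count must be -1 or greater")
    if retries_used < 0:
        raise ValueError("retries_used must not be negative")
    if max_task_retry_count == -1:
        return True
    return retries_used < max_task_retry_count
```

With a limit of 3, for example, a task that has already been retried twice gets one more attempt, and a task that has been retried three times doesn't.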
34
34
35
35
### Job manager tasks and automatic termination
36
36
@@ -52,21 +52,21 @@ When you create a task, you can specify:
52
52
53
53
- The **command line** for the task. This is the command line that runs your application or script on the compute node.
54
54
55
-
It is important to note that the command line does not run under a shell. Therefore, it cannot natively take advantage of shell features like [environment variable](#environment-settings-for-tasks) expansion (this includes the `PATH`). To take advantage of such features, you must invoke the shell in the command line, such as by launching `cmd.exe` on Windows nodes or `/bin/sh` on Linux:
55
+
It's important to note that the command line doesn't run under a shell. Therefore, it can't natively take advantage of shell features like [environment variable](#environment-settings-for-tasks) expansion (this includes the `PATH`). To take advantage of such features, you must invoke the shell in the command line, such as by launching `cmd.exe` on Windows nodes or `/bin/sh` on Linux:
56
56
57
57
`cmd /c MyTaskApplication.exe %MY_ENV_VAR%`
58
58
59
59
`/bin/sh -c MyTaskApplication $MY_ENV_VAR`
60
60
61
-
If your tasks need to run an application or script that is not in the node's `PATH` or reference environment variables, invoke the shell explicitly in the task command line.
61
+
If your tasks need to run an application or script that isn't in the node's `PATH` or reference environment variables, invoke the shell explicitly in the task command line.
62
62
- **Resource files** that contain the data to be processed. These files are automatically copied to the node from Blob storage in an Azure Storage account before the task's command line is executed. For more information, see [Start task](#start-task) and [Files and directories](files-and-directories.md).
63
63
- The **environment variables** that are required by your application. For more information, see [Environment settings for tasks](#environment-settings-for-tasks).
64
64
- The **constraints** under which the task should execute. For example, constraints include the maximum time that the task is allowed to run, the maximum number of times a failed task should be retried, and the maximum time that files in the task's working directory are retained.
65
-
- **Application packages** to deploy to the compute node on which the task is scheduled to run. [Application packages](batch-application-packages.md) provide simplified deployment and versioning of the applications that your tasks run. Task-level application packages are especially useful in shared-pool environments, where different jobs are run on one pool, and the pool is not deleted when a job is completed. If your job has fewer tasks than nodes in the pool, task application packages can minimize data transfer since your application is deployed only to the nodes that run tasks.
65
+
- **Application packages** to deploy to the compute node on which the task is scheduled to run. [Application packages](batch-application-packages.md) provide simplified deployment and versioning of the applications that your tasks run. Task-level application packages are especially useful in shared-pool environments, where different jobs are run on one pool, and the pool isn't deleted when a job is completed. If your job has fewer tasks than nodes in the pool, task application packages can minimize data transfer since your application is deployed only to the nodes that run tasks.
66
66
- A **container image** reference in Docker Hub or a private registry and additional settings to create a Docker container in which the task runs on the node. You only specify this information if the pool is set up with a container configuration.
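Because the task command line doesn't run under a shell, client code often builds the shell invocation itself. The following Python sketch shows one way to do that. `wrap_in_shell` and its `os_family` parameter are hypothetical helpers, not part of any Batch SDK. On Linux, the sketch quotes the whole command so that `/bin/sh -c` receives it as a single string, and it assumes the node honors standard quoting when it parses the command line.

```python
import shlex


def wrap_in_shell(command: str, os_family: str) -> str:
    """Wrap a command so that a shell expands environment variables.

    os_family is a hypothetical parameter: "windows" or "linux".
    """
    if os_family == "windows":
        # cmd.exe expands %VAR% references in the rest of the line.
        return f"cmd /c {command}"
    if os_family == "linux":
        # Single-quote the command so it reaches sh as one argument;
        # sh then expands $VAR references when it runs the string.
        return f"/bin/sh -c {shlex.quote(command)}"
    raise ValueError(f"unsupported os_family: {os_family!r}")
```

For example, `wrap_in_shell("MyTaskApplication $MY_ENV_VAR", "linux")` produces `/bin/sh -c 'MyTaskApplication $MY_ENV_VAR'`, and the Windows variant prefixes the command with `cmd /c`.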
67
67
68
68
> [!NOTE]
69
-
> The maximum lifetime of a task, from when it is added to the job to when it completes, is 180 days. Completed tasks persist for 7 days; data for tasks not completed within the maximum lifetime is not accessible.
69
+
> The maximum lifetime of a task, from when it is added to the job to when it completes, is 180 days. Completed tasks persist for 7 days; data for tasks not completed within the maximum lifetime isn't accessible.
70
70
71
71
In addition to tasks you define to perform computation on a node, several special tasks are also provided by the Batch service:
72
72
@@ -88,7 +88,7 @@ However, the start task could also include reference data to be used by all task
88
88
89
89
Usually, you'll want the Batch service to wait for the start task to complete before considering the node ready to be assigned tasks. However, you can configure this differently as needed.
90
90
91
-
If a start task fails on a compute node, then the state of the node is updated to reflect the failure, and the node is not assigned any tasks. A start task can fail if there is an issue copying its resource files from storage, or if the process executed by its command line returns a nonzero exit code.
91
+
If a start task fails on a compute node, then the state of the node is updated to reflect the failure, and the node isn't assigned any tasks. A start task can fail if there's an issue copying its resource files from storage, or if the process executed by its command line returns a nonzero exit code.
92
92
93
93
If you add or update the start task for an existing pool, you must reboot its compute nodes for the start task to be applied to the nodes.
94
94
@@ -98,22 +98,22 @@ If you add or update the start task for an existing pool, you must reboot its co
98
98
> 1. You can use application packages to distribute applications or data across each node in your Batch pool. For more information about application packages, see [Deploy applications to compute nodes with Batch application packages](batch-application-packages.md).
99
99
> 2. You can manually create a zipped archive containing your applications files. Upload your zipped archive to Azure Storage as a blob. Specify the zipped archive as a resource file for your start task. Before you run the command line for your start task, unzip the archive from the command line.
100
100
>
101
-
> To unzip the archive, you can use the archiving tool of your choice. You will need to include the tool that you use to unzip the archive as a resource file for the start task.
101
+
> To unzip the archive, you can use the archiving tool of your choice. You need to include the tool that you use to unzip the archive as a resource file for the start task.
102
102
103
103
### Job manager task
104
104
105
105
You typically use a job manager task to control and/or monitor job execution. For example, job manager tasks are often used to create and submit the tasks for a job, determine additional tasks to run, and determine when work is complete.
106
106
107
-
However, a job manager task is not restricted to these activities. It is a full-fledged task that can perform any actions that are required for the job. For example, a job manager task might download a file that is specified as a parameter, analyze the contents of that file, and submit additional tasks based on those contents.
107
+
However, a job manager task isn't restricted to these activities. It's a full-fledged task that can perform any actions that are required for the job. For example, a job manager task might download a file that is specified as a parameter, analyze the contents of that file, and submit additional tasks based on those contents.
108
108
109
109
A job manager task is started before all other tasks. It provides the following features:
110
110
111
-
- It is automatically submitted as a task by the Batch service when the job is created.
112
-
- It is scheduled to execute before the other tasks in a job.
111
+
- It's automatically submitted as a task by the Batch service when the job is created.
112
+
- It's scheduled to execute before the other tasks in a job.
113
113
- Its associated node is the last to be removed from a pool when the pool is being downsized.
114
114
- Its termination can be tied to the termination of all tasks in the job.
115
-
- A job manager task is given the highest priority when it needs to be restarted. If an idle node is not available, the Batch service might terminate one of the other running tasks in the pool to make room for the job manager task to run.
116
-
- A job manager task in one job does not have priority over the tasks of other jobs. Across jobs, only job-level priorities are observed.
115
+
- A job manager task is given the highest priority when it needs to be restarted. If an idle node isn't available, the Batch service might terminate one of the other running tasks in the pool to make room for the job manager task to run.
116
+
- A job manager task in one job doesn't have priority over the tasks of other jobs. Across jobs, only job-level priorities are observed.
117
117
118
118
### Job preparation and release tasks
119
119
@@ -157,6 +157,12 @@ Your client application or service can obtain a task's environment variables, bo
157
157
158
158
You can find a list of all service-defined environment variables in [Compute node environment variables](batch-compute-node-environment-variables.md).
159
159
160
+
## Known limitations
161
+
162
+
- Task stuck in running state: The Batch service works together with compute nodes to manage the task lifecycle. When a task is scheduled to a compute node for execution, the compute node is responsible for updating the task's state from running all the way to completed. If a compute node is preempted or loses connectivity to the Batch service, its tasks stay in the running state until the Batch service gets a chance to reschedule them on another compute node. If there's no other compute node, these tasks might stay in the running state indefinitely. To determine whether a task is stuck in the running state, query the task and check whether its associated node is unusable or has been deleted from the pool.
163
+
- When a job is terminated, the Batch service terminates only the running tasks whose compute nodes are accessible. All existing active tasks, and running tasks on unusable nodes, remain in their current state.
164
+
- When a task is requeued (for example, because its node was preempted or because of a pool resize operation with the `Requeue` option), it's pushed back to the end of its job's queue. As a result, rescheduling of the task can be delayed when other active tasks are waiting in the same job.
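The check for stuck tasks described in the first limitation can be sketched in Python as follows. This is a minimal model that works on plain records rather than Batch SDK objects: `TaskRecord`, `find_stuck_tasks`, and the `node_states` mapping are hypothetical, and the state strings are assumed to match what your own queries return.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical snapshot of a task, built from your own queries."""
    task_id: str
    state: str              # for example "active", "running", "completed"
    node_id: Optional[str]  # node the task was last scheduled on


def find_stuck_tasks(
    tasks: List[TaskRecord], node_states: Dict[str, str]
) -> List[str]:
    """Return IDs of running tasks whose node is unusable or missing.

    node_states maps node ID to state for nodes currently in the pool;
    a node that was deleted from the pool has no entry.
    """
    stuck = []
    for task in tasks:
        if task.state != "running":
            continue
        # .get returns None for a deleted node (or a missing node ID).
        node_state = node_states.get(task.node_id)
        if node_state is None or node_state == "unusable":
            stuck.append(task.task_id)
    return stuck
```

A task flagged this way is only a candidate. Whether to wait for the Batch service to reschedule it or to intervene depends on whether the pool still has usable nodes.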
165
+
160
166
## Next steps
161
167
162
168
- Learn about [files and directories](files-and-directories.md).