Feature/add max new dagruns to schedule#64294

Draft
Nataneljpwd wants to merge 31 commits into
apache:mainfrom
Nataneljpwd:feature/add-max-new-dagruns-to-schedule

Conversation

@Nataneljpwd

@Nataneljpwd Nataneljpwd commented Mar 27, 2026

Contributor

When new DagRuns are created in bulk (e.g. with TriggerDagRunOperator), the scheduler can struggle with the volume and cause other DagRuns to starve.

This is due to the sort order in get_running_dag_runs_to_examine, which orders by last_scheduling_decision with nulls first. If a lot of new DagRuns are created, the scheduler examines them before any previously examined run. When the Dags have a lot of tasks (hundreds to tens of thousands), this can stall the scheduler, because it has to both examine a lot of DagRuns and create the new tasks for them.
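The effect of a nulls-first ordering can be sketched without a database. This is a minimal simulation, not the scheduler's query: the `Run` class, `pick_nulls_first`, the integer "ticks", and the batch sizes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    run_id: str
    # None stands in for a NULL last_scheduling_decision (never examined).
    last_scheduling_decision: Optional[int]


def pick_nulls_first(runs: list[Run], limit: int) -> list[Run]:
    """Order by last_scheduling_decision with NULLs first, then take `limit` rows."""
    ordered = sorted(
        runs,
        key=lambda r: (
            r.last_scheduling_decision is not None,  # False (NULL) sorts first
            r.last_scheduling_decision or 0,
        ),
    )
    return ordered[:limit]


# Three runs already in progress, last examined at ticks 1, 2 and 3.
existing = [Run(f"old_{i}", last_scheduling_decision=i) for i in range(1, 4)]
# A bulk trigger adds 50 runs that have never been examined.
bulk = [Run(f"new_{i}", last_scheduling_decision=None) for i in range(50)]

picked = pick_nulls_first(existing + bulk, limit=20)
picked_ids = [r.run_id for r in picked]
old_picked = sum(run_id.startswith("old_") for run_id in picked_ids)
print("old runs picked this loop:", old_picked)
```

With a per-loop limit of 20 and 50 never-examined runs, every slot in this loop goes to a new run, and the same happens in the next loop for as long as enough never-examined runs remain.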

When we tried to tune max_dagruns_per_loop_to_schedule, we got either starvation of other DagRuns or the scheduler being reset because it did not return a heartbeat for a long time and failed the readiness probe.

To fix this, a new configuration is added, max_new_dagruns_per_loop_to_schedule. It helps when a lot of new DagRuns are created in large batches at the same time: the scheduler can keep looking at existing DagRuns (so they are not starved and do not time out with no running or scheduled tasks) while still creating and managing the new DagRuns.
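One way to picture such a cap is as a split of the per-loop budget between never-examined and previously examined runs. The sketch below is an assumption about the general idea, not the code in this PR: `select_runs`, its parameters, and the choice to treat `max_new <= 0` as "no cap" are hypothetical.

```python
from typing import Optional

# (run_id, last_scheduling_decision); None means the run was never examined.
RunRow = tuple[str, Optional[int]]


def select_runs(rows: list[RunRow], max_total: int, max_new: int = 0) -> list[RunRow]:
    """Pick up to max_total runs, with at most max_new never-examined runs.

    Assumption for this sketch: max_new <= 0 disables the cap and falls back
    to a plain nulls-first selection.
    """
    new = [r for r in rows if r[1] is None]
    old = sorted((r for r in rows if r[1] is not None), key=lambda r: r[1])
    if max_new <= 0:
        return (new + old)[:max_total]
    new_part = new[: min(max_new, max_total)]
    # The rest of the budget goes to the least recently examined runs.
    old_part = old[: max_total - len(new_part)]
    return new_part + old_part


rows: list[RunRow] = [(f"old_{i}", i) for i in range(30)]
rows += [(f"new_{i}", None) for i in range(50)]

uncapped = select_runs(rows, max_total=20)
capped = select_runs(rows, max_total=20, max_new=5)
print(sum(r[1] is None for r in uncapped), "new runs without a cap")
print(sum(r[1] is None for r in capped), "new runs with max_new=5")
```

In this toy data set the uncapped selection is filled entirely by new runs, while the capped one takes 5 new runs and leaves 15 slots for runs that were already being scheduled.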

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
  • No


Important

🛠️ Maintainer triage note for @Nataneljpwd · by @potiuk · 2026-06-17 14:51 UTC

Helpful heads-up from the maintainers — please address before this PR can be reviewed:

  • Failing test jobs: Low dep tests:core / All-core:LowestDeps:14:3.10:Core...Serialization, MySQL tests: core / DB-core:MySQL:8.0:3.10:Core...Serialization, Postgres tests: core / DB-core:Postgres:14:3.10:Core...Serialization, Sqlite tests: core / DB-core:Sqlite:3.10:Core...Serialization. Reproduce and fix locally, then push.
  • See the Pull Request quality criteria.

The ball is in your court — you've been assigned to this PR. Fix the above, then mark it Ready for review.

Automated triage — may be imperfect; a maintainer takes the next look.

@Nataneljpwd Nataneljpwd marked this pull request as draft March 27, 2026 12:28
@Nataneljpwd Nataneljpwd marked this pull request as ready for review March 27, 2026 17:24

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a scheduler tuning knob to limit how many new (never-before-examined) running DagRuns are considered per scheduling loop, to reduce starvation/slowdown when large batches of DagRuns are created at once.

Changes:

  • Add scheduler.max_new_dagruns_per_loop_to_schedule config (default 0) and plumb it into DagRun selection.
  • Update DagRun.get_running_dag_runs_to_examine() to optionally split selection into “previously examined” vs “new” DagRuns.
  • Add/adjust unit tests to cover the new selection behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| airflow-core/src/airflow/models/dagrun.py | Adds config-backed limit and changes running DagRun selection logic to optionally fetch “old” and “new” runs separately. |
| airflow-core/src/airflow/config_templates/config.yml | Documents the new scheduler configuration option. |
| airflow-core/tests/unit/models/test_dagrun.py | Adds tests for the new DagRun selection behavior and updates an existing test to handle the new return type. |

Comment thread airflow-core/tests/unit/models/test_dagrun.py Outdated
Comment thread airflow-core/tests/unit/models/test_dagrun.py Outdated
Comment thread airflow-core/tests/unit/models/test_dagrun.py Outdated
Comment thread airflow-core/src/airflow/models/dagrun.py
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Apr 2, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread airflow-core/src/airflow/models/dagrun.py
Comment thread airflow-core/src/airflow/models/dagrun.py Outdated
Natanel Rudyuklakir and others added 2 commits April 15, 2026 22:04
@vatsrahul1001 vatsrahul1001 added this to the Airflow 3.3.0 milestone May 12, 2026
@potiuk potiuk removed the ready for maintainer review Set after triaging when all criteria pass. label May 18, 2026
@potiuk potiuk marked this pull request as draft May 18, 2026 10:48
@potiuk

potiuk commented May 18, 2026

Member

@Nataneljpwd — Removing the ready for maintainer review label and converting back to draft. The branch now has merge conflicts with main that surfaced after the label was added.

The label's contract is that the PR is ready for maintainer review — a regression like this means the PR temporarily isn't. Rebase your branch onto the latest main, resolve conflicts, then mark "Ready for review" again to re-enter the queue.

git fetch upstream main && git rebase upstream/main, resolve, git push --force-with-lease. See the working-with-git docs.

No rush.


Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you.

Natanel Rudyuklakir added 2 commits May 21, 2026 19:44
@eladkal eladkal marked this pull request as ready for review May 27, 2026 09:34
@eladkal eladkal requested a review from kaxil May 27, 2026 14:29
@potiuk

potiuk commented May 27, 2026

Member

@Nataneljpwd — There is 1 unresolved review thread on this PR from kaxil, and you have pushed commits since the review (most recently the rebase that cleared the merge conflict). Could you confirm whether you believe the feedback is fully addressed and the PR is ready for maintainer review confirmation?

If yes, please mark the thread as resolved and ping the reviewer (kaxil) for a final look. They will either label the PR ready for maintainer review or follow up with additional feedback.

If you are still working on the thread, please reply with what is outstanding so the thread stays unresolved on purpose.



@eladkal

eladkal commented Jun 1, 2026

Contributor

cc @kaxil waiting for 2nd review

@kaxil

kaxil commented Jun 1, 2026

Member

This one also needs a review from @ashb .

Also cc @BIS7 @ephraimbuddy -- who might be interested in reviewing it

@ashb ashb left a comment

Member

I'm not convinced that this is the right fix. Tuning and configuring the scheduler is already nigh on impossible, and I am wary of adding more.

Additionally, couldn't the existing max active runs controls be used here? That would keep most of the DagRuns in the Queued state, meaning the scheduler only looks at, at most, 16 (by default, I think) newly created runs. That massively reduces the impact of "cause the scheduler to stall, as it has to both examine a lot of dagruns, and create new tasks for those dagruns." because it doesn't do that work for queued runs. That is why DagRuns can exist in the queued state.

Did you try this existing tunable first?
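The gating described above can be illustrated with a small simulation. It is a sketch of the idea only, assuming a per-Dag limit of 16; `promote_queued` and the tuple layout are hypothetical and do not mirror the scheduler's implementation.

```python
from collections import Counter


def promote_queued(queued, running, max_active_runs=16):
    """Move queued runs to running until each Dag has max_active_runs running.

    queued and running are lists of (dag_id, run_id) tuples; queued is in
    creation order. Returns (still_queued, now_running).
    """
    active = Counter(dag_id for dag_id, _ in running)
    still_queued = []
    now_running = list(running)
    for dag_id, run_id in queued:
        if active[dag_id] < max_active_runs:
            active[dag_id] += 1
            now_running.append((dag_id, run_id))
        else:
            # Stays queued: no task instances are created for it yet.
            still_queued.append((dag_id, run_id))
    return still_queued, now_running


# A bulk trigger creates 300 runs of one Dag at once.
queued = [("bulk_dag", f"run_{i}") for i in range(300)]
still_queued, now_running = promote_queued(queued, running=[])
print(len(now_running), "running,", len(still_queued), "still queued")
```

Only the first 16 runs of the batch reach the running state here; the other 284 wait in the queue until a running slot frees up.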

@Nataneljpwd

Contributor Author

> I'm not convinced that this is the right fix. Tuning and configuring the scheduler is already nigh on impossible I am wary of adding more.

We have tried tuning both the scheduler count and max_dagruns_per_loop_to_schedule, and each time we hit a different issue, but I understand the concern. We run this change locally and it fixed our problem. The main problem is that Dag runs are created in batches on our clusters, sometimes very large batches, and a new DagRun is heavier to process than a running one: the scheduler has to create its tasks when it starts, whereas a running DagRun only occasionally (once tasks finish) creates new tasks. Meanwhile, other DagRuns are not moved to running because the scheduler is busy processing the large batch of new ones.

We tried increasing the scheduler count quite a bit, in addition to increasing max_dagruns_per_loop_to_schedule. That caused scheduler heartbeat timeouts, as we had a lot of runs with mapped tasks, so we also had to raise that configuration to a very large value (10 minutes). On top of that, DagRuns timed out because they were not being examined, and we even saw in the Gantt view that there were large pauses between tasks where no task existed. So DagRuns could cause other DagRuns to miss their SLA. Even if I only create a medium-sized backfill alongside my regular Dags, which include mapped tasks, I get one of the issues stated above when I increase the number of examined DagRuns.

> Additionally, couldn't the already existing max active runs controls be used here? That would keep most of the dagruns in the Queued state, meaning the scheduler only looks at at most 16( by default I think) newly created runs and massively reduces the impact of "cause the scheduler to stall, as it has to both examine a lot of dagruns, and create new tasks for those dagruns." as it doesn't do that. That is why DagRuns can exist in the queued state.

As stated above, we tried to tune it: we changed it to around 300 and even tripled the scheduler count, yet for both batch-triggered runs and large backfills we still experienced the issue. We even tried dividing the batch into several smaller ones (spread more evenly). The scheduler either accumulated a lot of queued DagRuns and never finished the batch, or, when it was able to finish the batch, it was reset quite often due to not emitting a heartbeat and failing the readiness probe, hitting an OOM, or other DagRuns timing out (which is why we didn't increase the number beyond 300).

> Did you try this existing tunable first?

As stated above, yes, we have. I am pretty sure we tried all related configurations, as I have gone over all of the scheduler configurations.

@ashb

ashb commented Jun 2, 2026

Member

No, not those. I mean the max_active_runs parameter to a dag

@Nataneljpwd

Contributor Author

> No, not those. I mean the max_active_runs parameter to a dag

That as well. When our clients changed it, we were unable to stay within the Dag's SLA, and more runs were created in a day than finished. It also happens when we have a lot of Dags (over 1000) in one Airflow instance, where we limit max active runs to 40 with a cluster policy, though most clients use the default of 16.

@ephraimbuddy ephraimbuddy left a comment

Contributor

This looks like a deployment-specific mitigation. Do you have a simple repro/benchmark showing that max_active_runs and the other existing scheduler knobs cannot solve this?

Also, as I read it, the starvation comes from the nulls_first(last_scheduling_decision) ordering: never-examined runs are always pulled to the front. Have you considered fixing the ordering itself instead? I think that would address the starvation without adding a new knob. Something like the below:

.order_by(
    nulls_first(cast("ColumnElement[Any]", BackfillDagRun.sort_ordinal), session=session),
    coalesce(cls.last_scheduling_decision, cls.run_after),
    cls.run_after,
)

Fair aging: never-examined runs are ordered by when they became eligible (run_after), not pulled ahead of everything. A run examined long ago still outranks one examined a second ago.
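The difference between the two orderings can be checked on a few rows in plain Python. This is an illustrative sketch only: the `Run` tuple, the integer timestamps, and both key functions are hypothetical, and the BackfillDagRun.sort_ordinal term from the snippet above is left out for brevity.

```python
from typing import NamedTuple, Optional


class Run(NamedTuple):
    run_id: str
    run_after: int
    last_scheduling_decision: Optional[int]  # None means never examined


def nulls_first_key(r: Run):
    """Current behaviour as described: never-examined runs sort first."""
    return (r.last_scheduling_decision is not None, r.last_scheduling_decision or 0, r.run_after)


def fair_aging_key(r: Run):
    """Mimics coalesce(last_scheduling_decision, run_after), then run_after."""
    effective = r.last_scheduling_decision if r.last_scheduling_decision is not None else r.run_after
    return (effective, r.run_after)


runs = [
    Run("old_stale", run_after=10, last_scheduling_decision=50),
    Run("old_fresh", run_after=20, last_scheduling_decision=99),
    Run("new_a", run_after=90, last_scheduling_decision=None),
    Run("new_b", run_after=95, last_scheduling_decision=None),
]

current = [r.run_id for r in sorted(runs, key=nulls_first_key)]
proposed = [r.run_id for r in sorted(runs, key=fair_aging_key)]
print("nulls first:", current)
print("fair aging: ", proposed)
```

Under the coalesce-based key, the run examined long ago (old_stale) moves ahead of the new runs, while the run examined most recently (old_fresh) stays behind them.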

@potiuk

potiuk commented Jun 9, 2026

Member

@Nataneljpwd A few things need addressing before review — see our Pull Request quality criteria.

  • Merge conflicts with main. See docs.

No rush.


8 participants