Drop and re-implement lockfile and python-daemon dependencies#65513
Drop and re-implement lockfile and python-daemon dependencies#65513jscheffl wants to merge 10 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR removes the deprecated lockfile and python-daemon dependencies from Airflow core by vendoring minimal PID-file and daemonization utilities into airflow.utils, and updates affected providers/tests to use the new implementations (with version-based fallbacks in providers).
Changes:
- Drop
lockfileandpython-daemonfromairflow-coredependencies and introduce vendored replacements (airflow.utils.pidfile,airflow.utils.daemon). - Switch core daemon CLI utilities and process PID checks to the vendored implementations, updating related unit tests.
- Update Celery and Edge3 providers to conditionally import PID-file helpers based on the running Airflow version.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| providers/edge3/src/airflow/providers/edge3/version_compat.py | Adds an Airflow 3.2.2+ compatibility constant used for conditional imports. |
| providers/edge3/src/airflow/providers/edge3/cli/worker.py | Uses vendored PID-file removal helper on Airflow 3.2.2+, falls back to lockfile otherwise. |
| providers/edge3/src/airflow/providers/edge3/cli/signalling.py | Uses vendored PID-file helpers on Airflow 3.2.2+, falls back to lockfile otherwise. |
| providers/celery/tests/unit/celery/cli/test_celery_command.py | Updates daemon-related mocks to patch DaemonContext directly. |
| providers/celery/src/airflow/providers/celery/version_compat.py | Adds an Airflow 3.2.2+ compatibility constant used for conditional imports. |
| providers/celery/src/airflow/providers/celery/cli/celery_command.py | Uses vendored PID-file helpers on Airflow 3.2.2+, falls back to lockfile otherwise. |
| airflow-core/tests/unit/utils/test_pidfile.py | Adds unit tests for the new vendored PID-file utilities. |
| airflow-core/tests/unit/utils/test_daemon.py | Adds unit tests for the new vendored daemon context implementation. |
| airflow-core/tests/unit/cli/commands/test_kerberos_command.py | Updates daemon-related mocks to patch DaemonContext. |
| airflow-core/tests/unit/cli/commands/test_api_server_command.py | Updates daemon-related mocks to patch DaemonContext. |
| airflow-core/src/airflow/utils/process_utils.py | Switches PID lock import from lockfile to vendored PIDLockFile. |
| airflow-core/src/airflow/utils/pidfile.py | New vendored PID-file utilities replacing deprecated lockfile. |
| airflow-core/src/airflow/utils/daemon.py | New vendored daemon context replacing python-daemon. |
| airflow-core/src/airflow/cli/commands/daemon_utils.py | Switches CLI daemonization to vendored DaemonContext and TimeoutPIDLockFile. |
| airflow-core/pyproject.toml | Removes lockfile and python-daemon from core dependencies. |
| airflow-core/newsfragments/65513.significant.rst | Documents dependency removal and provider compatibility implications. |
fccb1d4 to
3af8304
Compare
7690ba2 to
e80a44f
Compare
59d1047 to
0139159
Compare
There was a problem hiding this comment.
We (used) to have an explicit airflow._vendor folder, and vendoring the code was managed by the vendoring tool.
Is this an actual vendor, or just "the bits we need".
Regardless, I think we should probably add it's license to your NOTICE(?) file.
There was a problem hiding this comment.
I am not sure whether it was "plain vendoring-in", I asked Claude to make it. Did not compare the original package code and the resulting.
So in the PR I rather think it is a "logical vendoring-in" of the function selective to replace the dependency.
Do you have any pointer to a rule that helps me in fixing this?
There was a problem hiding this comment.
See airflow-core/src/airflow/_vendor/README.md
There was a problem hiding this comment.
Have checked what Claude made from the original code in:
(1) pidfile: https://opendev.org/openstack/pylockfile/src/commit/c8798cedfbc4d738c99977a07cde2de54687ac6c/lockfile/pidlockfile.py
(2) daemon: https://pagure.io/python-daemon/blob/main/f/src/daemon/daemon.py
It is actually not a vendoring-in but a re-implementation that replaces the functionality in own code. It is not a pure copy&paste but really an (AI assisted) re-implementation. Changed title of the PR therefore.
There was a problem hiding this comment.
Yeah. I think we will have less and less of vendori g I. Classic sense. It makes way more sense to only take what we need from a library (of course following license requirements) and keep it in our code. This is what everyon e predict will happen - either you have well maintained library that you can trust for security or maintenance - or you just take the parts you need and maintain in yourself (or rather with your agents) - it will be cheaper, faster, more secure - because you do not have to wait for them to release security fixed but you can scan and implement them yourself.
My prediction (and I will be working towarda that) is that Airflow core will go down to maybe few dozens of dependencies eventually).
There was a problem hiding this comment.
Adjusted NOTICE in 15dc5d0 - is this correct? Never made such.
There was a problem hiding this comment.
My prediction (and I will be working towarda that) is that Airflow core will go down to maybe few dozens of dependencies eventually).
Dear god I to to everything that this never comes to pass. It is so opposed to the open source ethos that brought me to Airflow in the first place that I cannot even begin to fathom it.
Dependencies are not the enemy; We are building on the shoulder of giants.
15dc5d0 to
04f4e4a
Compare
|
FYI - I would plan to push this now to Airflow 3.3.0 and not backport. If somebody wants to have this fixed in 3.2.2 let me know. But packages are deprecated since 5+ years, so I see no urgency as also there is no vulnerability known. No urgency making this into a patch release. |
No urgency for me. |
I agree |
|
Args, static checks fail in ... which totally makes sense. (Same question posted in Slack for maintainer feedback in https://apache-airflow.slack.com/archives/C06K9Q5G2UA/p1777916468610069) |
04f4e4a to
fda2b45
Compare
There was a problem hiding this comment.
Is there an actual problem with the versions released on pypi? Do they not work, or issue deprecation warnings in code? Is someone threatening to remove them from PyPI?
Until one of these comes to pass, Code that Works and is Released should continue to be used, there is no harm in using 11 year old code if it works. It doesn't matter if it is unmaintained if it doesn't need changes.
(Non blocking, but very strongly held) -1 we do not need to do this. lockfile works, just use it.
If we really want to not have these two as deps (again, I don't see the problem as they have been working for years, and aren't broken right now) then I think we should _vendor them as is, not reimplement it. It makes it clearer that "we are carrying this code" vs "we have written this code ourselves". And the less code we can write the better.
I think we have to definitely have a discussion about it in devlist. This is a good call to be clear about some criteria that we apply and I will definitely lead a discussion about this very soon. |
fda2b45 to
ed8f8e2
Compare
|
Marking DRAFT until the discussion (and dust) settled. |
As the python library lockfile is deprecated as well as the currently used dependency "python-daemon" since years still carries the same transitive dependency (has not fixed the issue since 5+ years!) in https://pagure.io/python-daemon/issue/42 even though there is a PR open since 3+ years (!) in https://pagure.io/python-daemon/pull-request/81 in this PR we pragmatically re-implement the logic - it is each less than 500 LoC and cutting off the dependency is easier assuming that PRs not fixed since years is like a dead horse.
Most problematic both Celery and Edge provider packages have a dependency as well including daemon tooling in Airflow core. So this PR also adjusts providers. Known limitation:
Marking with milestone 3.2.2 assuming the core parts of the PR will be back-ported.
Was generative AI tooling used to co-author this PR?
Claude Opus 4.6
{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.