Skip to content

CLOS 4333: Correct issues detected in no-auth migration testing#59

Open
prilr wants to merge 8 commits intocloudlinux:cloudlinuxfrom
prilr:CLOS-4333
Open

CLOS 4333: Correct issues detected in no-auth migration testing#59
prilr wants to merge 8 commits intocloudlinux:cloudlinuxfrom
prilr:CLOS-4333

Conversation

@prilr
Copy link
Copy Markdown
Collaborator

@prilr prilr commented Apr 30, 2026

Several previously introduced modifications caused failures in end-to-end tests.

This PR addresses them in preparation for the next release.

Intended to be a follow-up to #58 and is based on it.

prilr and others added 8 commits April 27, 2026 12:06
Systems migrated to the no-auth (SWNG) scheme no longer have CLN as a
package source. Several CL-specific actors assumed CLN was always active
and either crashed on missing files or produced spurious inhibitors.

Add cln_detect.is_cln_configured() — True when the CLN plumbing is
present and not explicitly disabled (registration file exists + spacewalk
plugin installed + plugin enabled). Gate these actors on it:

- switch_cln_channel: skip the cln-switch-channel call on no-auth systems.
  Also downgrade the failed-switch inhibitor to a MEDIUM report, since a
  failure on a transitional system where CLN plumbing lingers but is no
  longer usable should not block the upgrade — CL9 packages come from
  cl-channel / cloudlinux9-baseos instead.
- pin_cln_mirror / unpin_cln_mirror: no-op on no-auth systems; also wrap
  the up2date update in try/except in pin_cln_mirror in case the file was
  not shipped on the target.
- check_rhn_version_override / reset_rhn_version_override: skip on no-auth
  systems and fall back cleanly when /etc/sysconfig/rhn/up2date is
  missing. reset_rhn_version_override: also fix a pre-existing bug where
  rebinding  inside the loop did not update config_data, so the
  reset was silently a no-op.

enable_yum_spacewalk_plugin is not touched here — CLOS-3960 has concurrent
work on it.
The first commit on this branch named the helper is_cln_configured()
and the actor comments said things like 'CLN is not configured here'.
That conflated two separate concerns and was misleading: CLN registration
is still in use on no-auth (SWNG) systems for licensing and inventory —
what the no-auth migration changes is only the package channel (the
spacewalk DNF plugin no longer delivers packages from a CLN channel).

Rename:

  is_cln_configured()  ->  is_cln_package_channel_active()

so the function name reflects what the actors actually need to gate on,
and rewrite its docstring + module header to spell out the package-channel
vs registration distinction.

Update each of the five actor comments accordingly. The detection logic
itself is unchanged — registration state plus a non-disabled spacewalk
plugin remains the right heuristic for 'CLN is delivering packages'.
Tests updated to match the new name and to phrase scenarios in terms of
the channel rather than 'CLN configured'.
The previous two commits introduced em-dashes (U+2014) in docstrings and
inline comments. The upstream make lint target now greps for any
non-ASCII byte (commit 92aee84) because Python 2.7 source files reject
non-ASCII without an encoding declaration, and the leapp framework still
supports running on 2.7 in places. Replace each em-dash with an ASCII
hyphen so the files lint clean against that gate.
The earlier patch passed `aliases=['non-interactive']` to @command_opt,
which threads through to leapp framework's `add_option`. In 0.18.0 that
function does not accept an `aliases` kwarg, so leapp loading failed with
`add_option() got an unexpected keyword argument 'aliases'` on every CLI
invocation - including `leapp preupgrade` and `leapp upgrade`.

argparse natively supports multiple long names per option; we just need to
get them past add_option. Add a local `_command_opt_with_aliases` helper
that wraps the framework's `_add_opt` directly, so both --nowarn and
--non-interactive land on the same argparse argument (dest=nowarn). No
framework change required, no consumer changes (args.nowarn keeps working).

Verified on a CL8 source VM with leapp-0.18.0-2.el8: `leapp upgrade --help`
now prints both names as a single option and parses --non-interactive.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…e bdb_ro warnings"

This reverts commit be4ece2.

The new call site sat inside _prepare_perform, which mounts an overlayfs at
/installroot. rpmdb --rebuilddb does an atomic directory rename to swap the
old database with the freshly-converted (bdb -> sqlite) one. Across overlay
layers that rename returns EXDEV, so rpm prints

    error: failed to replace old database with new database!
    error: replace files in /var/lib/rpm with files from /var/lib/rpmrebuilddb.1 to recover

and the dnf_transaction_check actor crashes with exit code 1, blocking every
CL8 -> CL9 preupgrade.

The two existing _rebuild_rpm_db call sites (perform_transaction_install,
install_initramdisk_requirements) use _prepare_transaction with binds=['/:/installroot'],
so the rename stays on a single filesystem and works. Those are the load-bearing
rebuilds for the actual upgrade. Dropping the new preupgrade-time call only
restores the harmless bdb_ro warnings; the upgrade transaction itself is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The Actor base class in leapp framework 0.18+ runs

    self.config = retrieve_config(self.config_schemas)

in __init__, which clobbers any class-level `config` attribute the actor
defined. Our class-level `config = '/etc/dnf/plugins/spacewalk.conf'` was
silently turned into the runtime config dict, so the FirstBoot phase crashed
with `TypeError: stat: path should be string, ..., not dict` on every CL
upgrade (CLOS-3960 introduced the assignment, but the rename to DEFAULT_CONFIG_PATH
in CLOS-3960's library didn't touch the actor's class attribute name).

Rename to CONFIG_PATH so the framework leaves it alone. _enable_plugin()
already silently no-ops when the file is missing, so no_auth systems
(where the spacewalk plugin is removed by rhn-client-tools >= 3.0.1)
keep skipping cleanly.

Verified live on a CL9 machine: patched actor loads,
process() reaches _enable_plugin with the string path, and silently
returns False/None when the plugin config is absent.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant