Is this a critical security issue?
Describe the Bug
The ca_refresh_interval setting is not respected in practice. Agents attempt
to refresh the CA certificate on every run instead of at the configured
interval (default: 24 hours), resulting in an unnecessary HTTP request to the
CA endpoint per run.
The analogous crl_refresh_interval logic contains the same structural issue
but does not manifest in practice (see "Why CRL is not affected" below).
Actual Behavior
The agent attempts to refresh the CA certificate on every agent run. Debug
logs show "Refreshing CA certificate" / "CA certificate is unmodified"
appearing on every run, not at the configured interval.
The mtime on $certdir/ca.pem is never updated after the initial download,
so once that mtime is older than the configured interval, needs_refresh?
evaluates to true on every run.
Expected Behavior
With ca_refresh_interval = 1d, the agent should attempt to refresh the CA
certificate approximately once per 24 hours.
Steps to Reproduce
- Configure an agent with the default ca_refresh_interval (1d)
- Let the agent run and successfully download the CA
- Trigger subsequent agent runs at any cadence
- Observe:
  - puppet agent -t --debug 2>&1 | grep -E "Refreshing CA|unmodified" prints on every run, not once per 24 hours
  - stat $(puppet config print localcacert) shows mtime is not updated
Environment
Version 8.26.2
Platform Rocky Linux release 9.7 (Blue Onyx)
Additional Context
Root Cause
In lib/puppet/ssl/state_machine.rb, NeedCACerts#refresh_ca (lines 111-135)
updates @cert_provider.ca_last_update only on the success (HTTP 200) path.
When the server responds 304 Not Modified, download_ca raises
Puppet::HTTP::ResponseError. The rescue handler logs "CA certificate is
unmodified" but does not update ca_last_update:
def refresh_ca(ssl_ctx, last_update)
  Puppet.info(_("Refreshing CA certificate"))
  next_ctx = [download_ca(ssl_ctx, last_update), true]
  @cert_provider.ca_last_update = Time.now # only reached on HTTP 200
  next_ctx
rescue Puppet::HTTP::ResponseError => e
  if e.response.code == 304
    Puppet.info(_("CA certificate is unmodified, using existing CA certificate"))
    # ca_last_update is not updated here
  else
    Puppet.info(_("Failed to refresh CA certificate..."))
  end
  [ssl_ctx, false]
end
Since ca_last_update is backed by the mtime of ca.pem
(lib/puppet/x509/cert_provider.rb:159-173), and neither save_cacerts nor
ca_last_update= runs on the 304 path, the mtime stays at whatever it was
after the initial download. On the next run, needs_refresh? sees a
stale mtime and schedules another refresh, and this repeats on every
subsequent run.
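The loop described above can be reproduced outside Puppet with a small simulation. This is only a sketch under stated assumptions: needs_refresh? here is a simplified stand-in for the agent's TTL check, not Puppet's actual implementation, and refresh_attempts_without_touch and stale_ca_attempts are hypothetical helper names. A temporary file plays the role of ca.pem, and the simulated server always answers 304.

```ruby
require "tempfile"

# Simplified stand-in for the TTL check (an assumption, not Puppet's code):
# refresh when the recorded last-update time is older than the interval.
def needs_refresh?(now, last_update, interval)
  last_update.nil? || now.to_i > last_update.to_i + interval
end

# Simulates `runs` agent runs where every refresh gets a 304 and, as in the
# current behaviour, nothing touches the file. Returns the attempt count.
def refresh_attempts_without_touch(ca_path, runs, interval)
  attempts = 0
  runs.times do
    attempts += 1 if needs_refresh?(Time.now, File.mtime(ca_path), interval)
    # 304 path: no file write and no timestamp update, so the mtime stays stale.
  end
  attempts
end

# Creates a throwaway "CA file" whose mtime is two intervals in the past
# and counts how many of `runs` runs would attempt a refresh.
def stale_ca_attempts(runs, interval = 24 * 60 * 60)
  Tempfile.create("ca.pem") do |f|
    old = Time.now - 2 * interval
    File.utime(old, old, f.path)
    refresh_attempts_without_touch(f.path, runs, interval)
  end
end

puts stale_ca_attempts(5) # prints 5: every run attempts a refresh
```

Because the mtime never moves on the 304 path, the attempt count equals the number of runs regardless of how closely the runs are spaced.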
Why CRL is Not Affected in Practice
NeedCRLs#refresh_crl (lines 209-233) has the identical structural issue,
but in typical deployments the server returns HTTP 200 for CRL requests
often enough that save_crls writes the file and bumps the mtime naturally.
For the CA certificate, which rarely changes, 304 is the steady-state
response and the bug is visible.
Suggested Fix
On HTTP 304, update the last-update timestamp. A 304 is positive
confirmation that the local copy is current, so the TTL should be reset:
rescue Puppet::HTTP::ResponseError => e
  if e.response.code == 304
    Puppet.info(_("CA certificate is unmodified, using existing CA certificate"))
    @cert_provider.ca_last_update = Time.now
  else
    Puppet.info(_("Failed to refresh CA certificate, using existing CA certificate: %{message}") % { message: e.message })
  end
  [ssl_ctx, false]
end
The same fix applies to refresh_crl for correctness, even though the
symptom is latent there.
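The effect of resetting the timestamp on 304 can be checked with a self-contained simulation. Everything in it is hypothetical and only mirrors the shape of the proposal: FakeCertProvider stores the last-update time as a file mtime (as the report says ca_last_update does), NotModified stands in for a 304 ResponseError, and this refresh_ca is a toy, not the method in state_machine.rb.

```ruby
require "tempfile"

# Hypothetical stand-in for the cert provider: the last-update time is
# simply the mtime of the CA file.
class FakeCertProvider
  def initialize(path)
    @path = path
  end

  def ca_last_update
    File.mtime(@path)
  end

  def ca_last_update=(time)
    File.utime(time, time, @path)
  end
end

# Hypothetical error standing in for an HTTP 304 response.
class NotModified < StandardError; end

# Toy refresh: the simulated server always answers 304.
def refresh_ca(provider)
  raise NotModified
rescue NotModified
  # Proposed behaviour: a 304 confirms the local copy is current,
  # so the last-update timestamp is reset.
  provider.ca_last_update = Time.now
end

# Counts refresh attempts over `runs` back-to-back runs, starting from a
# CA file whose mtime is two intervals old.
def attempts_with_fix(runs, interval = 24 * 60 * 60)
  Tempfile.create("ca.pem") do |f|
    old = Time.now - 2 * interval
    File.utime(old, old, f.path)
    provider = FakeCertProvider.new(f.path)
    attempts = 0
    runs.times do
      next unless Time.now.to_i > provider.ca_last_update.to_i + interval

      attempts += 1
      refresh_ca(provider)
    end
    attempts
  end
end

puts attempts_with_fix(5) # prints 1: only the first run attempts a refresh
```

With the timestamp reset on 304, only the first run sees a stale mtime; later runs within the interval skip the request.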
Disclosure: initial analysis of this issue was done with AI assistance
(Claude). The code references, reproduction steps, and root cause have been
verified against the source at the paths cited above.
Relevant log output