Puppet cert clean not working
A Redditor provided a detailed step-by-step that seems to have resolved the issue. The three key things were:
- Make sure that the clocks are in sync (they were)
- Make sure that the puppet service is stopped (it wasn't)
- Invoke puppet agent with an explicitly defined target for the master, e.g.
puppet agent -t --server master1.example.com
That combination of things got us past the cert issue.
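For anyone who wants to script those three steps, here is a minimal sketch, not a definitive procedure. It assumes a SysV-style `service puppet stop` works on the agent (on a systemd host the equivalent would be `systemctl stop puppet`), and the helper names `run` and `run_steps` are made up for illustration. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch of the three steps from the answer. Assumptions: the master's
# hostname is the example one from this thread, and the agent runs as a
# service named "puppet". run and run_steps are illustrative helper names.
MASTER="${MASTER:-master1.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Echo the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run_steps() {
    # $1 = fully qualified name of the puppet master
    # 1. Show the local UTC time so it can be compared with the master's clock.
    run date -u
    # 2. Stop the background agent so it does not interfere with the manual run.
    run service puppet stop
    # 3. Run the agent once against an explicitly named master.
    run puppet agent -t --server "$1"
}

run_steps "$MASTER"
```

Run it once in the default dry-run mode to see the commands, then again as root with DRY_RUN=0 on the agent.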
Glenn Lasher
UNIX System Engineer for Excelsior College in Albany, NY. Linux user for 20+ years.
Updated on September 18, 2022
Comments
-
Glenn Lasher over 1 year
So, I am standing up a new server to replace an existing one. Should be easy, right? Revoke the old cert, create a new one and off you go. Here's the loop I am stuck in:
I've redacted the server names, cert fingerprint and domain. The servers shown below are:
- Slave1 -- The machine that will be the partner of the one that is having issues. It is only mentioned below to prove one of the details.
- Slave2 -- The machine that is giving me issues.
- Master1 -- The puppet master (obviously)
On new build
[root@slave2 ~]# puppet agent -t
Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
Certificate fingerprint: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:2F:F1
To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certficate.
On the master:
  puppet cert clean slave2.example.com
On the agent:
  rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
  puppet agent -t
Exiting; failed to retrieve certificate and waitforcert is disabled
Okay, that's predictable and fully expected because this is a new server using an old name. Now on the master:
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
Note that there's nothing about the key files getting removed. This is because they are not there. Proof:
[root@master1 ~]# ls /var/lib/puppet/ssl/ca/signed/slave1.example.com.pem
/var/lib/puppet/ssl/ca/signed/slave1.example.com.pem
[root@master1 ~]# ls /var/lib/puppet/ssl/ca/signed/slave2.example.com.pem
ls: cannot access /var/lib/puppet/ssl/ca/signed/slave2.example.com.pem: No such file or directory
Okay, good. Now go back to the slave to complete the procedure by removing the .pem file and running puppet agent again:
[root@slave2 ~]# rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
[root@slave2 ~]# puppet agent -t
Info: Caching certificate for slave2.example.com
Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
Certificate fingerprint: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:2F:F1
To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certficate.
On the master:
  puppet cert clean slave2.example.com
On the agent:
  rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
  puppet agent -t
Exiting; failed to retrieve certificate and waitforcert is disabled
...and we are right back where we started with no change in outcome.
One last sanity check:
[root@master1 ~]# puppet cert list -a | grep -i slave2
...and there are no matches.
What am I doing wrong?
Addendum:
I'm inclined to believe that it is on the master, but not sure exactly how. Here's why:
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
[root@master1 ~]# puppet cert clean slave2.example.com
Notice: Revoked certificate with serial 154
Shouldn't that fail after the first time, because of the cert no longer being there?
-
c4f4t0r over 4 years
try find /var/lib/puppet/ssl/ -type f -name '*slave2*' on master and on slave too
-
Glenn Lasher over 4 years
@c4f4t0r Thanks for the suggestion. There were no results on master1. On slave2, I found the certs, but I have already (at someone else's suggestion) tried deleting that whole file tree (/var/lib/puppet/ssl) on slave2 to make it start over. It did not get me anywhere.
-
Glenn Lasher over 4 years
Thank you for this suggestion. I did try this already yesterday, however, I ran it again just now to confirm. The response was "Nothing was deleted".
-
Aditya Pednekar over 4 years
More things to try:
# On client (get the proper certname from the client, just to be sure, using the command below):
puppet config print certname
# On master:
puppet node deactivate <certname>
puppet node purge <certname>
Also make sure the certname is set in the puppet config as
certname = slave2.example.com
and then try running puppet:
puppet agent -tv
Just some basics that are easy to miss while cloning machines. :)
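The certname check from this comment could be scripted roughly as follows. This is only a sketch: `check_certname` is a made-up helper, the expected name is the redacted one used in this thread, and the lookup is skipped when the puppet command is not installed.

```shell
#!/bin/sh
# Sketch: compare the agent's configured certname with the name we expect.
# check_certname is a hypothetical helper; EXPECTED is the redacted hostname
# from this thread.
EXPECTED="${EXPECTED:-slave2.example.com}"

check_certname() {
    # $1 = expected certname, $2 = certname reported by the agent
    if [ "$1" = "$2" ]; then
        echo "certname OK: $2"
    else
        echo "certname mismatch: expected $1, got $2"
        return 1
    fi
}

# Only query puppet when it is installed on this machine.
if command -v puppet >/dev/null 2>&1; then
    check_certname "$EXPECTED" "$(puppet config print certname)" || true
fi
```

A mismatch here would suggest the cloned machine is still presenting its old name to the master.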