Puppet cert clean not working


A Redditor provided a detailed step-by-step procedure that seems to have resolved the issue. The three key points were:

  • Make sure that the clocks on the agent and the master are in sync (they were).
  • Make sure that the puppet service is stopped on the agent (it wasn't).
  • Invoke puppet agent with an explicitly defined master, e.g.
    puppet agent -t --server master1.example.com

That combination got us past the cert issue.
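
The three steps can be strung together roughly as follows. This is only a sketch, not the Redditor's exact procedure: the `run` and `retry_agent` helpers and the `DRY_RUN` switch are made up for illustration, the master name is the redacted placeholder from the question, and `service puppet stop` assumes the agent runs as a SysV-style service named `puppet` (use the equivalent for your init system).

```shell
#!/bin/sh
# Sketch of the three steps from the answer.
# DRY_RUN defaults to 1, so the commands are only printed; set DRY_RUN=0
# to really run them (as root, on the agent).
# MASTER is a placeholder; substitute your own master's FQDN.
MASTER="${MASTER:-master1.example.com}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

retry_agent() {
    # 1. Show the local time in UTC so it can be compared with the master's.
    run date -u
    # 2. Stop the background agent so it does not interfere with the manual run.
    #    (Assumption: the service is called "puppet".)
    run service puppet stop
    # 3. One foreground run against an explicitly named master.
    run puppet agent -t --server "$MASTER"
}
```

Calling `retry_agent` with the defaults only prints the three commands, which makes it easy to review them before running anything for real.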


Author: Glenn Lasher

UNIX System Engineer for Excelsior College in Albany, NY. Linux user for 20+ years.

Updated on September 18, 2022

Comments

  • Glenn Lasher over 1 year

    So, I am standing up a new server to replace an existing one. Should be easy, right? Revoke the old cert, create a new one and off you go. Here's the loop I am stuck in:

    I've redacted the server names, cert fingerprint and domain. The servers shown below are:

    • Slave1 -- The machine that will be the partner of the one that is having issues. It is only mentioned below to prove one of the details.
    • Slave2 -- The machine that is giving me issues.
    • Master1 -- The puppet master (obviously)

    On new build

    [root@slave2 ~]# puppet agent -t
    Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
    Certificate fingerprint: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:2F:F1
    To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certficate.
    On the master:
      puppet cert clean slave2.example.com
    On the agent:
      rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
      puppet agent -t
    
    Exiting; failed to retrieve certificate and waitforcert is disabled
    

    Okay, that's predictable and fully expected because this is a new server using an old name. Now on the master:

    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    

    Note that there's nothing about the key files getting removed. This is because they are not there. Proof:

    [root@master1 ~]# ls /var/lib/puppet/ssl/ca/signed/slave1.example.com.pem
    /var/lib/puppet/ssl/ca/signed/slave1.example.com.pem
    [root@master1 ~]# ls /var/lib/puppet/ssl/ca/signed/slave2.example.com.pem
    ls: cannot access /var/lib/puppet/ssl/ca/signed/slave2.example.com.pem: No such file or directory
    

    Okay, good. Now go back to the slave to complete the procedure by removing the .pem file and running puppet agent again:

    [root@slave2 ~]# rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
    [root@slave2 ~]# puppet agent -t
    Info: Caching certificate for slave2.example.com
    Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
    Certificate fingerprint: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:2F:F1
    To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certficate.
    On the master:
      puppet cert clean slave2.example.com
    On the agent:
      rm -f /var/lib/puppet/ssl/certs/slave2.example.com.pem
      puppet agent -t
    
    Exiting; failed to retrieve certificate and waitforcert is disabled
    

    ...and we are right back where we started with no change in outcome.

    One last sanity check:

    [root@master1 ~]# puppet cert list -a | grep -i slave2
    

    ...and there are no matches.

    What am I doing wrong?

    Addendum:

    I'm inclined to believe that it is on the master, but not sure exactly how. Here's why:

    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    [root@master1 ~]# puppet cert clean slave2.example.com
    Notice: Revoked certificate with serial 154
    

    Shouldn't that fail after the first time, because of the cert no longer being there?
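
A possible explanation, which is an assumption and is not confirmed anywhere in this thread: when the signed certificate file is missing, `puppet cert clean` may look the serial up in the CA's inventory file instead, so the same serial is revoked again on every run. The sketch below lists the serials recorded for a certname. The `inventory_serials` helper is hypothetical, and both the default path (`/var/lib/puppet/ssl/ca/inventory.txt`) and the line layout described in the comments are assumptions to check against your own master.

```shell
#!/bin/sh
# List the serials recorded for a certname in a Puppet CA inventory file.
# Assumed entry layout (verify against your own file):
#   0x009a 2013-01-01T00:00:00GMT 2018-01-01T00:00:00GMT /CN=slave2.example.com
# i.e. hex serial, validity dates, then the subject as the last field.
inventory_serials() {
    certname="$1"
    # Assumed default location, derived from the ssl directory in the question.
    inventory="${2:-/var/lib/puppet/ssl/ca/inventory.txt}"
    # Print the first field of every line whose last field is /CN=<certname>.
    awk -v subject="/CN=$certname" '$NF == subject { print $1 }' "$inventory"
}

# Example on the master:
#   inventory_serials slave2.example.com
```

Serials in that layout are hexadecimal, while the `Notice:` line shows decimal; `printf '%d\n' 0x009a` prints 154, for instance. More than one serial for the same name would mean the name has been signed more than once.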

    • c4f4t0r over 4 years
      try find /var/lib/puppet/ssl/ -type f -name '*slave2*' on master and on slave too
    • Glenn Lasher over 4 years
      @c4f4t0r Thanks for the suggestion. There were no results on master1. On slave2, I found the certs, but I have already (at someone else's suggestion) tried deleting that whole file tree (/var/lib/puppet/ssl) on slave2 to make it start over. It did not get me anywhere.
  • Glenn Lasher over 4 years
    Thank you for this suggestion. I already tried this yesterday, but I ran it again just now to confirm. The response was "Nothing was deleted".
  • Aditya Pednekar
    Aditya Pednekar over 4 years
    More things to try:

    # On the client (get the proper certname, just to be sure):
    puppet config print certname

    # On the master:
    puppet node deactivate <certname>
    puppet node purge <certname>

    Also make sure the certname is set in the Puppet config as certname = slave2.example.com, and then try running Puppet: puppet agent -tv. These are basics that are easy to miss when cloning machines. :)
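
The certname check in this comment can be scripted along these lines. This is a minimal sketch: `certname_matches` is a hypothetical helper, and the expected name is the redacted placeholder from the question.

```shell
#!/bin/sh
# Compare the agent's configured certname with the name the master expects.
# Both values are passed in, so the function itself does not need puppet.
certname_matches() {
    expected="$1"
    actual="$2"
    if [ "$actual" = "$expected" ]; then
        echo "certname OK: $actual"
    else
        echo "certname mismatch: agent has '$actual', expected '$expected'"
        return 1
    fi
}

# On the agent (requires puppet on the PATH):
#   certname_matches slave2.example.com "$(puppet config print certname)"
```

A mismatch here would suggest that the agent is requesting a certificate under a different name than the one being cleaned on the master, which is easy to overlook on a cloned machine.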