ipa users cannot sudo on some machines only, including the ipa server

This seems to be related to the following bug: https://fedorahosted.org/sssd/ticket/1108

and

https://bugzilla.redhat.com/show_bug.cgi?id=1256849

The suggestion there to enable the compat tree seems to have done the trick, after allowing about 30 minutes for the sssd cache to expire across the network.

In short, you need to ensure the compat tree is enabled in IPA for sssd to cache the sudo rules correctly. I had the compat tree turned off on the one 'broken' IPA server, and whenever clients talked to that particular server (chosen via the DNS SRV record) they did not cache any sudo rules. This showed up as machines sometimes letting users sudo and sometimes not. Because the IPA server itself doesn't use the SRV record but always talks to itself, sudo on the IPA server was always broken.

Running 'ipa-compat-manage enable' on the ipa server and waiting for the sssd cache to expire seems to have fixed the problem.
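
The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the `run` wrapper and the `DRY_RUN` variable are hypothetical names introduced here, `ipactl restart` is assumed to be an acceptable way to restart Directory Server after enabling the plugin, and `sss_cache -E` is used to invalidate cached entries rather than waiting for them to expire (the original fix simply waited). `ipa-compat-manage` will prompt for the Directory Manager password.

```shell
#!/bin/sh
# Hypothetical helper: print each command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the IPA server: check and enable the compat tree, then restart the IPA
# services so that Directory Server picks up the change (assumption: a full
# ipactl restart is acceptable in your environment).
enable_compat_tree() {
    run ipa-compat-manage status
    run ipa-compat-manage enable
    run ipactl restart
}

# On each affected machine (including the IPA server itself): invalidate the
# sssd cache instead of waiting for it to expire, then restart sssd.
refresh_sssd_cache() {
    run sss_cache -E
    run systemctl restart sssd
}
```

Running `DRY_RUN=1` followed by `enable_compat_tree` and `refresh_sssd_cache` only prints the commands, which is a cheap way to review them before running them as root.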

Sirex
Author by

Sirex

A self-loathing despair squid.

Updated on September 18, 2022

Comments

  • Sirex
    Sirex almost 2 years

    I'm having trouble with FreeIPA on a few machines. It's been very frustrating to debug so far. Here are the details of the issue:

    How it manifests:

    The user can login just fine to any host, but on some hosts they can't run sudo commands.

    What I know:

    There's an IPA sudo policy which is 'allow this user to run any command on any host', and also an HBAC policy of 'allow this user to use any service on any host', so I think I can rule out IPA policy being a problem.

    This only seems to affect machines when they contact one particular IPA server (via the DNS SRV record). I determined this with tcpdump, after flushing the cache with sss_cache and running sudo -k. One of the machines in question is actually that IPA server itself, so I've ruled out networking/firewalls as the cause. I'm pretty certain it's limited to just that IPA server and the clients using it.
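
As a lighter-weight complement to tcpdump, the candidate servers a client may pick can be listed from the SRV records. A minimal sketch, assuming `dig` is installed and that the LDAP SRV record follows the usual `_ldap._tcp.<domain>` naming; `srv_targets` is a hypothetical helper name, and `domain.com` is the sanitized domain used in this question.

```shell
#!/bin/sh
# Extract "host port" pairs from `dig +short -t SRV` output, whose lines
# have the form: "<priority> <weight> <port> <target>."
srv_targets() {
    awk 'NF == 4 { sub(/\.$/, "", $4); print $4, $3 }'
}

# Hypothetical usage:
#   dig +short -t SRV _ldap._tcp.domain.com | srv_targets
```

The sssd domain log also records which server was actually selected, in the "Marking server '...' as 'working'" line.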

    Focusing on just that IPA server itself and comparing it to one of my other IPA servers, the sudo.conf, sudoers and sssd.conf files are identical (apart from debugging being added to the broken one). Both have their LAN IP in /etc/hosts, and both use ntpd (which I think rules out Kerberos timing issues). Aside from turning on debugging, the sssd.conf and sudo.conf files are untouched from a clean install.

    The broken IPA server was the first one installed, so it's the master CA etc.

    Sudo on the machine in question (I'm focusing on the broken IPA server itself for simplicity) does work for users which are defined locally in /etc/sudoers, /etc/passwd, etc.

    Details:

    All the machines are using CentOS 7 and IPA 4.2.0.

    Logs: (domain name and user sanitized)

    =-=-=- from end of sssd logs on server1 =-=-=-
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [set_server_common_status] (0x0100): Marking server 'server1.domain.com' as 'working'
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 0, <NULL>) [Success (Success)]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Sending result [0][domain.com]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Sent result [0][domain.com]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler] (0x0100): Got request with the following data
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): command: PAM_ACCT_MGMT
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): domain: domain.com
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): user: sirex
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): service: sudo
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): tty: /dev/pts/2
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): ruser: sirex
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): rhost: 
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): authtok type: 0
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): newauthtok type: 0
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): priv: 0
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): cli_pid: 13960
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [pam_print_data] (0x0100): logon name: not set
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [ipa_hostgroup_info_done] (0x0200): Dereferenced host group: ipa-servers
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [hbac_get_category] (0x0200): Category is set to 'all'.
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [hbac_get_category] (0x0200): Category is set to 'all'.
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [hbac_get_category] (0x0200): Category is set to 'all'.
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [ipa_hbac_evaluate_rules] (0x0080): Access granted by HBAC rule [allow ops to anything]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 0, <NULL>) [Success (Success)]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [child_sig_handler] (0x0100): child [13965] finished successfully.
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 0, Success) [Success (Success)]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Sending result [0][domain.com]
    (Tue Jun 28 23:21:33 2016) [sssd[be[domain.com]]] [be_pam_handler_callback] (0x0100): Sent result [0][domain.com]
    
    
    =-=-=- sudo debugging output on server1 from password prompt to failure =-=-=-
    
    Jun 28 21:59:07 sudo[9738] <- expand_prompt @ ./check.c:398 := [sudo] password for sirex: 
    Jun 28 21:59:07 sudo[9738] -> verify_user @ ./auth/sudo_auth.c:193
    Jun 28 21:59:07 sudo[9738] -> sudo_pam_verify @ ./auth/pam.c:131
    Jun 28 21:59:07 sudo[9738] -> converse @ ./auth/pam.c:305
    Jun 28 21:59:07 sudo[9738] -> auth_getpass @ ./auth/sudo_auth.c:347
    Jun 28 21:59:07 sudo[9738] -> tgetpass @ ./tgetpass.c:76
    Jun 28 21:59:07 sudo[9738] -> tty_present @ ./tgetpass.c:329
    Jun 28 21:59:07 sudo[9738] <- tty_present @ ./tgetpass.c:333 := true
    Jun 28 21:59:07 sudo[9738] -> term_noecho @ ./term.c:88
    Jun 28 21:59:07 sudo[9738] <- term_noecho @ ./term.c:99 := 1
    Jun 28 21:59:07 sudo[9738] -> getln @ ./tgetpass.c:272
    Jun 28 21:59:09 sudo[9738] <- getln @ ./tgetpass.c:315 := **********
    Jun 28 21:59:09 sudo[9738] -> term_restore @ ./term.c:73
    Jun 28 21:59:09 sudo[9738] <- term_restore @ ./term.c:82 := 1
    Jun 28 21:59:09 sudo[9738] <- tgetpass @ ./tgetpass.c:202 := **********
    Jun 28 21:59:09 sudo[9738] <- auth_getpass @ ./auth/sudo_auth.c:365 := **********
    Jun 28 21:59:09 sudo[9738] <- converse @ ./auth/pam.c:387 := 0
    Jun 28 21:59:09 sudo[9738] <- sudo_pam_verify @ ./auth/pam.c:142 := 0
    Jun 28 21:59:09 sudo[9738] <- verify_user @ ./auth/sudo_auth.c:282 := 1
    Jun 28 21:59:09 sudo[9738] -> sudo_auth_cleanup @ ./auth/sudo_auth.c:160
    Jun 28 21:59:09 sudo[9738] -> sudo_pam_cleanup @ ./auth/pam.c:189
    Jun 28 21:59:09 sudo[9738] <- sudo_pam_cleanup @ ./auth/pam.c:193 := 0
    Jun 28 21:59:09 sudo[9738] <- sudo_auth_cleanup @ ./auth/sudo_auth.c:177 := 0
    Jun 28 21:59:09 sudo[9738] -> sudo_pw_delref @ ./pwutil.c:249
    Jun 28 21:59:09 sudo[9738] -> sudo_pw_delref_item @ ./pwutil.c:238
    Jun 28 21:59:09 sudo[9738] <- sudo_pw_delref_item @ ./pwutil.c:243
    Jun 28 21:59:09 sudo[9738] <- sudo_pw_delref @ ./pwutil.c:251
    Jun 28 21:59:09 sudo[9738] <- check_user @ ./check.c:189 := true
    Jun 28 21:59:09 sudo[9738] -> log_failure @ ./logging.c:318
    Jun 28 21:59:09 sudo[9738] -> log_denial @ ./logging.c:256
    Jun 28 21:59:09 sudo[9738] -> audit_failure @ ./audit.c:68
    Jun 28 21:59:09 sudo[9738] -> linux_audit_command @ ./linux_audit.c:70
    Jun 28 21:59:09 sudo[9738] -> linux_audit_open @ ./linux_audit.c:49
    Jun 28 21:59:09 sudo[9738] <- linux_audit_open @ ./linux_audit.c:61 := 13
    Jun 28 21:59:09 sudo[9738] <- linux_audit_command @ ./linux_audit.c:97 := 3
    Jun 28 21:59:09 sudo[9738] <- audit_failure @ ./audit.c:81
    Jun 28 21:59:09 sudo[9738] -> new_logline @ ./logging.c:746
    Jun 28 21:59:09 sudo[9738] <- new_logline @ ./logging.c:867 := user NOT authorized on host ; TTY=pts/2 ; PWD=/home/sirex ; USER=root ; COMMAND=/bin/bash -l
    

    Unless I'm reading this wrongly, it looks to me like sudo is speaking with sssd, which talks to IPA via Kerberos (on the same machine). That says OK, and then sudo, for some reason, rejects it and says no anyway?

    Configs: (of broken ipa server)

    =-=-=- sudo.conf (comment lines removed) =-=-=-=- 
    Debug sudo /var/log/sudo_debug all@debug
    Debug sudoers.so /var/log/sudo_debug all@debug
    Plugin sudoers_policy sudoers.so
    Plugin sudoers_io sudoers.so
    Set disable_coredump false
    
    
    =-=-=- sssd.conf (whitespace removed) =-=-=-=-
    [domain/domain.com]
    debug_level = 5
    cache_credentials = True
    krb5_store_password_if_offline = True
    ipa_domain = domain.com
    id_provider = ipa
    auth_provider = ipa
    access_provider = ipa
    ipa_hostname = server1.domain.com
    chpass_provider = ipa
    ipa_server = server1.domain.com
    ipa_server_mode = True
    ldap_tls_cacert = /etc/ipa/ca.crt
    [sssd]
    services = nss, sudo, pam, ssh
    config_file_version = 2
    domains = domain.com
    [nss]
    memcache_timeout = 600
    homedir_substring = /home
    [pam]
    [sudo]
    [autofs]
    [ssh]
    [pac]
    [ifp]
    
    =-=-==- /etc/sudoers (comments removed) =-=-=-=-=-
    Defaults   !visiblepw
    Defaults    always_set_home
    Defaults    env_reset
    Defaults    env_keep =  "COLORS DISPLAY HOSTNAME HISTSIZE INPUTRC KDEDIR LS_COLORS"
    Defaults    env_keep += "MAIL PS1 PS2 QTDIR USERNAME LANG LC_ADDRESS LC_CTYPE"
    Defaults    env_keep += "LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES"
    Defaults    env_keep += "LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE"
    Defaults    env_keep += "LC_TIME LC_ALL LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY"
    Defaults    secure_path = /sbin:/bin:/usr/sbin:/usr/bin
    root    ALL=(ALL)   ALL
    %wheel  ALL=(ALL)   ALL
    

    Edit1: OK, on a suggestion from jhrozek I also enabled debugging in the [sudo] section, which gives this in the logs:

    (Wed Jun 29 21:08:27 2016) [sssd[sudo]] [sudosrv_get_rules] (0x0400): Retrieving default options for [sirex] from [domain.com]
    (Wed Jun 29 21:08:27 2016) [sssd[sudo]] [sudosrv_get_sudorules_query_cache] (0x0200): Searching sysdb with [(&(objectClass=sudoRule)(|(sudoUser=ALL)(name=defaults)(sudoUser=sirex)(sudoUser=#123607)(sudoUser=%confluence-administrators)(sudoUser=%jira-administrators)(sudoUser=%build_system_shell)(sudoUser=%jira-developers)(sudoUser=%publictracker)(sudoUser=%staff)(sudoUser=%wikiprivate)(sudoUser=%jira-users)(sudoUser=%vpn_users)(sudoUser=%ipausers)(sudoUser=%admins)(sudoUser=%gerrit)(sudoUser=%sirex)(sudoUser=%wiki)(sudoUser=%ops)(sudoUser=%gerrit-submit)(sudoUser=%sirex)(sudoUser=+*))(&(dataExpireTimestamp<=1467234507)))]
    (Wed Jun 29 21:08:27 2016) [sssd[sudo]] [sudosrv_get_rules] (0x2000): About to get sudo rules from cache
    (Wed Jun 29 21:08:27 2016) [sssd[sudo]] [sudosrv_get_sudorules_query_cache] (0x0200): Searching sysdb with [(&(objectClass=sudoRule)(|(name=defaults)))]
    (Wed Jun 29 21:08:27 2016) [sssd[sudo]] [sudosrv_get_sudorules_from_cache] (0x0400): Returning 0 rules for [<default options>@domain.com]
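
The relevant line in that output is the "Returning N rules" message. A small sketch for pulling just those counts out of the sudo responder log; `sudo_rule_counts` is a hypothetical helper name, and the log path in the usage comment assumes the default sssd log location.

```shell
#!/bin/sh
# Print "<count> <target>" for every "Returning N rules for [target]" line
# read from stdin.
sudo_rule_counts() {
    sed -n 's/.*Returning \([0-9][0-9]*\) rules for \[\(.*\)\][[:space:]]*$/\1 \2/p'
}

# Hypothetical usage (assumes the default log path and debug_level set
# high enough in the [sudo] section of sssd.conf):
#   sudo_rule_counts < /var/log/sssd/sssd_sudo.log
```

A count of 0 for the user on a host where the rule should apply points at the cache being empty rather than at sudo itself.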
    

    ldbsearch gives 0 results, but does also show 'asq: Unable to register control with rootdse!' (though it says that on the other servers also)

    on the broken server

    ldbsearch -H /var/lib/sss/db/cache_domain.com.ldb -b cn=sysdb '(objectClass=sudoRule)'
    

    gives 0, whereas it gives 3 on the other auth servers, so i guess it's somehow replication related, but the question now is how to fix that.
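
To compare hosts quickly, the record count can be extracted from the ldbsearch summary. This is a sketch that assumes ldbsearch ends its output with a "# returned N records" line (which is what I recall it printing) and that the asq warning goes to stderr; `ldb_record_count` is a hypothetical helper name.

```shell
#!/bin/sh
# Read ldbsearch output on stdin and print the number taken from its
# "# returned N records" summary line.
ldb_record_count() {
    sed -n 's/^# returned \([0-9][0-9]*\) records$/\1/p'
}

# Hypothetical usage on each host (cache path as used in this question):
#   ldbsearch -H /var/lib/sss/db/cache_domain.com.ldb -b cn=sysdb \
#       '(objectClass=sudoRule)' 2>/dev/null | ldb_record_count
```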

    Edit2: Oddly, on the broken server

    ipa sudorule-find All
    

    returns the 3 rules !? I've tried removing the sssd cache file on the broken server and restarting sssd, but ldbsearch still gives 0 rules found.

    If i do ldbsearch with no filter i get 48 records on the broken server and 51 on the others. Only the 3 sudo rule entries are missing.

    I made these sudo rules on one of the working IPA servers, so I'm led to believe that either replication isn't working for the sysdb table, or it's failing just for the sudo rules in it. There is a firewall between them, but user creation on a working IPA server IS replicated over to the broken IPA server, so I don't know if I can rule the firewall out. It's worth noting, however, that while I think all the ports are allowed between them, the broken server is in a DMZ subnet, so I don't allow port 22 (ssh) back from that IPA server to the others. I don't know if that matters. I have, however, run the conncheck script, and it reported everything as either OK or a warning for the two ports that were in use by IPA itself.

    Edit3: OK, so making a sudo rule on the broken server which affects all servers (so it should be cached in sssd) makes the new rule show up in the UI (as well as the 3 others) but it does not show up in sssd. So it looks like sssd isn't caching the rules properly.

    I just found a file ~/.ipa/log/cli.log, which on the broken server (only) has

    2016-05-29T22:59:23Z    6583    MainThread  ipa ERROR   Certificate operation cannot be completed: Unable to communicate with CMS (Internal Server Error)
    

    I don't know if that's a red herring or a smoking gun ?

    Edit4: From the comments of Danila Ladner and subsequent testing, this seems to occur in 4.2.0-15.0.1.el7.centos.17 but not in 4.2.0-15.0.1.el7.centos.6.1, which I believe is due to the corresponding upgrade of libsss to 1.13.0.40.el7_2.9

    I believe it's related to: https://fedorahosted.org/sssd/ticket/1108

    and

    https://bugzilla.redhat.com/show_bug.cgi?id=1256849

    But now I just need to work out a fix for it. The ipa-compat tree wasn't enabled on the 'broken' IPA server; now it is, but still no joy.
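
One way to check whether the compat tree is actually serving sudo rules on a given server is to search it directly. A minimal sketch, assuming the directory suffix is derived from the domain name in the default way and that the compat sudo entries live under `ou=sudoers` with the standard `sudoRole` object class; `sudoers_base` is a hypothetical helper name.

```shell
#!/bin/sh
# Build the assumed compat-tree sudo search base from a DNS-style domain,
# e.g. domain.com -> ou=sudoers,dc=domain,dc=com
sudoers_base() {
    printf 'ou=sudoers,dc=%s\n' "$(printf '%s' "$1" | sed 's/\./,dc=/g')"
}

# Hypothetical usage on the IPA server, with a valid Kerberos ticket:
#   ldapsearch -Y GSSAPI -b "$(sudoers_base domain.com)" \
#       '(objectClass=sudoRole)' dn
```

If this search returns the rules on the working servers but nothing on the broken one, the problem is on the server side of the compat tree rather than in the sssd cache.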

    • jhrozek
      jhrozek about 8 years
      Unless I'm reading the sudo log wrongly, sudo didn't fetch any rules from sssd, at least I don't see any. What you can also do is crank up debugging in the sudo section and then sssd will tell you how many rules it's passing to sudo. Running ldbsearch on the cache would show you the cached rules. We have an upstream troubleshooting guide at fedorahosted.org/sssd/wiki/HOWTO_Troubleshoot_SUDO. Finally, since you're saying only one server misbehaves, check for replication issues and try searching the server itself to see what's in LDAP there.
    • Sirex
      Sirex about 8 years
      The server that misbehaves is the original one I set up, so it's got 2 replicas hooked off it, but sudo works on both of those. Enabling debugging on the [sudo] section also shows that it's returning 0 results, but I'll edit my question with that info as it's too much for a comment.
    • Danila Ladner
      Danila Ladner almost 8 years
      What is the version of ipa client and libsss_sudo?
    • Sirex
      Sirex almost 8 years
      You might be onto something. The broken one is on 4.2.0-15.0.1.el7.centos.17.x86_64 for the server and client packages, whereas the others are on 4.2.0-15.0.1.el7.centos.6.1.x86_64. The question is: do I upgrade another and risk it possibly also breaking, or what ;-/
    • Sirex
      Sirex almost 8 years
      OK, so I know it breaks on 4.2.0-15.0.1.el7.centos.17 and works on 4.2.0-15.0.1.el7.centos.6.1 (I upgraded one IPA server and it also now doesn't pull the sudo rules into the cache that ldbsearch reads). Which smells like a bug to me. But I can't downgrade IPA (easily, anyhow; it's got all kinds of conflicts). So I wonder what the best route forward is?