Why does SSH report "reverse mapping checking getaddrinfo failed" even when PTR record is set?


It was all about Avahi and the .local domain, and had nothing to do with PTR records.

I did a bunch more searching after realising that resolving the short hostname worked but resolving the FQDN was failing. This eventually led me to https://superuser.com/questions/704785/ping-cant-resolve-hostname-but-nslookup-can and from there to http://www.lowlevelmanager.com/2011/09/fix-linux-dns-issues-with-local.html which solved everything for me.

Ultimately the problem is that /etc/nsswitch.conf contains this line:

    hosts: files mdns4_minimal [NOTFOUND=return] dns

The mdns4_minimal module claims any name ending in .local, and [NOTFOUND=return] stops the lookup there, so queries for node1.cluster.local never reach BIND. This is also why nslookup and dig looked healthy: they query the DNS server directly, whereas ssh resolves names through getaddrinfo, which follows nsswitch.conf. Changing the line to:

    hosts: files dns

made the problem disappear, and I no longer got the error about possible break-in attempts.
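
You can watch the difference with getent, which resolves through the same NSS path that getaddrinfo uses. The empty first result is what I actually saw (see the comments below); the second is a sketch of the expected output after the fix:

    # with mdns4_minimal [NOTFOUND=return] in place, the FQDN lookup
    # dies in multicast DNS and never reaches BIND:
    [root@zaza ~]# getent hosts node1.cluster.local
    [root@zaza ~]#

    # with "hosts: files dns", the lookup goes straight to BIND:
    [root@zaza ~]# getent hosts node1.cluster.local
    10.69.0.1       node1.cluster.local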

Another solution I tested was simply to rename the domain, since this behaviour is specific to the .local domain. When I renamed cluster.local to cluster.bob, the error message also disappeared; a sketch of the zone change follows.
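
The rename itself is just the zone statement in /etc/named.conf plus matching zone data; a minimal sketch using the cluster.bob name from above (the file name is my choice, the other options are as in my original config):

    zone "cluster.bob." {
        type master;
        file "cluster.bob";
        allow-update { any; };
        notify no;
    };

The PTR records in /var/named/10 have to be renamed to match (e.g. 1.0.69 IN PTR node1.cluster.bob.), otherwise the forward check of the reverse name fails all over again.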

Another solution would be to move Avahi from .local to something like .alocal, so that multicast DNS no longer applies to the .local domain and the default nsswitch configuration works as-is. I suppose removing the [NOTFOUND=return] parameter would also work, since multicast DNS would then no longer end the lookup when a .local host isn't found; however, that's probably a bad idea.
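
I didn't test that one, but for reference the domain Avahi registers under is set in /etc/avahi/avahi-daemon.conf; a sketch, assuming the stock [server] section (check avahi-daemon.conf(5) for the exact syntax):

    [server]
    # register host names and services under .alocal instead of .local,
    # leaving .local to unicast DNS (untested; the default value is "local")
    domain-name=alocal

Note that nss-mdns' mdns4_minimal module only handles names ending in .local, so resolving .alocal hosts through NSS would presumably need the full mdns4 module and an /etc/mdns.allow entry on top of this.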

Ultimately this was an edge case that came about because I didn't fully understand the significance of the .local domain; I just viewed it as a good convention for an internal network.


Comments

  • denizs
    denizs over 1 year

    I'm trying to set up a cluster using a private network on subnet 10. One machine has two interfaces: one connects to the regular network, the other to all the nodes on subnet 10. This CentOS 6 machine (let's call it "zaza.domain.com") runs DHCP and DNS, and currently both are managed by Cobbler, which may or may not be part of the problem (although disabling it and doing everything manually still gives me problems).

    If I SSH into zaza, and then try to SSH from zaza into node1, I get a warning message like the following:

    [root@zaza ~]# ssh node1
    reverse mapping checking getaddrinfo for node1.cluster.local [10.69.0.1] failed - POSSIBLE BREAK-IN ATTEMPT! 
    

    I still get a password prompt and can still log in OK.

    I know from the questions sshd warning, "POSSIBLE BREAK-IN ATTEMPT!" for failed reverse DNS and "POSSIBLE BREAK-IN ATTEMPT!" in /var/log/secure — what does this mean?, and from a bunch of other searching, that the cause of this error is typically a PTR record not being set. However, it is set; consider the following:

    [root@zaza ~]# nslookup node1.cluster.local   
    Server:     10.69.0.69   
    Address:    10.69.0.69#53
    
    Name:   node1.cluster.local   
    Address: 10.69.0.1
    
    [root@zaza ~]# nslookup 10.69.0.1   
    Server:     10.69.0.69   
    Address:    10.69.0.69#53
    
    1.0.69.10.in-addr.arpa  name = node1.cluster.local.
    

    The 10.69.0.69 IP address is zaza's second interface.

    If I try a different tool like dig to actually view the PTR record, I get the following output:

    [root@zaza ~]# dig ptr 1.0.69.10.in-addr.arpa    
    ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.el6_8.4 <<>> ptr 1.0.69.10.in-addr.arpa
    ;; global options: +cmd
    ;; Got answer:   
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29499   
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
    
    ;; QUESTION SECTION:
    ;1.0.69.10.in-addr.arpa.    IN  PTR
    
    ;; ANSWER SECTION:  
    1.0.69.10.in-addr.arpa. 300 IN  PTR node1.cluster.local.
    
    ;; AUTHORITY SECTION:  
    10.in-addr.arpa.    300 IN  NS  zaza.cluster.local.
    
    ;; ADDITIONAL SECTION:
    zaza.cluster.local.    300 IN  A   10.69.0.69
    
    ;; Query time: 0 msec
    ;; SERVER: 10.69.0.69#53(10.69.0.69)
    ;; WHEN: Wed Mar  1 17:05:44 2017   
    ;; MSG SIZE  rcvd: 110
    

    It looks to me like the PTR record is set, so I don't know why SSH would throw a hissy fit when I try to connect to one of the node machines. To give all the information, here are the relevant config files, spoilered to make things look just a tad more readable...

    /etc/named.conf

    [root@zaza ~]# cat /etc/named.conf 
    options {
              listen-on port 53 { any; };
              directory       "/var/named";
              dump-file       "/var/named/data/cache_dump.db";
              statistics-file "/var/named/data/named_stats.txt";
              memstatistics-file "/var/named/data/named_mem_stats.txt";
              allow-query     { any; }; # was localhost
              recursion yes;
    
              # setup DNS forwarding
              forwarders {1.2.3.4;}; # Real IP goes in here
    };
    
    logging {
            channel default_debug {
                    file "data/named.run";
                    severity dynamic;
            };
    };
    
    zone "cluster.local." {
        type master;
        file "cluster.local";
    
        # these two lines allow DNS querying
        allow-update { any; };
        notify no;
    };
    
    zone "10.in-addr.arpa." {
        type master;
        file "10";
    
        # these two lines allow DNS querying
        allow-update { any; };
        notify no;
    };
    

    /var/named/cluster.local

    [root@zaza ~]# cat /var/named/cluster.local 
    $TTL 300
    @                       IN      SOA     zaza.cluster.local. nobody.example.com. (
                                            2017030100   ; Serial
                                            600         ; Refresh
                                            1800         ; Retry
                                            604800       ; Expire
                                            300          ; TTL
                                            )
    
                            IN      NS      zaza.cluster.local.
    
    zaza     IN  A     10.69.0.69
    
    
    
    node1  IN  A     10.69.0.1;
    node2  IN  A     10.69.0.2;
    

    /var/named/10

    [root@zaza ~]# cat /var/named/10 
    $TTL 300
    @                       IN      SOA     zaza.cluster.local. root.zaza.cluster.local. (
                                            2017030100   ; Serial
                                            600         ; Refresh
                                            1800         ; Retry
                                            604800       ; Expire
                                            300          ; TTL
                                            )
    
                            IN      NS      zaza.cluster.local.
    
    69.0.69 IN  PTR  zaza.cluster.local.
    
    
    
    1.0.69  IN  PTR  node1.cluster.local.
    2.0.69  IN  PTR  node2.cluster.local.
    

    If you have any ideas, it'd be much appreciated!

    • Andrew B
      Andrew B about 7 years
      On zaza, what output do you see when you run getent hosts node1.cluster.local?
    • denizs
      denizs about 7 years
      I got no output, which surprised me. But if I ran getent hosts node1, I got 10.69.0.1 node1.cluster.local. From this I realised the problem was something to do with /etc/nsswitch.conf, and I've posted the whole answer above. Cheers for pointing me in the right direction.