Bind master doesn't sync to slave

Solution 1

It seems to me that the issue lies primarily outside of BIND itself. I think this log line is the most important place to start:

Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: connected using <OTHER IP, maybe FW?>#41569

In general, communication seems to work (the slave can contact the master), but not directly; something such as NAT appears to sit in between. As a result, the master sees the request coming from an address other than the allowed one and correctly declines the transfer. As a working solution for the zone transfer itself (notifications may be a separate topic), I would use TSIG: even if the request arrives from an address other than the slave's IP, the master can still authorize it based on the Transaction SIGnature.
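To see which address the transfers are reported as coming from, the "connected using" log lines can be parsed. The following is a minimal sketch and not part of the original answer: the function name xfer_source_ip is made up for illustration, and the addresses in the sample line are documentation placeholders.

```shell
#!/bin/sh
# Print the address that follows "connected using" in named transfer log lines.
# Reads log lines on stdin; lines without that phrase produce no output.
xfer_source_ip() {
  sed -n 's/.*connected using \([^#]*\)#[0-9]*.*/\1/p'
}

# Sample line modelled on the log above, with placeholder addresses.
sample="Jun 19 13:54:22 ns2 named[3558]: transfer of 'example.com/IN' from 192.0.2.1#53: connected using 198.51.100.7#41569"

printf '%s\n' "$sample" | xfer_source_ip
```

On the real slave something like `grep 'connected using' /var/log/syslog | xfer_source_ip | sort | uniq -c` would summarize the addresses in use (the log path is an assumption). If the address shown is not the one listed in allow-transfer on the master, an IP-based ACL cannot match.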

To generate the TSIG key you can use, for example:

a=$(dnssec-keygen -a HMAC-MD5 -b 512 -n HOST transfer); sed "s/\([^ ]*\)\. IN KEY [0-9]* [0-9]* [0-9]* \([^ ]*\) \([^ ]*\)/key \1 {\n  algorithm HMAC-MD5;\n  secret \2\3;\n};/" ${a}.key; rm ${a}*

or, if you prefer a more readable form:

a=$(dnssec-keygen -a HMAC-MD5 -b 512 -n HOST transfer)
sed "s/\([^ ]*\)\. IN KEY [0-9]* [0-9]* [0-9]* \([^ ]*\) \([^ ]*\)/key \1 {\n  algorithm HMAC-MD5;\n  secret \2\3;\n};/" ${a}.key
rm ${a}*

The result is text ready to copy into the BIND config:

key transfer {
  algorithm HMAC-MD5;
  secret bv2uLjmxx2RA9DGTP697E17//s6xxt9DgjFxYpVv53qvsHdqG3Fy8IXva/OaEaHHHVuquh23mCIIQ2Gf3ojqzw==;
};

This block has to be copied into both the master and the slave config so that both sides know the same key ;-).
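If dnssec-keygen is not at hand, or its key-file layout differs from what the sed expression expects, a block of the same shape can be built from random bytes directly. This is only a sketch under assumptions: the helper name make_tsig_block is hypothetical, it relies on `head -c` and `base64` being available, and it keeps HMAC-MD5 purely to match the block above. The secret is written in double quotes, the same form the rndc-key in the question's named.conf uses.

```shell
#!/bin/sh
# Print a named.conf key block with a fresh 512-bit (64-byte) random secret.
# $1 = key name (for example "transfer").
make_tsig_block() {
  name=$1
  # 64 random bytes, base64-encoded on a single line.
  secret=$(head -c 64 /dev/urandom | base64 | tr -d '\n')
  printf 'key %s {\n  algorithm HMAC-MD5;\n  secret "%s";\n};\n' "$name" "$secret"
}

make_tsig_block transfer
```

Run it once and paste the same output on both servers; running it separately on each side would give two different secrets.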

Then you can change the config on the MASTER side from

    allow-transfer {
            <IP-OF-SLAVE>;
            };

to

    allow-transfer {
            key transfer;
            };

and on slave side from

    masters {
            <IP-MASTER>;
            };

to

    masters {
            <IP-MASTER> key transfer;
            };

This way the slave contacts the master using the key, and even if the source IP changes, the transfer is allowed because of the valid TSIG. Permission to transfer is no longer based on the source IP of the request but on the "transfer" TSIG key.
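With hundreds of zones configured alike, editing every masters block by hand is tedious. The sketch below shows one way to do the slave-side change with sed. It is an illustration under assumptions, not a tested migration tool: the function name add_key_to_masters is made up, it only handles IPv4 addresses that sit on their own line inside a masters block (the layout shown in the question), and it should be run against a copy of the file first.

```shell
#!/bin/sh
# Append "key <name>" to each IPv4 address inside masters { ... } blocks.
# Reads a named.conf fragment on stdin and writes the result to stdout.
# $1 = TSIG key name.
add_key_to_masters() {
  key_name=$1
  sed "/masters[[:space:]]*{/,/};/ s/^\([[:space:]]*\)\([0-9][0-9.]*\);[[:space:]]*$/\1\2 key ${key_name};/"
}

# Demo on a fragment shaped like the slave config, with a placeholder address.
add_key_to_masters transfer <<'EOF'
zone "example.nu" {
        type slave;
        masters {
                192.0.2.1;
                };
        file "/var/lib/bind/example.nu.hosts";
        };
EOF
```

Lines outside a masters block, such as the slave's own allow-transfer entries, are left untouched.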

The next step would (or may) be to investigate WHY the source IP is changing, but the transfer will already work at that point ;-). Good luck!

-- edit -- I have added the forgotten semicolon in the key block. The error message during load would have made it obvious, but this keeps the answer complete :-)

Solution 2

The SOA headers you posted from both the slave and the master show identical serial numbers.

The typical "newbie" mistake when master-slave replication "suddenly stops working" is not updating the SOA serial number after modifying the zone file on the master.

When you don't increment the SOA serial number on the master, the slave(s) can't detect that they are out of sync and won't request a zone transfer from the master name server.
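The serial comparison can be sketched in a few lines of shell. This is an illustration only: the helper names soa_serial and slave_is_behind are hypothetical, the sample SOA lines use placeholder names, and the comparison is a plain numeric one that ignores the wrap-around rules for 32-bit serials.

```shell
#!/bin/sh
# Extract the serial (third field) from `dig +short <zone> SOA` output on stdin.
soa_serial() {
  awk 'NF >= 7 { print $3; exit }'
}

# Succeed when the master serial is numerically greater than the slave serial.
# Simplification: serial wrap-around is not handled.
slave_is_behind() {
  [ "$1" -gt "$2" ]
}

# Sample SOA answers in the one-line form that dig +short prints.
master_line="ns1.example.com. admin.example.com. 1373899260 7200 3600 604800 38400"
slave_line="ns1.example.com. admin.example.com. 1373899259 7200 3600 604800 38400"

m=$(printf '%s\n' "$master_line" | soa_serial)
s=$(printf '%s\n' "$slave_line" | soa_serial)

if slave_is_behind "$m" "$s"; then
  echo "slave is behind: master=$m slave=$s"
else
  echo "serials equal or slave ahead: master=$m slave=$s"
fi
```

Against live servers the input would come from something like `dig +short @<IP-MASTER> example.com SOA | soa_serial`, and the same for the slave.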

BIND on the master needs to reload the zone file after it has been modified for any changes to take effect, for instance with: rndc reload example.com

If that still doesn't work: the comment from @wurtel has some good pointers for further debugging.


The SOA serial string 1373899259 looks like a timestamp:

date --date=" 1970-01-01 00:00:00 UTC +1373899259 seconds"
Mon Jul 15 16:40:59 CEST 2013

and should be incremented by at least 1, to 1373899260, although arguably using the current timestamp might be "better":

date +%s
1560943969
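
Both options can be combined in a small helper. This is a sketch, not something from the original setup: the function name next_serial is made up, and the optional second argument exists only so the result can be reproduced with a fixed "current time".

```shell
#!/bin/sh
# Print the next SOA serial: the current Unix time if it is larger than the
# existing serial, otherwise the existing serial plus one.
# $1 = current serial, $2 = optional "now" in epoch seconds (default: date +%s).
next_serial() {
  current=$1
  now=${2:-$(date +%s)}
  if [ "$now" -gt "$current" ]; then
    echo "$now"
  else
    echo $((current + 1))
  fi
}

# Timestamp newer than the serial: the timestamp is used.
next_serial 1373899259 1560943969
# Timestamp not newer than the serial: fall back to serial + 1.
next_serial 1373899259 1373899259
```

Either way the new serial is strictly greater than the old one, which is what the slave needs in order to notice the change.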

Author by

lizlin

An operating technician who mostly script in order to automate everything that I find tedious to do manually.

Updated on September 18, 2022

Comments

  • lizlin
    lizlin over 1 year

    I got handed an old BIND system at work, and the zones on the master don't sync to the slave. I'm a noob at BIND and could really use the help. I would like all changes made on the MASTER to sync over to the SLAVE.

    The servers can reach each other (ping, ssh, fully open in between). The servers are a bit old; I'm not allowed to update them for fear that things might break.

    Ubuntu 12.04.5 LTS, BIND 9.8.1-P1

    MASTER = ns1..com. SLAVE = ns2..com.

    We can use the bind-servers, they function as they should, the changes just don’t replicate.

    Most of the changes are said to have been made through a GUI; I have no access to it.

    The issues might have started during a change of IP on the MASTER server. At least that is when the issues were discovered, but no one knows for certain.

    I have restarted the services, flushed the cache and restarted the servers. I checked the config, and from what I can see it should be correct. I tried rndc --retransfer , but it gives no output and doesn’t work.

    rndc status
    gives the following output:

    version: 9.8.1-P1
    CPUs found: 1
    worker threads: 1
    number of zones: 296
    debug level: 0
    xfers running: 0
    xfers deferred: 0
    soa queries in progress: 0
    query logging is OFF
    recursive clients: 0/0/1000
    tcp clients: 0/100
    server is up and running
    

    MASTER and SLAVE (config alike, only secret is different)
    /etc/bind/named.conf

    // This is the primary configuration file for the BIND DNS server named.
    //
    // Please read /usr/share/doc/bind9/README.Debian.gz for information on the
    // structure of BIND configuration files in Debian, *BEFORE* you customize
    // this configuration file.
    //
    // If you are just adding zones, please do that in /etc/bind/named.conf.local
    
    include "/etc/bind/named.conf.options";
    include "/etc/bind/named.conf.local";
    include "/etc/bind/named.conf.default-zones";
    key rndc-key {
            algorithm hmac-md5;
            secret "UHSoHPGEh+p5kIdoGzoX0A==";
            };
    controls {
            inet 127.0.0.1 port 953 allow { 127.0.0.1; } keys { rndc-key; };
            };
    


    MASTER
    /etc/bind/named.conf.options

    options {
            directory "/var/cache/bind";
    
            // If there is a firewall between you and nameservers you want
            // to talk to, you may need to fix the firewall to allow multiple
            // ports to talk.  See http://www.kb.cert.org/vuls/id/800113
    
            // If your ISP provided one or more IP addresses for stable
            // nameservers, you probably want to use them as forwarders.
            // Uncomment the following block, and insert the addresses replacing
            // the all-0's placeholder.
    
            // forwarders {
            //      0.0.0.0;
            // };
    
            //========================================================================
            // If BIND logs error messages about the root key being expired,
            // you will need to update your keys.  See https://www.isc.org/bind-keys
            //========================================================================
            dnssec-validation auto;
    
            auth-nxdomain yes;
            listen-on-v6 { any; };
            recursion no;
            multiple-cnames yes;
            fetch-glue yes;
            check-names master fail;
            check-names slave fail;
            allow-transfer { localhost; <IP-OF-SLAVE>; };
            notify yes;
            dump-file "/";
            also-notify {
                    };
    };
    


    SLAVE
    /etc/bind/named.conf.options

    options {
            directory "/var/cache/bind";
    
            // If there is a firewall between you and nameservers you want
            // to talk to, you may need to fix the firewall to allow multiple
            // ports to talk.  See http://www.kb.cert.org/vuls/id/800113
    
            // If your ISP provided one or more IP addresses for stable
            // nameservers, you probably want to use them as forwarders.
            // Uncomment the following block, and insert the addresses replacing
            // the all-0's placeholder.
    
            // forwarders {
            //      0.0.0.0;
            // };
    
            //========================================================================
            // If BIND logs error messages about the root key being expired,
            // you will need to update your keys.  See https://www.isc.org/bind-keys
            //========================================================================
            dnssec-validation auto;
    
            auth-nxdomain yes;
            listen-on-v6 { any; };
            recursion no;
            multiple-cnames yes;
            fetch-glue yes;
            allow-transfer { <MASTER IP>; };
            //allow-transfer { ns1.<our-domain>.com; };
            //also-notify {};
    };
    
    

    MASTER
    /etc/bind/named.conf.local

    //
    // Do any local configuration here
    //
    
    // Consider adding the 1918 zones here, if they are not used in your
    // organization
    //include "/etc/bind/zones.rfc1918";
    
    zone "domain.nu" {
            type master;
            file "/var/lib/bind/<DOMAIN>.nu.hosts";
            allow-transfer {
                    <IP-OF-SLAVE>;
                    };
            };
    

    There are hundreds of zones here, all configured alike.

    SLAVE
    /etc/bind/named.conf.local

    zone "domain.nu" {
            type slave;
            masters {
                    <IP-MASTER>;
                    };
            file "/var/lib/bind/domain.nu.hosts";
            allow-transfer {
                   <IP-MASTER>;
                    };
            };
    
    


    MASTER
    /etc/bind/named.conf.default-zones

    // prime the server with knowledge of the root servers
    zone "." {
            type hint;
            file "/etc/bind/db.root";
    };
    
    // be authoritative for the localhost forward and reverse zones, and for
    // broadcast zones as per RFC 1912
    
    zone "localhost" {
            type master;
            file "/etc/bind/db.local";
    };
    
    zone "127.in-addr.arpa" {
            type master;
            file "/etc/bind/db.127";
    };
    
    zone "0.in-addr.arpa" {
            type master;
            file "/etc/bind/db.0";
    };
    
    zone "255.in-addr.arpa" {
            type master;
            file "/etc/bind/db.255";
    };
    

    SLAVE
    /etc/bind/named.conf.default-zones

    // prime the server with knowledge of the root servers
    zone "." {
            type hint;
            file "/etc/bind/db.root";
    };
    
    // be authoritative for the localhost forward and reverse zones, and for
    // broadcast zones as per RFC 1912
    
    zone "localhost" {
            type master;
            file "/etc/bind/db.local";
    };
    
    zone "127.in-addr.arpa" {
            type master;
            file "/etc/bind/db.127";
    };
    
    zone "0.in-addr.arpa" {
            type master;
            file "/etc/bind/db.0";
    };
    
    zone "255.in-addr.arpa" {
            type master;
            file "/etc/bind/db.255";
    };
    
    



    Apart from this config, we can find the config for the different zones in /var/lib/bind/.hosts. They look a bit different depending on whether they’re on the MASTER or on the SLAVE.


    MASTER

    /var/lib/bind/.hosts

    $ttl 38400
    domain.com.    IN      SOA     ns1.<our domain>.com. admin.<our domain>.com. (
                            1373899259
                            7200
                            3600
                            604800
                            38400 )
    <domain.com>.    IN      NS      ns1.<our domain>.com.
    <domain.com>.    IN      NS      ns2.<our domain>.com.
    <domain.com>.    IN      A       <customer ip>
    www.<domain.com>.        IN      A       <customer ip>
    _autodiscover._tcp.domain.com. IN      SRV     0 0 443  autodiscover.<our-domain>.com.
    <domain.com>.    IN      MX      10 <mx-record>.com.
    <domain.com>.    IN      MX      20 <mx-record>.net.
    


    SLAVE

    /var/lib/bind/some-domain.com.hosts

    $ORIGIN .
    $TTL 38400      ; 10 hours 40 minutes
    domain.com             IN SOA  ns1.<our domain>.se. admin.<our domain>.com. (
                                    1373899259 ; serial
                                    7200       ; refresh (2 hours)
                                    3600       ; retry (1 hour)
                                    604800     ; expire (1 week)
                                    38400      ; minimum (10 hours 40 minutes)
                                    )
                            NS      ns1.<our domain>.com.
                            NS      ns2.<our domain>.com.
                            A       212.247.229.60
                            MX      10 <mx>.com.
                            MX      20 <mx>.net.
    
    $ORIGIN <DOMAIN.COM>.
    
    _autodiscover._tcp      SRV     0 0 443 autodiscover.<our-domain>.com.
    www                     A       <customer ip>
    
    
    



    EDIT:

    I checked the logs. When I run
    rndc reload
    on the SLAVE, the syslog fills up with this for different zones:

    Jun 19 13:54:22 ns2 named[3558]: zone <domain.com>/IN: Transfer started.
    Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: connected using <OTHER IP, maybe FW?>#41569
    Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: failed while receiving responses: NOTAUTH
    Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: Transfer completed: 0 messages, 0 records, 0 bytes, 0.001 secs (0 bytes/sec)
    
    Jun 19 13:53:49 ns2 named[3558]: zone <DOMAIN.COM>/IN: refresh: unexpected rcode (REFUSED) from master <MASTER IP>#53 (source 0.0.0.0#0)
    Jun 19 13:53:49 ns2 named[3558]: zone <DOMAIN.COM>/IN: Transfer started.
    
    

    On the MASTER, the syslog looks like this:

    Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#15012: query (cache) '<domain.com>/SOA/IN' denied
    Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#58925: zone transfer '<DOMAIN.COM>/AXFR/IN' denied
    Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#56767: bad zone transfer request: '<DOMAIN.COM>/IN': non-authoritative zone (NOTAUTH)
    

    All of these logs repeat for different domains

    • wurtel
      wurtel almost 5 years
      Normally when the zone is reloaded on the primary NS, it will send a notification to all nameservers listed in the zone itself. In a normal bind installation the slave will do an AXFR to transfer the zone from the master, but apparently this is not working. Why this is not working should be shown in the syslog. You can try removing the cached zone on the slave and restarting named; it will then try to transfer from the master, irrespective of notifications or whatever. Again, check the logs, and add any relevant log lines to your question.
    • lizlin
      lizlin almost 5 years
      I've added the syslog to the original post now, on the bottom.
    • wurtel
      wurtel almost 5 years
      The "NOTAUTH" and "REFUSED" in the log on the slave would point to "allow-transfer" on the master not being correct. Anything in the log of the master NS? There should be lines like client x.x.x.x#99999 (example.com): transfer of 'example.com/IN' denied; that x.x.x.x IP address should be listed in allow-transfer
    • lizlin
      lizlin almost 5 years
      I completely agree, but I can't seem to find the exact issue, or how to fix it. Added logs for the MASTER as well now, at the bottom of my case. The IP listed there is the SLAVE IP, the one that's already listed in the MASTER /etc/bind/named.conf.options file.
    • wurtel
      wurtel almost 5 years
      In your config you show "domain.nu", but the log errors are about "domain.com", and especially "non-authoritative zone", which means that the "primary" is saying that it is neither primary nor slave for that domain!
    • lizlin
      lizlin almost 5 years
      That's just me who fubbed up the find-and-replace, unfortunately. Needed to clean the files of traceable customer data :) The zones are a mix of .nu, .org, .biz, .com etc. The logs are just a sample; there are hundreds of domains on these servers, but the error messages repeat for all of them.
  • lizlin
    lizlin almost 5 years
    All of the old changes have been made through a GUI, but I tried it and edited /var/lib/bind/<domain>.hosts on the MASTER, upping the number by 1 from the original. Did an rndc reload and tried an rndc -- retransfer <domain>, but the number (and the record) did not update on the SLAVE. Also added the logs to my case, as suggested by @wurtel.
  • lizlin
    lizlin almost 5 years
    Thank you - this solved it! Gonna replace the Bind-servers after summer, so no more troubleshooting atm :)
  • Kamil J
    Kamil J almost 5 years
    I'm happy that it was helpful. If that is all, please accept the solution to let other people know that the issue is no longer open (let's say that is the "mandatory" part of this comment :) ). Beyond that, consider marking the answer as "valuable" if you think it is; this is the optional part of the comment ;-).
  • lizlin
    lizlin almost 5 years
    I did, but it says that since I have less than 15 reputation, it's recorded but not displayed.