keepalived - random re-elections

The problem is that you are using the default state MASTER on the backup nodes. They should be configured with state BACKUP.

  vrrp_instance VIP_61 {
      interface bond0
      virtual_router_id 61
      state BACKUP
      priority 98
      ...

Hope this solves your mystery.
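For completeness, a full backup-node instance block might look like the following (a minimal sketch only; the interface, priority and VIP values are placeholders that should match your own environment):

```
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state BACKUP             # start as BACKUP instead of claiming MASTER on boot
    priority 98              # lower than the master's 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        X.X.X.X
    }
}
```

With state BACKUP, the node waits for the current master's advertisements on startup rather than immediately contending for the VIP, which avoids the spurious elections you are seeing.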

Author: milosgajdos

Updated on September 18, 2022

Comments

  • milosgajdos
    milosgajdos almost 2 years

We have set up 3 servers running keepalived. We started noticing some random re-elections occurring which we can't explain, so I came here looking for advice.

    Here is our configuration:

    MASTER:

    global_defs {
      notification_email {
        [email protected]
      }
      notification_email_from keepalived@hostname
      smtp_server example.com:587
      smtp_connect_timeout 30
      router_id some_rate
    }
    
    
    vrrp_script chk_nginx {
      script "killall -0 nginx"
      interval 2
      weight 2
    }
    
    vrrp_instance VIP_61 {
      interface bond0
      virtual_router_id 61
      state MASTER
      priority 100
      advert_int 1
      authentication {
        auth_type PASS
        auth_pass PASSWORD
      }
      virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
      }
      track_script {
        chk_nginx
      }
    }
    

    BACKUP1:

    global_defs {
      notification_email {
        [email protected]
      }
      notification_email_from keepalived@hostname
      smtp_server example.com:587
      smtp_connect_timeout 30
      router_id some_rate
    }
    
    
    vrrp_script chk_nginx {
      script "killall -0 nginx"
      interval 2
      weight 2
    }
    
    vrrp_instance VIP_61 {
      interface bond0
      virtual_router_id 61
      state MASTER
      priority 99
      advert_int 1
      authentication {
        auth_type PASS
        auth_pass PASSWORD
      }
      virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
      }
      track_script {
        chk_nginx
      }
    }
    

    BACKUP2:

    global_defs {
      notification_email {
        [email protected]
      }
      notification_email_from keepalived@hostname
      smtp_server example.com:587
      smtp_connect_timeout 30
      router_id some_rate
    }
    
    
    vrrp_script chk_nginx {
      script "killall -0 nginx"
      interval 2
      weight 2
    }
    
    vrrp_instance VIP_61 {
      interface bond0
      virtual_router_id 61
      state MASTER
      priority 98
      advert_int 1
      authentication {
        auth_type PASS
        auth_pass PASSWORD
      }
      virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
      }
      track_script {
        chk_nginx
      }
    }
    

    Every now and then I can see this happening (grepped in logs):

    MASTER:

    Jan  6 18:30:15 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
    Jan  6 18:30:16 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
    Jan  6 18:32:37 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
    

    BACKUP1:

    Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
    Jan  6 18:32:37 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) forcing a new MASTER election
    Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
    

    BACKUP2:

    Jan  6 18:32:36 lb-public03 Keepalived_vrrp[14255]: VRRP_Script(chk_nginx) succeeded
    Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Entering BACKUP STATE
    

    So the MASTER receives a LOWER priority advert and a NEW election is started. Why? It looks like a BACKUP transitions into MASTER for a short period of time (based on the logs) and then falls back to the BACKUP state. I'm quite clueless as to why this is actually happening, so any hints would be more than welcome.

    Also, I found out there is a unicast patch for keepalived; however, it's not clear to me whether it supports more than one unicast peer - in our case we have a cluster of 3 machines, so we need more than one unicast peer.
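    For reference, keepalived versions that include the unicast feature accept multiple peers in a `unicast_peer` block, so a 3-node cluster is not a problem. A sketch with placeholder addresses (assuming a build with unicast support):

        vrrp_instance VIP_61 {
          ...
          unicast_src_ip 10.0.0.1    # this node's own address (placeholder)
          unicast_peer {
            10.0.0.2                 # the other two cluster members (placeholders)
            10.0.0.3
          }
        }

    Each node lists its own address as `unicast_src_ip` and the other members as peers, so adverts are sent point-to-point instead of to the VRRP multicast group.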

    Any hints on these issues would be superamazingly appreciated!

    • milosgajdos
      milosgajdos over 10 years
      Turns out we were running an old version of keepalived. We upgraded to the latest available and now all seems to be good.
    • Jacob Evans
      Jacob Evans about 7 years
      Do you use VMware snapshots?