ZooKeeper sessions keep expiring...no heartbeats?

Solution 1

ZooKeeper session timeouts are often caused by "soft failures," most commonly a garbage collection pause. Turn on GC logging and check whether a long GC pause occurs at the time the connection times out. Also, read about JVM tuning for Kafka.
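One way to check the GC hypothesis is to line up pause times against the expiry timestamps from the ZooKeeper log. A minimal sketch, assuming the pauses have already been extracted from the GC log as (start time, duration in ms) pairs; GC log formats vary by JVM version and flags, so the parsing step is left out, and the function name is hypothetical:

```python
from datetime import datetime, timedelta

def pauses_near_expiry(pauses, expiry, session_timeout_ms):
    """Return GC pauses that overlap the timeout window before an expiry.

    pauses: list of (start: datetime, duration_ms: int), assumed to be
            already parsed from the JVM's GC log (parsing not shown).
    expiry: datetime of the "Expiring session" line in the ZooKeeper log.

    A pause is reported if any part of it falls inside
    [expiry - session_timeout, expiry]. Results are sorted longest first.
    """
    window_start = expiry - timedelta(milliseconds=session_timeout_ms)
    suspects = []
    for start, duration_ms in pauses:
        end = start + timedelta(milliseconds=duration_ms)
        if start <= expiry and end >= window_start:
            suspects.append((start, duration_ms))
    # Longest pauses are the most likely culprits, so list them first.
    return sorted(suspects, key=lambda p: p[1], reverse=True)
```

An empty result does not rule out GC (clock skew between hosts can shift the window), but a multi-second pause inside the window is a strong hint.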

Solution 2

[2016-03-08 17:46:56,000] INFO Expiring session 0x153175bd3860151, timeout of 4000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)

What is ZooKeeper's maxSessionTimeout? If it is only 4000ms (4 seconds), it is far too small.

In the Cloudera distribution of Hadoop, ZooKeeper's maxSessionTimeout defaults to 40s (40000ms).

As explained in the ZooKeeper administrator's guide (https://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html), maxSessionTimeout defaults to 20 ticks, and one tick is 2 seconds by default, which gives the same 40 seconds.
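To make the arithmetic concrete: the server does not simply hand out maxSessionTimeout, it clamps whatever the client requests between minSessionTimeout and maxSessionTimeout. A rough sketch of that negotiation, assuming the defaults described in the admin guide (minSessionTimeout of 2 ticks, maxSessionTimeout of 20 ticks); the function name is made up for illustration and is not part of any ZooKeeper API:

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms=2000,
                          min_session_timeout_ms=None,
                          max_session_timeout_ms=None):
    """Clamp the client's requested session timeout to the server's bounds.

    When the bounds are not set explicitly, fall back to the documented
    defaults: 2 ticks for the minimum and 20 ticks for the maximum.
    """
    if min_session_timeout_ms is None:
        min_session_timeout_ms = 2 * tick_time_ms
    if max_session_timeout_ms is None:
        max_session_timeout_ms = 20 * tick_time_ms
    return max(min_session_timeout_ms,
               min(requested_ms, max_session_timeout_ms))
```

With a 2000ms tick and default bounds, the allowed range is 4000ms to 40000ms. A negotiated timeout of 4000 would then mean either that the client asked for 4000ms or less, or that maxSessionTimeout was explicitly lowered to 4000, so the consumer's own setting (zookeeper.session.timeout.ms for the old high-level consumer) is worth checking alongside the server configuration.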

Author by

nikel

Another guy who like programming:P...B.Tech,Computer Science.I believe that Programming is the art of making perfect things.

Updated on June 28, 2022

Comments

  • nikel
    nikel almost 2 years

    We are using the Kafka high-level consumer and can consume messages successfully, but the ZooKeeper connections keep expiring and re-establishing.

    I am wondering why there are no heartbeats to keep the connections alive:

    Kafka Consumer Logs
    ====================
     [localhost-startStop-1-SendThread(10.41.105.23:2181)] [ClientCnxn$SendThread] [line : 1096 ]  -  Client session timed out, have not heard from server in 2666ms for sessionid 0x153175bd3860159, closing socket connection and attempting reconnect
    2016-03-08 18:00:06,750 INFO  [localhost-startStop-1-SendThread(10.41.105.23:2181)] [ClientCnxn$SendThread] [line : 975 ]  -  Opening socket connection to server 10.41.105.23/10.41.105.23:2181. Will not attempt to authenticate using SASL (unknown error)
    2016-03-08 18:00:06,823 INFO  [localhost-startStop-1-SendThread(10.41.105.23:2181)] [ClientCnxn$SendThread] [line : 852 ]  -  Socket connection established to 10.41.105.23/10.41.105.23:2181, initiating session
    2016-03-08 18:00:06,892 INFO  [localhost-startStop-1-SendThread(10.41.105.23:2181)] [ClientCnxn$SendThread] [line : 1235 ]  -  Session establishment complete on server 10.41.105.23/10.41.105.23:2181, sessionid = 0x153175bd3860159, negotiated timeout = 4000
    
    
    Zookeeper Logs
    ==================
    [2016-03-08 17:44:37,722] INFO Accepted socket connection from /10.10.113.92:51333 (org.apache.zookeeper.server.NIOServerCnxnFactory)
    [2016-03-08 17:44:37,742] INFO Client attempting to renew session 0x153175bd3860159 at /10.10.113.92:51333 (org.apache.zookeeper.server.ZooKeeperServer)
    [2016-03-08 17:44:37,742] INFO Established session 0x153175bd3860159 with negotiated timeout 4000 for client /10.10.113.92:51333 (org.apache.zookeeper.server.ZooKeeperServer)
    [2016-03-08 17:46:56,000] INFO Expiring session 0x153175bd3860151, timeout of 4000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
    [2016-03-08 17:46:56,001] INFO Processed session termination for sessionid: 0x153175bd3860151 (org.apache.zookeeper.server.PrepRequestProcessor)
    [2016-03-08 17:46:56,011] INFO Closed socket connection for client /10.10.114.183:38324 which had sessionid 0x153175bd3860151 (org.apache.zookeeper.server.NIOServerCnxn)
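
The 2666ms figure in the consumer log is consistent with the ZooKeeper Java client treating two thirds of the negotiated session timeout as its read timeout, computed with integer arithmetic. A small sketch of that arithmetic; the two-thirds ratio is an assumption about the client's internals rather than something the logs state, and the function name is hypothetical:

```python
def client_read_timeout_ms(negotiated_timeout_ms):
    """Read timeout the client is assumed to derive from the session timeout.

    Assumption: the client waits two thirds of the negotiated session
    timeout (integer division) before declaring the server unresponsive.
    """
    return negotiated_timeout_ms * 2 // 3

# For the negotiated timeout of 4000 seen in the logs:
print(client_read_timeout_ms(4000))
```

Under that assumption, a 4000ms session leaves the client under 2.7 seconds of slack before it closes the socket and reconnects, so even a short stall on either side can produce the messages above.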