Zookeeper: Connection request from old client will be dropped if server is in r-o mode


We also experienced several issues with Storm 0.9 and ZooKeeper 3.4.X, though not exactly the one you describe.

Posts on the Storm mailing list also report such incompatibility issues:

https://mail.google.com/mail/u/0/#search/label%3Astorm+zookeeper+3.4/144313a45ba069b5

https://mail.google.com/mail/u/0/#search/label%3Astorm+zookeeper+3.4/1447d95d10ce7582

The latter one points to this Storm pull request, which should hopefully let us use ZK 3.4.X with a future version of Storm once it is released:

https://github.com/apache/incubator-storm/pull/29

Until then, I would recommend downgrading ZK to 3.3.6 (you may install a separate instance of ZK just for Storm if you absolutely need ZK 3.4.X for another system). You could also clone the Storm code and merge that pull request locally, or compile the latest version of the trunk, but that's a bit adventurous and more tiresome than waiting for those nice folks to deliver a new release for us :)
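Before and after a downgrade, it can help to confirm which ZooKeeper version the server is actually running. The following is a minimal sketch, not a tested tool: it assumes the server answers the `srvr` four-letter command on its client port and that the reply starts with a `Zookeeper version: X.Y.Z-...` line. The function names `parse_zk_version` and `fetch_srvr` are made up for this example.

```python
import re
import socket


def parse_zk_version(srvr_output):
    """Extract (major, minor, patch) from the version line of `srvr` output.

    Assumes a line such as 'Zookeeper version: 3.4.5-1392090, built on ...'.
    """
    match = re.search(r"Zookeeper version: (\d+)\.(\d+)\.(\d+)", srvr_output)
    if match is None:
        raise ValueError("no ZooKeeper version line found")
    return tuple(int(part) for part in match.groups())


def fetch_srvr(host="localhost", port=2181, timeout=5.0):
    """Send the `srvr` four-letter command and return the raw text reply.

    Host and port defaults are assumptions; adjust them to your setup.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, `parse_zk_version(fetch_srvr("localhost", 2181))[:2] == (3, 3)` would indicate that the instance Storm talks to is on the 3.3 line.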


Author: Vishal

Updated on June 04, 2022

Comments

  • Vishal
    Vishal about 2 years

    Storm version: 0.8.2

    zookeeper version: 3.4.5.

    We have a small Storm cluster (1 Nimbus and 3 supervisors), so we use just 1 ZooKeeper instance, co-located with the Storm Nimbus.

    Every so often we start getting the following errors in the ZooKeeper logs, and our Storm cluster comes to a standstill.

    2014-04-05 13:27:32,885 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.0.1.183:56121
    2014-04-05 13:27:32,886 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@793] - Connection request from old client /10.0.1.183:56121; will be dropped if server is in r-o mode
    2014-04-05 13:27:32,886 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew session 0x1452dd02834002e at /10.0.1.183:56121
    2014-04-05 13:27:32,886 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@595] - Established session 0x1452dd02834002e with negotiated timeout 40000 for client /10.0.1.183:56121
    

    On the storm end we start seeing the following in supervisor and worker logs:

    2014-04-05 11:37:29 ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
    2014-04-05 11:37:29 cluster [WARN] Received event :disconnected::none: with disconnected Zookeeper.
    2014-04-05 11:37:31 ClientCnxn [WARN] Session 0x1452dd028340015 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused
            at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
            at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
            at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
    2014-04-05 11:37:42 CuratorFrameworkImpl [ERROR] Background operation retry gave up
    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
            at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
            at com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:380)
            at com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49)
            at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:617)
            at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
    

    Do we need to downgrade zookeeper to 3.3.3 or is there a known issue/config that we're missing?

  • Vishal
    Vishal about 10 years
    Thanks Svend. What's your experience, and do you recommend upgrading Storm to 0.9.x? We can't afford an unstable release, so we selected 0.8.2 to start with.
  • Svend
    Svend about 10 years
    I've not had any problems so far with 0.9.0.1, so I'd recommend giving this one a try. I heard some people are using 0.9.1, but I had several issues with it, among others related to the recent migration between Netty and 0MQ for internal tuple communication. I understand this should be fixed in the next release as well, let's wait and see... Hope this helps.
  • Vishal
    Vishal about 10 years
    Downgraded our production ZK to 3.3.6 as suggested. Hoping things will be better. Trying out storm 0.9.0.1 in our test cluster.
  • Vishal
    Vishal about 10 years
    ZooKeeper transaction logs have gone wild and are expanding faster than the universe. I didn't notice this rate of growth before. Did you do anything specific to manage this outside of cron?
  • Svend
    Svend about 10 years
    I don't think so. I did not touch that part much, but I remember a colleague mentioning this script to install in cron: github.com/reddit/zookeeper.dsc. I have no other info regarding that.
  • tszming
    tszming about 10 years
    @Vishal, were the above problems solved after downgrading to 3.3.6?
  • Vishal
    Vishal about 10 years
    No. After several days of testing without the main problem being resolved, we upgraded back to 3.4.5 so that we don't have transaction log size issues anymore. The main problem turned out to be unrelated to ZooKeeper.
  • mattspain
    mattspain over 4 years
    did you seriously paste gmail URLs?