Zookeeper connection error

佛祖请我去吃肉 2020-12-24 05:41

We have a standalone zookeeper setup on a dev machine. It works fine for every other dev machine except this one testdev machine.

We get this error over and over aga

23 answers
  • 2020-12-24 06:14

    This can happen if there are too many open connections.

    Try increasing the maxClientCnxns setting.

    From documentation:

    maxClientCnxns (No Java system property)

    Limits the number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble. This is used to prevent certain classes of DoS attacks, including file descriptor exhaustion. Setting this to 0 or omitting it entirely removes the limit on concurrent connections.

    You can edit settings in the config file. Most likely it can be found at /etc/zookeeper/conf/zoo.cfg.

    In recent ZooKeeper versions the default value is 60. You can raise it by adding a line such as maxClientCnxns=4096 to the end of the config file.
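
    Before changing anything, it can help to confirm what limit the config file actually sets. Below is a minimal sketch, assuming the zoo.cfg content has already been read into a string; the helper names are made up for illustration, and the fallback of 60 is simply the default quoted above.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg text, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def effective_max_client_cnxns(text, default=60):
    """Return the per-IP connection limit from the config text.

    Falls back to `default` (assumed to be 60, as stated in this answer)
    when the key is absent. A value of 0 means no limit.
    """
    return int(parse_zoo_cfg(text).get("maxClientCnxns", default))
```

    For example, effective_max_client_cnxns(open("/etc/zookeeper/conf/zoo.cfg").read()) would report the value for the path mentioned above, if that is where your config lives.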

  • 2020-12-24 06:14

    Had the same error during setup on a 2 node cluster. I discovered I had mixed up the contents of the myid file versus the server.id=HOST_IP:port entry.

    Essentially, if you have two servers (SERVER1 and SERVER2) for which you have created "myid" files in dataDir for zookeeper as below

    SERVER1 (myid)
    1
    
    SERVER2 (myid)
    2
    

    Ensure the entries in your zoo.cfg file correspond to these IDs, i.e. server.1 should use the SERVER1 hostname and server.2 should use the SERVER2 hostname, each followed by the ports, as below

    SERVER1 (zoo.cfg)
    ... (other config omitted)
    server.1=SERVER1:2888:3888
    server.2=SERVER2:2888:3888
    
    SERVER2 (zoo.cfg)
    ... (other config omitted)
    server.1=SERVER1:2888:3888
    server.2=SERVER2:2888:3888
    

    Just to make sure, I also deleted the version-* folder in the dataDir then restarted Zookeeper to get it working.
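
    The myid / server.N consistency check described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the regular expression assumes plain host:port:port entries (no bracketed IPv6 addresses), and you have to pass in the host name exactly as it is written in zoo.cfg.

```python
import re

# Matches lines such as: server.1=SERVER1:2888:3888
SERVER_LINE = re.compile(r"^server\.(\d+)\s*=\s*([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Map server id -> host for every server.N=host:port:port line."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            servers[int(match.group(1))] = match.group(2)
    return servers


def check_myid(cfg_text, myid_text, local_host):
    """Return a list of problems; an empty list means myid and zoo.cfg agree."""
    servers = parse_servers(cfg_text)
    try:
        my_id = int(myid_text.strip())
    except ValueError:
        return ["myid does not contain a single integer: %r" % myid_text]
    if my_id not in servers:
        return ["no server.%d entry in zoo.cfg" % my_id]
    if servers[my_id] != local_host:
        return ["server.%d points at %s, but this host is %s"
                % (my_id, servers[my_id], local_host)]
    return []
```

    Run it on each server with that server's own myid content; a non-empty result points at the kind of mix-up described in this answer.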

  • 2020-12-24 06:16

    Leave only one entry for your host IP in the /etc/hosts file; that resolved it for me.
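
    One reading of this fix is that the machine's host name was mapped to more than one address in /etc/hosts. A small sketch to spot that, assuming the file content is passed in as a string (the function name and the sample host name in the comment are hypothetical):

```python
def addresses_for(hosts_text, name):
    """Return every distinct address that an /etc/hosts-style text maps `name` to."""
    addresses = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        address, *names = line.split()
        if name in names and address not in addresses:
            addresses.append(address)
    return addresses

# More than one address for the same name, e.g. both 127.0.1.1 and the real
# interface address for a host called "testdev", is the situation to look for.
```

    Calling addresses_for(open("/etc/hosts").read(), socket.gethostname()) (after import socket) would check the local machine's own name.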

  • 2020-12-24 06:17

    I had the same situation and have just fixed the problem.

    My conf/zoo.cfg looks like this:

    server.1=10.194.236.32:2888:3888
    server.2=10.194.236.33:2888:3888
    server.3=10.208.177.15:2888:3888
    server.4=10.210.154.23:2888:3888
    server.5=10.210.154.22:2888:3888
    

    Then I set the content of the data/myid file on each host like this:

    1      //at host  10.194.236.32
    2      //at host  10.194.236.33
    3      //at host  10.208.177.15
    4      //at host  10.210.154.23
    5      //at host  10.210.154.22
    

    Finally, restart ZooKeeper.

  • 2020-12-24 06:17

    I also got the same error when I started my replicated ZooKeeper: one zkClient could not connect to localhost:2181. I checked the log file under the apache-zookeeper-3.5.5-bin/logs directory and found this:

    2019-08-20 11:30:39,763 [myid:5] - WARN [QuorumPeermyid=5(secure=disabled):QuorumCnxManager@677] - Cannot open channel to 3 at election address /xxxx:3888
    java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:648)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:705)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:733)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
    2019-08-20 11:30:44,768 [myid:5] - WARN [QuorumPeermyid=5(secure=disabled):QuorumCnxManager@677] - Cannot open channel to 4 at election address /xxxxxx:3888
    java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:648)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:705)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:733)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
    2019-08-20 11:30:44,769 [myid:5] - INFO [QuorumPeermyid=5(secure=disabled):FastLeaderElection@919] - Notification time out: 51200

    That means this ZooKeeper server could not connect to the other servers. I found that pinging the other servers from this machine failed, and after I removed this server from the replicated ensemble, the problem was solved.
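
    If ping is blocked or inconclusive, a plain TCP connect to each peer's election port is a closer test of what the log above is complaining about. A rough sketch, assuming server.N=host:port:port lines where the second port is the election port (3888 in the log); the function names are hypothetical and bracketed IPv6 addresses are not handled:

```python
import re
import socket

# Matches lines such as: server.3=10.208.177.15:2888:3888
SERVER_LINE = re.compile(r"^server\.(\d+)\s*=\s*([^:\s]+):(\d+):(\d+)")


def election_endpoints(cfg_text):
    """Return {server id: (host, election port)} from the server.N lines."""
    endpoints = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            endpoints[int(match.group(1))] = (match.group(2), int(match.group(4)))
    return endpoints


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connect; True if the port accepted the connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals and name-resolution failures
        return False


def unreachable_peers(cfg_text, timeout=3.0):
    """List the server ids whose election port could not be reached from here."""
    return [sid for sid, (host, port) in sorted(election_endpoints(cfg_text).items())
            if not can_connect(host, port, timeout)]
```

    Running unreachable_peers on the zoo.cfg content from the affected machine should list the same server ids that appear in the "Cannot open channel to N" warnings, provided the other servers are actually up.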

    Hope this helps.

  • 2020-12-24 06:18

    This can happen despite the ZooKeeper servers being up and running and the socket open and accepting connections, if one or more of the ZooKeeper disks are out of space. This can easily happen if the old ZK snapshot and log files are never cleaned up:

    The ZooKeeper server creates snapshot and log files, but never deletes them. The retention policy of the data and log files is implemented outside of the ZooKeeper server. The server itself only needs the latest complete fuzzy snapshot, all log files following it, and the last log file preceding it. The latter requirement is necessary to include updates which happened after this snapshot was started but went into the existing log file at that time. This is possible because snapshotting and rolling over of logs proceed somewhat independently in ZooKeeper. See the maintenance section in this document for more details on setting a retention policy and maintenance of ZooKeeper storage.

    There is a maintenance job that can be run to clean up old snapshot and log files; see https://zookeeper.apache.org/doc/r3.4.12/zookeeperAdmin.html#sc_maintenance.
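
    To see quickly whether this is your problem, you can check free space and count the accumulated files. A minimal sketch, assuming snapshots are named snapshot.<zxid> and transaction logs log.<zxid> somewhere under the directory you pass in (typically a version-2 subdirectory of dataDir); the function name is made up for illustration:

```python
import os
import shutil


def storage_report(data_dir):
    """Summarise free space and the number of snapshot/log files under data_dir."""
    usage = shutil.disk_usage(data_dir)
    snapshots = logs = 0
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.startswith("snapshot."):
                snapshots += 1
            elif name.startswith("log."):
                logs += 1
    free_percent = 100.0 * usage.free / usage.total if usage.total else 0.0
    return {
        "free_bytes": usage.free,
        "free_percent": free_percent,
        "snapshots": snapshots,
        "txn_logs": logs,
    }
```

    A very low free_percent together with a large number of snapshots suggests cleanup has never run. The linked maintenance section covers the cleanup itself, including the autopurge.snapRetainCount and autopurge.purgeInterval settings in zoo.cfg.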
