Embedding a Zookeeper Server
To minimize the number of moving parts in the message delivery system I wanted to embed the Zookeeper server in the application, rather than running a separate ensemble of Zookeeper servers.
The embedded Zookeeper server is encapsulated in an EmbeddedZookeeper trait that is mixed into the ZookeeperService class from part 1.
There are two kinds of Zookeeper servers, standalone and replicated. Standalone (aka Single) servers, which are typically used for local development and testing, use ZookeeperServerMain to start up. Replicated (aka Clustered or Multi-server) servers, which are replicated for high availability, use QuorumPeerMain. I extend both of these classes to add a stop() method in a common interface.
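The wrapper pattern can be sketched as follows. This is not the original code: the two base classes below are self-contained stand-ins for ZooKeeper's ZooKeeperServerMain and QuorumPeerMain, whose shutdown machinery is protected, which is why subclassing is needed to expose a public stop().

```scala
// Stand-ins for ZooKeeper's ZooKeeperServerMain and QuorumPeerMain, used so
// this sketch is self-contained; in the real classes the shutdown machinery
// is protected, hence the subclasses below.
class FakeStandaloneMain {
  var running = true
  protected def shutdown(): Unit = running = false
}
class FakeQuorumMain {
  var running = true
  protected def shutdown(): Unit = running = false
}

// The common interface: both server wrappers expose a public stop().
trait ServerControl {
  def stop(): Unit
}

class StandaloneServer extends FakeStandaloneMain with ServerControl {
  def stop(): Unit = shutdown()
}
class ReplicatedServer extends FakeQuorumMain with ServerControl {
  def stop(): Unit = shutdown()
}
```

Callers can then hold either kind of server as a ServerControl and shut it down without caring whether it is standalone or replicated.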
A couple of other things to mention about EmbeddedZookeeper: it is self-typed to Logging to get access to the log methods. Also, there is an abstract clientPort method that must be implemented by the class the trait is mixed into.
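A minimal skeleton of such a trait might look like this. It is a sketch based on the description above, not the original code; the Logging trait and the method bodies are assumptions.

```scala
// Hypothetical Logging trait, standing in for the one used in the series.
trait Logging {
  def log(msg: String): Unit = println(msg)
}

// The self-type (self: Logging =>) lets EmbeddedZookeeper call log(), and
// means the trait can only be mixed into a class that also extends Logging.
trait EmbeddedZookeeper { self: Logging =>
  // Abstract: supplied by the class this trait is mixed into.
  def clientPort: Int

  def startEmbedded(): Unit =
    log(s"starting embedded Zookeeper on client port $clientPort")
}
```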
The embedded server is configured and started as part of the initialize() method. In initialize() a list of Zookeeper server hostnames is passed to genServerNames(), which generates a ServerNames (see below). Then the Zookeeper server is configured and started.
ServerNames holds hosts, a collection of cluster node IDs, and serversVal, which corresponds to the server.x=[hostname]:nnnnn[:nnnnn] configuration settings for clustered servers (each server needs to know about all of the other servers). If there are multiple elements in hosts, useReplicated will be true, which tells configureServer() to configure a replicated server.
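Based on that description, genServerNames() might look roughly like the sketch below. The field names follow the prose; the ID assignment and the port numbers (2888 for quorum traffic, 3888 for leader election) are assumptions, not taken from the original code.

```scala
// Hypothetical shape of the ServerNames value described above.
case class ServerNames(hosts: Seq[String], serversVal: String, useReplicated: Boolean)

def genServerNames(hostnames: Seq[String]): ServerNames = {
  // One server.x entry per host: assumed quorum port 2888, election port 3888.
  val serversVal = hostnames.zipWithIndex
    .map { case (host, i) => s"server.${i + 1}=$host:2888:3888" }
    .mkString("\n")
  // Replicated mode only when more than one host is configured.
  ServerNames(hostnames, serversVal, useReplicated = hostnames.size > 1)
}
```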
In configureServer() the first thing that happens is that the Zookeeper server data directory is removed and recreated. I found that the data directory could get corrupted if the server wasn’t shut down cleanly. Instead of trying to maintain the data directory across restarts, I decided to just recreate it each time the application starts up. The downside is that the server’s database has to be repopulated on each startup (it synchronizes the data from the other servers). In this particular use case that’s fine because very little data is stored in Zookeeper (just the Leader Election znode and its children). The upside is that no manual intervention is needed to repair corrupt data directories.
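The remove-and-recreate step can be sketched with java.nio (a sketch of the approach, not the original code):

```scala
import java.nio.file.{Files, Path}
import java.util.Comparator

// Remove the data directory (and everything in it) if present, then recreate
// it empty, so a directory corrupted by an unclean shutdown never survives.
def recreateDataDir(dir: Path): Unit = {
  if (Files.exists(dir)) {
    Files.walk(dir)
      .sorted(Comparator.reverseOrder[Path]()) // delete children before parents
      .forEach(p => Files.delete(p))
  }
  Files.createDirectories(dir)
}
```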
Replicated Zookeeper servers require a myid file that identifies the ordinal position of the server in the ensemble. To avoid having to create this file manually, I create it here as part of server startup.
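Writing the myid file amounts to putting the server's ID into a one-line file in the data directory; a minimal sketch:

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Write dataDir/myid containing just this server's ordinal ID, which must
// match the x in its server.x line of the shared configuration.
def writeMyId(dataDir: Path, serverId: Int): Unit = {
  Files.createDirectories(dataDir)
  Files.write(dataDir.resolve("myid"),
    serverId.toString.getBytes(StandardCharsets.UTF_8))
}
```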
Finally, the appropriate Zookeeper server object is instantiated and configured with a set of Properties, and a server startup function is returned for use by startServer(). A reference to the server object is saved in zkServer for shutdown later.
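The Properties might be assembled along these lines. This is a sketch: the tickTime, initLimit, and syncLimit values are typical defaults, not the article's values, and in the real code the server.x entries come from serversVal.

```scala
import java.util.Properties

// Assemble the configuration Properties for the embedded server.
def buildZkProperties(dataDir: String, clientPort: Int, serversVal: String,
                      useReplicated: Boolean): Properties = {
  val props = new Properties()
  props.setProperty("dataDir", dataDir)
  props.setProperty("clientPort", clientPort.toString)
  props.setProperty("tickTime", "2000")
  if (useReplicated) {
    props.setProperty("initLimit", "10")
    props.setProperty("syncLimit", "5")
    // serversVal carries one "server.x=host:port:port" line per cluster node.
    serversVal.split("\n").filter(_.nonEmpty).foreach { line =>
      val idx = line.indexOf('=')
      props.setProperty(line.substring(0, idx), line.substring(idx + 1))
    }
  }
  props
}
```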
To start the server, a new thread is created, and in that thread the server startup function returned by configureServer() is run. The main application thread waits on a semaphore for the server to start.
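The thread-plus-semaphore handshake might look like this sketch. In the real code the semaphore would be released once the server is confirmed to be serving; here, for a self-contained example, it is released just before the blocking startup function runs.

```scala
import java.util.concurrent.Semaphore

// Run the blocking server-startup function on its own thread and make the
// caller wait on a semaphore until the thread signals.
def startServer(startupFn: () => Unit): Thread = {
  val started = new Semaphore(0)
  val t = new Thread(() => {
    started.release() // real code: signal once the server is actually up
    startupFn()       // blocks for the lifetime of the server
  })
  t.setDaemon(true)
  t.start()
  started.acquire()   // main application thread waits here
  t
}
```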
One of the limitations of running Zookeeper in replicated mode is that a quorum of servers has to be running in order for the ensemble to respond to requests. A quorum is N/2 + 1 of the servers (integer division). In practice this means that there should be an odd number of servers and there needs to be at least three servers in the cluster. If there are three servers in the cluster, then two of them have to be up and running for the clustering functionality to work. Depending on how you deploy your application it might not be possible to always keep at least N/2 + 1 servers running; if that is the case, then embedded Zookeeper won’t be an option for you.
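The quorum arithmetic can be checked directly:

```scala
// Minimum number of servers that must be up for the ensemble to answer
// requests (integer division).
def quorumSize(n: Int): Int = n / 2 + 1
```

Note that quorumSize(4) == 3, so a four-server ensemble tolerates only one failure, the same as a three-server one, which is why odd ensemble sizes are preferred.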
Also, one thing I noticed when running embedded Zookeeper is that during deployments, as application JVMs are bounced, you will get a lot of error noise in the logs from the Zookeeper server classes. This is expected behavior, but it might cause concern for your operations team.
In the next post I will come back to the ClusterService and fix a problem where multiple nodes might think they are the leader at the same time.