
A RabbitMQ broker is a logical grouping of one or several Erlang nodes, each running the RabbitMQ application and sharing users, virtual hosts, queues, exchanges, etc. Sometimes we refer to the collection of nodes as a cluster.
All data and state required for the operation of a RabbitMQ broker is replicated across all nodes, for reliability and scaling, with full ACID properties. The exception is message queues, which currently reside only on the node that created them, though they are visible and reachable from all nodes. Future releases of RabbitMQ will introduce migration and replication of message queues.
The easiest way to set up a cluster is by auto configuration using a default cluster config file. See the clustering transcripts for an example.
The composition of a cluster can be altered dynamically. All RabbitMQ brokers start out running on a single node. These nodes can be joined into clusters, and subsequently turned back into individual brokers again.
RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will.
The list of currently active cluster connection points is
returned in the known_hosts field of AMQP's
connection.open_ok method, as a comma-separated list
of addresses where each address is an IP address or a DNS
name, optionally followed by a colon and a port number.
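As an illustration of the format described above, a client could split the known_hosts value into host and port pairs roughly as follows. This is a minimal sketch, not part of any RabbitMQ client library: the function name is hypothetical, the fallback to port 5672 for addresses without a port is an assumption, and IPv6 literals (which contain colons themselves) are not handled.

```python
def parse_known_hosts(known_hosts, default_port=5672):
    """Return a list of (host, port) tuples from a known_hosts string.

    Assumption: an address without an explicit port uses default_port.
    """
    endpoints = []
    for address in known_hosts.split(","):
        address = address.strip()
        if not address:
            continue  # tolerate empty entries and trailing commas
        host, colon, port = address.partition(":")
        endpoints.append((host, int(port) if colon else default_port))
    return endpoints


print(parse_known_hosts("rabbit1:5672,rabbit2:5673,10.0.0.3"))
```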
Nodes in a cluster perform some basic load balancing by
responding to client connection attempts with AMQP's
connection.redirect method as appropriate,
unless the client suppressed redirects by setting the
insist flag in the connection.open
method.
The following is a transcript of setting up and manipulating
a RabbitMQ cluster across three machines -
rabbit1, rabbit2,
rabbit3, with two of the machines replicating
data on ram and disk, and the other replicating data in ram
only.
We assume that the user is logged into all three machines, that RabbitMQ has been installed on the machines, and that the rabbitmq-server and rabbitmqctl scripts are in the user's PATH.
Erlang nodes use a cookie to determine whether they are allowed to communicate with each other - for two nodes to be able to communicate they must have the same cookie.
The cookie is just a string of alphanumeric characters. It can be as long or short as you like.
Erlang automatically creates a random cookie file when the RabbitMQ server starts up. The file is typically located at /var/lib/rabbitmq/.erlang.cookie on Unix systems and C:\Documents and Settings\Current User\Application Data\RabbitMQ\.erlang.cookie on Windows systems. The easiest way to proceed is to allow one node to create the file, and then copy it to all the other nodes in the cluster.
As an alternative, you can insert the option "-setcookie
cookie" in the erl call in the
rabbitmq-server and rabbitmqctl
scripts.
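If you prefer to choose the cookie yourself, for example to pass it with -setcookie, a short script can generate one and check that the values collected from each node agree. The sketch below is only illustrative: the function names, the length of 20 characters and the choice of upper-case letters and digits are assumptions, since any alphanumeric string will do.

```python
import secrets
import string

# Assumption: upper-case letters and digits; any alphanumeric string is valid.
COOKIE_ALPHABET = string.ascii_uppercase + string.digits


def generate_cookie(length=20):
    """Return a random alphanumeric cookie of the requested length."""
    return "".join(secrets.choice(COOKIE_ALPHABET) for _ in range(length))


def cookies_agree(cookies):
    """True if every cookie is identical once surrounding whitespace is trimmed."""
    trimmed = {cookie.strip() for cookie in cookies}
    return len(trimmed) == 1
```

Reading the contents of the cookie file on each node and passing them to cookies_agree is a quick way to rule out a cookie mismatch when nodes refuse to talk to each other.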
Clusters are set up by re-configuring existing RabbitMQ nodes into a cluster configuration. Hence the first step is to start RabbitMQ on all nodes in the normal way:
rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached
This creates three independent RabbitMQ brokers, one on each node, as confirmed by the status command:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2]},
{running_nodes,[rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3]}]
done.
In order to link up our three nodes in a cluster, we tell
two of the nodes, say rabbit@rabbit2 and
rabbit@rabbit3, to join the cluster of the
third, say rabbit@rabbit1.
We first join rabbit@rabbit2 as a ram node in
a cluster with rabbit@rabbit1. To do that, on
rabbit@rabbit2 we stop the RabbitMQ
application, reset the node, join the
rabbit@rabbit1 cluster, and restart the
RabbitMQ application.
rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl cluster rabbit@rabbit1
Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
We can see that the two nodes are joined in a cluster by running the status command on either of the nodes:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
done.
Now we join rabbit@rabbit3 as a disk node to
the same cluster. The steps are identical to the ones
above, except that we list rabbit@rabbit3 as
a node in the cluster command in order to turn it
into a disk rather than ram node.
rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit3
Clustering node rabbit@rabbit3 with [rabbit@rabbit1, rabbit@rabbit3] ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.
When joining a cluster it is ok to specify nodes which are currently down; it is sufficient for one node to be up for the command to succeed.
We can see that the three nodes are joined in a cluster by running the status command on any of the nodes:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit1,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
done.
By following the above steps we can add new nodes to the cluster at any time, while the cluster is running.
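The join procedure used in the transcripts above (stop_app, reset, cluster, start_app) lends itself to scripting. The following sketch builds the four rabbitmqctl invocations for a given node; the helper names and the disk parameter are hypothetical, and the script is assumed to run on the machine hosting the node being joined, with rabbitmqctl in the PATH.

```python
import subprocess


def join_commands(node, cluster_nodes, disk=False):
    """Build the rabbitmqctl invocations that join `node` to a cluster.

    A node becomes a disk node when it appears in its own cluster
    command, and a ram node otherwise.
    """
    targets = list(cluster_nodes)
    if disk and node not in targets:
        targets.append(node)  # listing itself makes the node a disk node
    if not disk and node in targets:
        raise ValueError("a ram node must not list itself")
    return [
        ["rabbitmqctl", "stop_app"],
        ["rabbitmqctl", "reset"],
        ["rabbitmqctl", "cluster"] + targets,
        ["rabbitmqctl", "start_app"],
    ]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

For example, run(join_commands("rabbit@rabbit3", ["rabbit@rabbit1"], disk=True)) would issue the same commands as the rabbit3 transcript above.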
We can change the type of a node from ram to disk and vice
versa. Say we wanted to reverse the types of
rabbit@rabbit2 and
rabbit@rabbit3, turning the former from a ram
node into a disk node and the latter from a disk node into
a ram node. To do that we simply stop the RabbitMQ
application, change the type with an appropriate
cluster command, and restart the application.
rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit2
Clustering node rabbit@rabbit2 with [rabbit@rabbit1, rabbit@rabbit2] ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl cluster rabbit@rabbit1 rabbit@rabbit2
Clustering node rabbit@rabbit3 with [rabbit@rabbit1, rabbit@rabbit2] ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.
The significance of specifying both
rabbit@rabbit1 and
rabbit@rabbit2 as the cluster nodes for
rabbit@rabbit3 is that in case of failure of
either of them, rabbit@rabbit3 can still
connect to the cluster when it starts, and operate
normally. This is only important for ram nodes; disk nodes
automatically keep track of the cluster
configuration.
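The rule that decides a node's type can be stated compactly: a node that names itself in its cluster command is a disk node, and the remaining names are the nodes it will try to contact at startup. The helper below is a hypothetical sketch of that rule, not something rabbitmqctl provides.

```python
def describe_cluster_args(node, cluster_args):
    """Return (node_type, contact_nodes) for a cluster command run on `node`.

    node_type is "disk" if the node lists itself, "ram" otherwise;
    contact_nodes are the other nodes named in the command.
    """
    node_type = "disk" if node in cluster_args else "ram"
    contacts = [name for name in cluster_args if name != node]
    return node_type, contacts


args = ["rabbit@rabbit1", "rabbit@rabbit2"]
print(describe_cluster_args("rabbit@rabbit2", args))
print(describe_cluster_args("rabbit@rabbit3", args))
```

For a ram node, a contact list with more than one entry is what allows it to rejoin the cluster when one of those nodes is down.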
Nodes that have been joined to a cluster can be stopped at any time. It is also ok for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically "catch up" with the other cluster nodes when they start up again.
We shut down the nodes rabbit@rabbit1 and
rabbit@rabbit3 and check on the cluster
status at each step:
rabbit1$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit3]}]
done.
rabbit3$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit3 ...done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2]}]
done.
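When monitoring a cluster it can be handy to compute which nodes are down from the status output, that is, the nodes listed under nodes but missing from running_nodes. The sketch below does this with a regular expression; it assumes the output looks like the transcripts in this guide, and the function names are hypothetical.

```python
import re


def parse_node_lists(status_output):
    """Extract the nodes and running_nodes lists from rabbitmqctl status output."""
    lists = {}
    for key in ("nodes", "running_nodes"):
        match = re.search(r"\{" + key + r",\[([^\]]*)\]\}", status_output)
        body = match.group(1) if match else ""
        lists[key] = [name.strip() for name in body.split(",") if name.strip()]
    return lists


def stopped_nodes(status_output):
    """Return the cluster members that are not currently running, sorted by name."""
    lists = parse_node_lists(status_output)
    return sorted(set(lists["nodes"]) - set(lists["running_nodes"]))


SAMPLE = """Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2]}]
done."""

print(stopped_nodes(SAMPLE))
```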
Now we start the nodes again, checking on the cluster status as we go along:
rabbit1$ rabbitmq-server -detached
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmq-server -detached
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit1,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
done.
There are some important caveats:
Nodes need to be removed explicitly from a cluster when they are no longer meant to be part of it. This is particularly important in case of disk nodes since, as noted above, certain operations require all disk nodes to be up.
We first remove rabbit@rabbit3 from the
cluster, returning it to independent operation. To do
that, on rabbit@rabbit3 we stop the RabbitMQ
application, reset the node, and restart the RabbitMQ
application.
rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.
Note that it would have been equally valid to list
rabbit@rabbit3 as a node.
Running the status command on the nodes confirms
that rabbit@rabbit3 is no longer part of
the cluster and operates independently:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2,rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3]}]
done.
Now we remove rabbit@rabbit1 from the
cluster. The steps are identical to the ones above.
rabbit1$ rabbitmqctl stop_app
Stopping node rabbit@rabbit1 ...done.
rabbit1$ rabbitmqctl reset
Resetting node rabbit@rabbit1 ...done.
rabbit1$ rabbitmqctl start_app
Starting node rabbit@rabbit1 ...done.
The status command now shows all three nodes operating as independent RabbitMQ brokers:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit1]},
{running_nodes,[rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit2]},
{running_nodes,[rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3]}]
done.
Note that rabbit@rabbit2 retains the residual
state of the cluster, whereas rabbit@rabbit1
and rabbit@rabbit3 are freshly initialised
RabbitMQ brokers. If we want to re-initialise
rabbit@rabbit2 we follow the same steps as
for the other nodes:
rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
Instead of configuring clusters "on the fly" using the
cluster command, clusters can also be set up
via a default cluster configuration file, the location of
which is determined by the startup scripts; see the installation guide for
details. The file should contain a list of cluster nodes.
Listing cluster nodes in that file has the same effect as
using the cluster command. However, the latter
takes precedence over the former, i.e. the default cluster
configuration file is ignored subsequent to any successful
invocation of the cluster command, until the
node is reset.
A common use of the default cluster configuration file is to automatically configure nodes to join a common cluster. For this purpose the same configuration file can be installed on all nodes, containing a list of potential disk nodes for the cluster.
Say we want to join our three separate nodes of our
running example back into a single cluster, with
rabbit@rabbit1 and
rabbit@rabbit2 being the disk nodes of the
cluster. First we reset and stop all nodes - NB: this step
would not be necessary if this was a fresh installation of
RabbitMQ.
rabbit1$ rabbitmqctl stop_app
Stopping node rabbit@rabbit1 ...done.
rabbit1$ rabbitmqctl reset
Resetting node rabbit@rabbit1 ...done.
rabbit1$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit3 ...done.
Now we create a configuration file containing the line
[rabbit@rabbit1, rabbit@rabbit2].
We copy this file onto all machines and install it in the
location as defined in the start up files (see the installation guide). For example,
on a Unix system the file would typically have the path
/etc/default/rabbitmq_cluster.config. Now we
simply start the nodes.
rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached
We can see that the three nodes are joined in a cluster by running the status command on any of the nodes:
rabbit1$ rabbitmqctl status
Status of node rabbit@rabbit1 ...
[...,
{nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit2,rabbit@rabbit1]}]
done.
rabbit2$ rabbitmqctl status
Status of node rabbit@rabbit2 ...
[...,
{nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit3,rabbit@rabbit1,rabbit@rabbit2]}]
done.
rabbit3$ rabbitmqctl status
Status of node rabbit@rabbit3 ...
[...,
{nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]},
{running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
done.
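The cluster configuration file used in this example holds a single Erlang term: a list of node names followed by a full stop. When the file is generated by a deployment script, node names containing characters such as hyphens or dots need single quotes to remain valid Erlang atoms. The sketch below handles that; the function names are hypothetical and the caller decides where the file is written.

```python
import re

# An Erlang atom may be written unquoted if it starts with a lower-case
# letter and contains only letters, digits, underscores and @.
_UNQUOTED_ATOM = re.compile(r"[a-z][A-Za-z0-9_@]*")


def erlang_atom(name):
    """Return `name` as an Erlang atom, quoting it only when necessary."""
    if _UNQUOTED_ATOM.fullmatch(name):
        return name
    return "'" + name.replace("\\", "\\\\").replace("'", "\\'") + "'"


def cluster_config(disk_nodes):
    """Render the contents of a cluster configuration file."""
    return "[" + ", ".join(erlang_atom(n) for n in disk_nodes) + "].\n"


print(cluster_config(["rabbit@rabbit1", "rabbit@rabbit2"]), end="")
```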
Under some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine. In particular, this is necessary in order to get the full benefit of the CPUs on a multi-core machine. The two main requirements for running more than one node on a single machine are that each node should have a unique name and bind to a unique port / IP address combination.
The easiest way to start a cluster on a single machine is
to use the script rabbitmq-multi (rabbitmq-multi.bat
on Windows). You can invoke this as:
$ rabbitmq-multi start_all count
This will start count nodes with unique names, listening on all IP addresses and on sequential ports starting from 5672. You can then stop all nodes as follows:
$ rabbitmq-multi stop_all
Please note that you still need to put the nodes into a cluster by auto-configuration or manually arranging your nodes into a cluster. This may be as simple as creating a cluster configuration file containing this:
[rabbit@machine].
You can also start multiple nodes on the same host
manually by repeated invocation of
rabbitmq-server (rabbitmq-server.bat on Windows). You must
ensure that for each invocation you set the environment
variables RABBITMQ_NODENAME,
RABBITMQ_NODE_IP_ADDRESS and
RABBITMQ_NODE_PORT to suitable values
("0.0.0.0" means "all IP addresses").
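To make the manual approach concrete, the sketch below computes one set of environment variables per node and starts rabbitmq-server with them. The node naming scheme (rabbit_1, rabbit_2, ...), the helper names and the default values are assumptions chosen for illustration; only the three environment variables and the -detached flag come from this guide.

```python
import os
import subprocess


def node_environments(count, base_name="rabbit", first_port=5672, ip="0.0.0.0"):
    """Return one environment dict per node, with unique names and ports."""
    envs = []
    for index in range(count):
        envs.append({
            # Hypothetical naming scheme: rabbit_1, rabbit_2, ...
            "RABBITMQ_NODENAME": base_name + "_" + str(index + 1),
            "RABBITMQ_NODE_IP_ADDRESS": ip,
            "RABBITMQ_NODE_PORT": str(first_port + index),
        })
    return envs


def start_node(env):
    """Start one detached node, adding `env` to the current environment."""
    subprocess.run(
        ["rabbitmq-server", "-detached"],
        env={**os.environ, **env},
        check=True,
    )
```

Calling start_node on each entry of node_environments(3) would start three nodes listening on ports 5672 to 5674. As with rabbitmq-multi, the nodes still have to be joined into a cluster afterwards.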