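RabbitMQ Clustering
Distributed RabbitMQ brokers can be built in three ways: clustering, federation, and shovel. This section covers clustering.
- Requirements for building a RabbitMQ cluster:
  - A reliable network;
  - Every machine in the cluster must run the same RabbitMQ and Erlang versions.
- How RabbitMQ clustering works:
  - Virtual hosts, exchanges, users, and permissions are automatically mirrored to every node in the cluster;
  - Queues can live on a single node or be mirrored to several nodes;
  - A client connected to any node in the cluster can see all queues.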
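Building a RabbitMQ Cluster
There are several ways to form a RabbitMQ cluster (see "Ways of Forming a Cluster" in the official clustering guide); here I use environment variables.
RabbitMQ serves clients on an IP address and port, so the basic requirement when configuring a RabbitMQ instance is to give it its own ip:port binding (by default it listens on port 5672 on all interfaces). If you have ever run several MySQL or Redis instances on one machine, the idea is the same. If not, the example below shows how it works:
Running multiple instances on one machine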
# Start the first node
$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit1 rabbitmq-server -detached
# Start the second node
$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit2 rabbitmq-server -detached
Warning: PID file not written; -detached was passed.
# Checking the listening ports at this point shows that the second node did not come up.
The startup failed, so check the log:
$ less /var/log/rabbitmq/rabbit2.log
Error description:
init:do_boot/3
init:start_em/1
rabbit:start_it/1 line 446
rabbit:broker_start/0 line 322
rabbit:start_apps/2 line 542
app_utils:manage_applications/6 line 126
lists:foldl/3 line 1263
rabbit:'-handle_app_error/1-fun-0-'/3 line 638
throw:{could_not_start,rabbitmq_management,
{rabbitmq_management,
{bad_return,
{{rabbit_mgmt_app,start,[normal,[]]},
{'EXIT',
{{could_not_start_listener,
[{port,15672}],
{shutdown,
{failed_to_start_child,ranch_acceptors_sup,
{listen_error,rabbit_web_dispatch_sup_15672,eaddrinuse}}}},
{gen_server,call,
[rabbit_web_dispatch_registry,
{add,rabbit_mgmt,
[{port,15672}],
#Fun<rabbit_web_dispatch.0.82427196>,
[{'_',[],
[{[],[],cowboy_static,
{priv_file,rabbitmq_management,"www/index.html"}},
{[<<"api">>,<<"overview">>],[],rabbit_mgmt_wm_overview,[]},
{[<<"api">>,<<"cluster-name">>],
[],rabbit_mgmt_wm_cluster_name,[]},
{[<<"api">>,<<"nodes">>],[],rabbit_mgmt_wm_nodes,[]},
{[<<"api">>,<<"nodes">>,node],[],rabbit_mgmt_wm_node,[]},
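In short, rabbitmq_management failed to start. After some research the cause turned out to be a port conflict: the web management plugin's port (15672) was already in use by rabbit1, so every additional node must also be given its own management port.
# Start again with the corrected parameters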
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=rabbit2 rabbitmq-server -detached
# Check the listening ports
$ netstat -lntp
tcp 0 0 0.0.0.0:15672 0.0.0.0:* LISTEN 10253/beam.smp
tcp 0 0 0.0.0.0:15673 0.0.0.0:* LISTEN 13922/beam.smp
tcp 0 0 127.0.0.1:9797 0.0.0.0:* LISTEN 632/python2
tcp 0 0 0.0.0.0:25672 0.0.0.0:* LISTEN 10253/beam.smp
tcp 0 0 0.0.0.0:25673 0.0.0.0:* LISTEN 13922/beam.smp
tcp6 0 0 :::4369 :::* LISTEN 10150/epmd
tcp6 0 0 :::5672 :::* LISTEN 10253/beam.smp
tcp6 0 0 :::5673 :::* LISTEN 13922/beam.smp
# rabbit1 and rabbit2 started successfully
# Start the third node
$ RABBITMQ_NODE_PORT=5674 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15674}]" RABBITMQ_NODENAME=rabbit3 rabbitmq-server -detached
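The launch commands above all follow one pattern: each extra instance gets its own AMQP port, management port, and node name. The following is a minimal sketch of that pattern, not part of the original setup. `node_launch_cmd` is a hypothetical helper name, and the port arithmetic simply assumes the numbering used above (5672 and 15672 plus an offset). It only prints the command so that it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: print the launch command for the N-th local RabbitMQ instance.
# Assumption: AMQP ports start at 5672 and management ports at 15672,
# each shifted by (index - 1), as in the commands above.
# node_launch_cmd is a hypothetical helper, not a RabbitMQ tool.
node_launch_cmd() {
    index=$1
    amqp_port=$((5672 + index - 1))
    mgmt_port=$((15672 + index - 1))
    printf 'RABBITMQ_NODE_PORT=%s RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,%s}]" RABBITMQ_NODENAME=rabbit%s rabbitmq-server -detached\n' \
        "$amqp_port" "$mgmt_port" "$index"
}

# Print (do not run) the commands for three nodes.
for i in 1 2 3; do
    node_launch_cmd "$i"
done
```

For the first node the printed command sets the management port to 15672 explicitly, which is the same value the plugin uses by default.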
All three nodes are now running. Port status:
$ netstat -lntp
tcp 0 0 0.0.0.0:4369 0.0.0.0:* LISTEN 10150/epmd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 773/sshd
tcp 0 0 0.0.0.0:15672 0.0.0.0:* LISTEN 10253/beam.smp
tcp 0 0 0.0.0.0:15673 0.0.0.0:* LISTEN 13922/beam.smp
tcp 0 0 0.0.0.0:15674 0.0.0.0:* LISTEN 14910/beam.smp
tcp 0 0 127.0.0.1:9797 0.0.0.0:* LISTEN 632/python2
tcp 0 0 0.0.0.0:25672 0.0.0.0:* LISTEN 10253/beam.smp
tcp 0 0 0.0.0.0:25673 0.0.0.0:* LISTEN 13922/beam.smp
tcp 0 0 0.0.0.0:25674 0.0.0.0:* LISTEN 14910/beam.smp
tcp6 0 0 :::4369 :::* LISTEN 10150/epmd
tcp6 0 0 :::22 :::* LISTEN 773/sshd
tcp6 0 0 :::5672 :::* LISTEN 10253/beam.smp
tcp6 0 0 :::5673 :::* LISTEN 13922/beam.smp
tcp6 0 0 :::5674 :::* LISTEN 14910/beam.smp
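Reading the netstat listing by eye gets tedious as nodes are added. As a rough sketch (the function name `missing_ports` is made up for this example), the check can be scripted: the helper reads `netstat -lnt`-style text on standard input and prints every expected port that has no LISTEN line. It assumes the column layout shown above, where the local address ends in `:<port>`.

```shell
#!/bin/sh
# missing_ports: read `netstat -lnt`-style text on stdin and print each
# expected port (passed as arguments) for which no LISTEN line exists.
# Hypothetical helper; assumes the local address column ends in ":<port>".
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        # ":<port>" followed by whitespace, then LISTEN later on the line.
        if ! printf '%s\n' "$listing" | grep -Eq ":${port}[[:space:]].*LISTEN"; then
            echo "$port"
        fi
    done
}

# Example usage (prints nothing when every port is listening):
#   netstat -lnt | missing_ports 5672 5673 5674 15672 15673 15674
```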
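Forming the cluster
I use rabbit1 as the first node and have the other two join it (rabbit1 is left untouched; only the two joining nodes need to be configured).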
- Add rabbit2 to the cluster:
$ rabbitmqctl -n rabbit2 stop_app
$ rabbitmqctl -n rabbit2 reset
$ rabbitmqctl -n rabbit2 join_cluster rabbit1@`hostname -s`
$ rabbitmqctl -n rabbit2 start_app
- Add rabbit3 to the cluster (same steps):
$ rabbitmqctl -n rabbit3 stop_app
$ rabbitmqctl -n rabbit3 reset
$ rabbitmqctl -n rabbit3 join_cluster rabbit1@`hostname -s`
$ rabbitmqctl -n rabbit3 start_app
- Check the cluster status:
$ rabbitmqctl cluster_status -n rabbit1@host3
Cluster status of node rabbit1@host3 ...
[{nodes,[{disc,[rabbit1@host3,rabbit2@host3,rabbit3@host3]}]},
 {running_nodes,[rabbit3@host3,rabbit2@host3,rabbit1@host3]},
 {cluster_name,<<"rabbit1@host3">>},
 {partitions,[]},
 {alarms,[{rabbit3@host3,[]},{rabbit2@host3,[]},{rabbit1@host3,[]}]}]
- View the cluster status on the management UI page (server_ip:port; any of the ports 15672, 15673, or 15674 works here):
- To add another node, repeat the steps in this section.
- Remove a node:
# Note: this did not succeed in my test
$ rabbitmqctl forget_cluster_node [--offline] <existing_cluster_member_node>
- References:
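Because every joining node repeats the same four rabbitmqctl steps (stop_app, reset, join_cluster, start_app), the sequence can be generated instead of typed. The sketch below is an illustration only: `join_cluster_cmds` is a hypothetical helper, and it prints the commands as a dry run rather than executing them, since `reset` wipes the joining node's data.

```shell
#!/bin/sh
# join_cluster_cmds: print the rabbitmqctl sequence that joins NODE to SEED.
# Hypothetical helper; it prints the commands and does not run them,
# because `reset` erases the joining node's existing data.
join_cluster_cmds() {
    node=$1
    seed=$2
    for step in stop_app reset "join_cluster $seed" start_app; do
        echo "rabbitmqctl -n $node $step"
    done
}

# Example (dry run) for a hypothetical fourth node named rabbit4:
#   join_cluster_cmds rabbit4 "rabbit1@$(hostname -s)"
# After reviewing the output, it could be piped to `sh` to execute it.
```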
Building a cluster on two machines
Basic configuration reference
- File descriptors: https://ro-che.info/articles/2017-03-26-increase-open-files-limit
- New-format rabbitmq.conf template: https://github.com/rabbitmq/rabbitmq-server/blob/v3.7.x/docs/rabbitmq.conf.example
# Default TCP listener (address:port)
listeners.tcp.default = host_ip:5673
# Lift the localhost-only restriction on the guest user
loopback_users = none
- Old-format rabbitmq.config template: https://github.com/rabbitmq/rabbitmq-server/blob/v3.7.x/docs/rabbitmq.config.example
- advanced.config template: https://github.com/rabbitmq/rabbitmq-server/blob/master/docs/advanced.config.example
- Plugins: http://www.rabbitmq.com/plugins.html
  - Plugin location: /usr/lib/rabbitmq/lib/rabbitmq_server-3.7.9/plugins
  - Management plugin: http://www.rabbitmq.com/management.html
    - Command-line tool: http://www.rabbitmq.com/management-cli.html
- rabbitmq-env.conf settings: http://www.rabbitmq.com/configure.html#define-environment-variables
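Cluster configuration
- Ways to form a cluster: http://www.rabbitmq.com/clustering.html#cluster-formation-options
- Cluster requirements:
  - DNS resolution: cluster nodes reach each other by hostname
    - Configure the local hosts file /etc/hosts
  - Port access: http://www.rabbitmq.com/clustering.html#ports
    - 4369: epmd, the peer discovery service used by cluster nodes and CLI tools
    - 5672, 5671: used by clients
    - 25672: communication between nodes and with CLI tools
    - 35672-35682: used by CLI tools
    - 15672: HTTP, the web management interface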
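If a host firewall runs on the nodes, the ports listed above have to be opened between them. The following sketch assumes firewalld, the default firewall on CentOS 7, which the original setup does not mention; `firewall_cmds` is a made-up helper that only prints the `firewall-cmd` invocations so they can be reviewed first.

```shell
#!/bin/sh
# firewall_cmds: print firewall-cmd invocations that open the given TCP
# ports or port ranges permanently, followed by a reload.
# Assumption: the host uses firewalld; firewall_cmds is a hypothetical helper.
firewall_cmds() {
    for port in "$@"; do
        echo "firewall-cmd --permanent --add-port=${port}/tcp"
    done
    echo "firewall-cmd --reload"
}

# Example (dry run) covering the ports listed above:
#   firewall_cmds 4369 5671 5672 15672 25672 35672-35682
```

In practice the inter-node ports (4369, 25672) are usually opened only to the other cluster members rather than to every client.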
Cluster
Environment
This setup builds a RabbitMQ cluster from two nodes:
Host | OS | rabbitmq-server version | Node |
---|---|---|---|
host1 | CentOS 7.2 | 3.7.9 | node1 |
host2 | CentOS 7.2 | 3.7.9 | node2 |
Building the cluster
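- Start node1:
$ systemctl start rabbitmq-server
- Start node2:
# Before starting, copy node1's Erlang cookie to node2 so that both nodes use the same one
$ cat /var/lib/rabbitmq/.erlang.cookie
$ systemctl start rabbitmq-server
The .erlang.cookie file is required for Erlang distribution: every node in the cluster must have an identical .erlang.cookie file, and the file's permissions must be 400.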
- Join node2 to node1. Run the following on node2:
  - reset: clears the node's existing data (a node that has not been reset cannot join the cluster)
$ rabbitmqctl stop_app
$ rabbitmqctl reset
  - join (the machines appear below under their real hostnames: redis01 is host1 and infra01 is host2)
$ rabbitmqctl join_cluster rabbit@redis01
Clustering node rabbit@infra01 with rabbit@redis01
$ rabbitmqctl start_app
# Check the cluster status
$ rabbitmqctl cluster_status
Cluster status of node rabbit@infra01 ...
[{nodes,[{disc,[rabbit@infra01,rabbit@redis01]}]},
 {running_nodes,[rabbit@redis01,rabbit@infra01]},
 {cluster_name,<<"rabbit@redis01">>},
 {partitions,[]},
 {alarms,[{rabbit@redis01,[]},{rabbit@infra01,[]}]}]
- rabbitmqctl command reference: http://www.rabbitmq.com/rabbitmqctl.8.html
- rabbitmqadmin: https://rabbit-new.chunyu.me/cli/index.html
- Enable start on boot:
$ systemctl enable rabbitmq-server
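To confirm from a script that both nodes are running, the `running_nodes` entry can be pulled out of the cluster_status output. This is only a sketch under a stated assumption: it expects the Erlang-term output format shown above (as printed by 3.7.x) with the `running_nodes` tuple on a single line, and other versions may print the status differently. `running_nodes` as a shell function name is hypothetical.

```shell
#!/bin/sh
# running_nodes: read `rabbitmqctl cluster_status` output on stdin and print
# one running node per line.
# Assumption: Erlang-term output as shown above, with the running_nodes
# tuple on a single line. Hypothetical helper, not a RabbitMQ command.
running_nodes() {
    # Keep only the running_nodes tuple, strip its wrapper, split on commas.
    grep -o 'running_nodes,\[[^]]*]' \
        | sed -e 's/^running_nodes,\[//' -e 's/]$//' \
        | tr ',' '\n'
}

# Example usage:
#   rabbitmqctl cluster_status | running_nodes
```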
Pitfalls:
My first attempt to build the cluster failed with this error:
Clustering node rabbit@host2 with rabbit@host1
Error: unable to perform an operation on node 'rabbit@redis01'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@redis01
DIAGNOSTICS
===========
attempted to contact: [rabbit@host1]
rabbit@host1:
* connected to epmd (port 4369) on host1
* epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
* TCP connection succeeded but Erlang distribution failed
* Authentication failed (rejected by the remote node), please check the Erlang cookie
Current node details:
* node name: 'rabbitmqcli-23827-rabbit@host2'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: t9ttNYffM0xwbMi8k2DA4w==
Cause: the .erlang.cookie files on node1 and node2 did not match.
Fix: use node1's .erlang.cookie on every node (as noted in the steps above).
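A quick way to catch this before running join_cluster is to compare the cookie files directly. The helpers below are a minimal sketch with made-up names (`cookies_match`, `secure_cookie`); the cookie path is the one used earlier in this post, while the temporary path in the usage comment is purely illustrative.

```shell
#!/bin/sh
# cookies_match: succeed only if the two cookie files are byte-identical.
# Hypothetical helper built on cmp(1).
cookies_match() {
    cmp -s "$1" "$2"
}

# secure_cookie: make a cookie file readable by its owner only (mode 400).
secure_cookie() {
    chmod 400 "$1"
}

# Example usage on node2 (the /tmp path is hypothetical):
#   scp host1:/var/lib/rabbitmq/.erlang.cookie /tmp/node1.cookie
#   cookies_match /var/lib/rabbitmq/.erlang.cookie /tmp/node1.cookie \
#       && echo "cookies match" || echo "cookies differ"
```

If the cookie is replaced, the file should stay owned by the user that runs the RabbitMQ server, and the node has to be restarted for the new cookie to take effect.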
Source: oschina
Link: https://my.oschina.net/u/3497124/blog/1843966