After a power outage in the server room, Ceph had trouble starting:
[root@node1 my-cluster]# systemctl restart ceph.target
Failed to stop ceph.target: Transaction order is cyclic. See system logs for details.
See system logs and 'systemctl status ceph.target' for details
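How to fix it? I don't know. After a round of fiddling it recovered on its own, but I don't know why it recovered, and I didn't really change anything.
The steps I tried were as follows:
/var/log/ceph/ceph.log reported OSD timeouts; according to the log, the peer port the OSD was trying to connect to did not exist.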
[root@node1 my-cluster]# systemctl restart ceph-osd@0
[root@node1 my-cluster]# systemctl restart ceph-mon@node1
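Both commands failed with the same error.
Maybe the restarts were too close together and that caused the problem? Let's change the service file: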
vim /etc/systemd/system/ceph-mon.target.wants/ceph-mon\@node1.service
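Change StartLimitInterval to 1min.
Do the same for the unit files of the other daemons.
Tried again, and it still reported "Transaction order is cyclic".
So it was time to investigate properly: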
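The file edited above lives under a .wants directory, which normally holds symlinks to the packaged unit, so the edit may be lost on a package update. A minimal sketch of the same change as a systemd drop-in follows. The helper name write_start_limit_dropin and the file name override.conf are my own choices, not from the original notes, and the [Service] placement assumes a systemd of the CentOS 7 era (newer releases prefer StartLimitIntervalSec in [Unit]).

```shell
#!/usr/bin/env bash
# Write a systemd drop-in that overrides StartLimitInterval for a unit.
# Arguments: $1 = drop-in directory, $2 = interval (defaults to 1min).
# The function name and override.conf are illustrative, not from the source.
write_start_limit_dropin() {
    local dir="$1" interval="${2:-1min}"
    mkdir -p "$dir" || return 1
    # Assumption: this systemd version reads the setting from [Service].
    printf '[Service]\nStartLimitInterval=%s\n' "$interval" > "$dir/override.conf"
}

# On a real host (needs root; not executed here):
#   write_start_limit_dropin /etc/systemd/system/ceph-mon@.service.d 1min
#   systemctl daemon-reload
```

A drop-in on the template unit (ceph-mon@.service.d) applies to every instance, which matches "do the same for the others" with less manual editing.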
tail -f /var/log/messages
systemctl restart ceph-osd@0
Nothing showed up in /var/log/messages.
Tried once more.
[root@node1 my-cluster]# systemctl restart ceph.target
Failed to stop ceph.target: Transaction order is cyclic. See system logs for details.
See system logs and 'systemctl status ceph.target' for details
[root@node1 my-cluster]# journalctl |tail
5月 18 19:32:01 node1 CROND[20494]: (root) CMD (. /root/.bashrc;. ~/.bash_profile;. /etc/profile;/usr/bin/python /usr/local/yfs/yfsagent.py >/dev/null 2>&1 &)
5月 18 19:32:02 node1 polkitd[1120]: Registered Authentication Agent for unix-process:20594:832875 (system bus name :1.947 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8)
5月 18 19:32:02 node1 systemd[1]: Found ordering cycle on ceph.target/restart
5月 18 19:32:02 node1 systemd[1]: Found dependency on ceph-osd.target/restart
5月 18 19:32:02 node1 polkitd[1120]: Unregistered Authentication Agent for unix-process:20594:832875 (system bus name :1.947, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnected from bus)
5月 18 19:32:02 node1 systemd[1]: Found dependency on ceph-osd@0.service/restart
5月 18 19:32:02 node1 systemd[1]: Found dependency on ceph-mon.target/restart
5月 18 19:32:02 node1 systemd[1]: Found dependency on ceph.target/restart
5月 18 19:32:02 node1 systemd[1]: Unable to break cycle
5月 18 19:32:02 node1 systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Transaction order is cyclic. See system logs for details.
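The cycle is easier to read once the unrelated CROND and polkitd lines are filtered out. The sketch below pulls only the unit names from the "Found ordering cycle on" and "Found dependency on" messages seen above; the function name cycle_units is hypothetical.

```shell
#!/usr/bin/env bash
# Print the units systemd named while reporting an ordering cycle.
# Reads journal text on stdin and prints one "unit/job" per line, in log order.
# The function name is illustrative, not from the original notes.
cycle_units() {
    sed -n -E 's/.*Found (ordering cycle|dependency) on ([^[:space:]]+).*/\2/p'
}

# On a real host (requires systemd; not executed here):
#   journalctl -b --no-pager | cycle_units
```

Applied to the log above, this would list ceph.target, ceph-osd.target, ceph-osd@0.service, ceph-mon.target and ceph.target again, which is the loop systemd could not break.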
In the cycle that systemd reports, the OSD units come first after ceph.target, so I restarted the OSD directly:
[root@node1 my-cluster]# systemctl restart ceph-osd@0.service
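This time the command no longer reported an error.
All in all, a strange problem.
Next time something similar comes up, I suggest debugging like this:
Check the logs: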
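Since restarting a single unit went through while ceph.target did not, one option is to restart the daemons one unit at a time. This is only a sketch under that assumption: restart_ceph_units and the SYSTEMCTL override are names I made up for illustration, and the unit names in the usage line are the ones from this node.

```shell
#!/usr/bin/env bash
# Restart Ceph daemons one unit at a time instead of via ceph.target.
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) for a dry run.
# Both names are illustrative, not part of Ceph or systemd.
restart_ceph_units() {
    local unit
    for unit in "$@"; do
        # Stop at the first failure so the failing unit is obvious.
        ${SYSTEMCTL:-systemctl} restart "$unit" || return 1
    done
}

# On a real host (needs root; not executed here):
#   restart_ceph_units ceph-mon@node1 ceph-osd@0
```

The order of the arguments is up to the caller; the notes above do not establish which order is required.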
journalctl -xe
tail -f /var/log/messages
tail -f /var/log/ceph/ceph.log
Other reports about this error (not the same situation as mine):
https://tracker.ceph.com/issues/14839
https://github.com/ceph/ceph/pull/15835
https://github.com/ceph/ceph/pull/15051
https://tracker.ceph.com/issues/19910
https://tracker.ceph.com/issues/21035
https://tracker.ceph.com/issues/21477
Source: oschina
Link: https://my.oschina.net/u/4330613/blog/4283597