1. Ceph Service Management
1.1 Starting and Stopping Daemons
# Start all Ceph services on the current node
[root@ceph01 ~]# systemctl start ceph.target
# Stop all Ceph services on the current node
[root@ceph01 ~]# systemctl stop ceph\*.service ceph\*.target
# Operate on a remote node with -H <host>
[root@ceph01 ~]# systemctl -H ceph02 start ceph.target
1.2 Checking Service Status
systemctl status ceph-osd.target
systemctl status ceph-osd@1.service
systemctl status ceph-mds.target
systemctl status ceph-mon.target
systemctl status ceph-radosgw.target
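If a single daemon misbehaves, it can also be restarted and inspected on its own. A minimal sketch, assuming the instance in question is osd.1 on the local node (adjust the unit name to your own daemon):

# Restart only osd.1 and inspect its recent log entries
[root@ceph01 ~]# systemctl restart ceph-osd@1.service
[root@ceph01 ~]# journalctl -u ceph-osd@1.service -n 50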
2. Cluster Expansion
Ceph is designed from the ground up to grow from a few nodes to several hundred, and it should scale on the fly without any downtime.
2.1 Node Information and System Initialization (follow the initialization steps from Part 1)
# Set up passwordless SSH from the ceph-deploy node
[cephadmin@ceph01 ~]$ ssh-copy-id cephadmin@ceph04

# Configure the new node as in the earlier sections
[root@ceph04 ~]# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=ceph
baseurl=http://mirrors.aliyun.com/ceph/rpm-mimic/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.aliyun.com/ceph/rpm-mimic/el7/noarch/
gpgcheck=0
[root@ceph04 ~]# id cephadmin
uid=1001(cephadmin) gid=1001(cephadmin) groups=1001(cephadmin)
[root@ceph04 ~]# cat /etc/sudoers.d/cephadmin
cephadmin ALL = (root) NOPASSWD:ALL
[root@ceph04 ~]# cat /etc/hosts
192.168.5.91 ceph01
192.168.5.92 ceph02
192.168.5.93 ceph03
192.168.5.94 ceph04
[root@ceph04 ~]# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda    253:0    0  50G  0 disk
└─vda1 253:1    0  50G  0 part /
vdb    253:16   0  20G  0 disk
vdc    253:32   0  20G  0 disk
vdd    253:48   0  20G  0 disk
2.2 Adding a Node and OSDs
We currently have three nodes and nine OSDs; we will now add one node with three more OSDs.
[root@ceph04 ~]# yum install ceph ceph-radosgw -y
2.3 Adding the new OSDs with ceph-deploy. Once the OSDs join the cluster, Ceph starts rebalancing data onto them, and after a while the cluster becomes stable again. In production you should not add OSDs this way without preparation, because the immediate rebalancing hurts performance (section 5.1 covers the flags that throttle it).
[cephadmin@ceph01 my-cluster]$ for dev in /dev/vdb /dev/vdc /dev/vdd; do ceph-deploy disk zap ceph04 $dev; ceph-deploy osd create ceph04 --data $dev; done
[cephadmin@ceph01 my-cluster]$ watch ceph -s
[cephadmin@ceph01 my-cluster]$ rados df
[cephadmin@ceph01 my-cluster]$ ceph df
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            162/1818 objects misplaced (8.911%)
            Degraded data redundancy: 181/1818 objects degraded (9.956%), 47 pgs degraded, 8 pgs undersized
            application not enabled on 1 pool(s)
2.4 Updating the ceph.conf Configuration File
# Update ceph.conf with the ceph04 entries and the public network setting
[cephadmin@ceph01 my-cluster]$ cat ceph.conf
[global]
fsid = 4d02981a-cd20-4cc9-8390-7013da54b161
mon_initial_members = ceph01, ceph02, ceph03, ceph04
mon_host = 192.168.5.91,192.168.5.92,192.168.5.93,192.168.5.94
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 192.168.5.0/24    # This line must be added, otherwise adding the mon will fail

[client.rgw.ceph01]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph02]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph03]
rgw_frontends = "civetweb port=80"

# Push the configuration file to all nodes
[cephadmin@ceph01 my-cluster]$ ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03 ceph04
2.5 Adding a Ceph MON
In a production setup you should always keep an odd number of monitor nodes in the Ceph cluster so that they can form a quorum:
[cephadmin@ceph01 my-cluster]$ ceph-deploy mon add ceph04
If the public network = 192.168.5.0/24 setting above is not added, the command will report errors like the following:
[cephadmin@ceph01 my-cluster]$ ceph-deploy mon add ceph04
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mon add ceph04
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  subcommand : add
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fe956d3efc8>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  mon : ['ceph04']
[ceph_deploy.cli][INFO ]  func : <function mon at 0x7fe956fad398>
[ceph_deploy.cli][INFO ]  address : None
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph04
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph04
[ceph04][DEBUG ] connection detected need for sudo
[ceph04][DEBUG ] connected to host: ceph04
[ceph04][DEBUG ] detect platform information from remote host
[ceph04][DEBUG ] detect machine type
[ceph04][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph04
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.5.94
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph04 ...
[ceph04][DEBUG ] connection detected need for sudo
[ceph04][DEBUG ] connected to host: ceph04
[ceph04][DEBUG ] detect platform information from remote host
[ceph04][DEBUG ] detect machine type
[ceph04][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.6.1810 Core
[ceph04][DEBUG ] determining if provided host has same hostname in remote
[ceph04][DEBUG ] get remote short hostname
[ceph04][DEBUG ] adding mon to ceph04
[ceph04][DEBUG ] get remote short hostname
[ceph04][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph04][DEBUG ] create the mon path if it does not exist
[ceph04][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph04/done
[ceph04][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph04][DEBUG ] create the init path if it does not exist
[ceph04][INFO ] Running command: sudo systemctl enable ceph.target
[ceph04][INFO ] Running command: sudo systemctl enable ceph-mon@ceph04
[ceph04][INFO ] Running command: sudo systemctl start ceph-mon@ceph04
[ceph04][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph04.asok mon_status
[ceph04][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph04][WARNIN] monitor ceph04 does not exist in monmap
[ceph04][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[ceph04][WARNIN] monitors may not be able to form quorum
[ceph04][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph04.asok mon_status
[ceph04][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph04][WARNIN] monitor: mon.ceph04, might not be running yet
Check the cluster status:
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            application not enabled on 1 pool(s)

  services:
    mon: 4 daemons, quorum ceph01,ceph02,ceph03,ceph04
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 12 osds: 10 up, 9 in
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   20 GiB used, 170 GiB / 190 GiB avail
    pgs:     368 active+clean

  io:
    client: 2.3 KiB/s rd, 0 B/s wr, 2 op/s rd, 1 op/s wr
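To double-check that the new monitor really joined the quorum, the monitor map can be queried directly. A small sketch (the same commands are used again in section 3.7):

# Show all mons and the current quorum
[cephadmin@ceph01 my-cluster]$ ceph mon stat
# List quorum members and the current leader
[cephadmin@ceph01 my-cluster]$ ceph quorum_status --format json-pretty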
三、集群的缩减
存储最重要的功能之一系统是它的灵活性。一个好的存储解决方案应该足够灵活,以支持其扩展和减少,而不会导致服务停机。传统存储系统的灵活性有限; 扩大和减少这种系统是一项艰巨的任务。
Ceph是一个绝对灵活的存储系统,支持即时更改存储容量,无论是扩展还是减少。
3.1 删减Ceph OSD
在继续缩小群集大小,或删除OSD节点之前,请确保群集有足够的可用空间来容纳您计划移出的节点上的所有数据。群集应该不是它的全部比例,即OSD中已用磁盘空间的百分比。因此,作为最佳实践,请勿在不考虑对全部比率的影响的情况下移除OSD或OSD节点。Ceph-Ansible不支持缩小集群中的Ceph OSD节点,这必须手动完成。
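A minimal sketch of such a pre-check, assuming Mimic-era tooling as used throughout this post:

# Per-OSD and per-host usage (%USE column)
[cephadmin@ceph01 my-cluster]$ ceph osd df tree
# Overall and per-pool usage
[cephadmin@ceph01 my-cluster]$ ceph df
# Configured full / backfillfull / nearfull ratios
[cephadmin@ceph01 my-cluster]$ ceph osd dump | grep ratio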
[cephadmin@ceph01 my-cluster]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       0.22302 root default
-3       0.05576     host ceph01
 0   hdd 0.01859         osd.0       up  1.00000 1.00000
 3   hdd 0.01859         osd.3       up        0 1.00000
 6   hdd 0.01859         osd.6       up  1.00000 1.00000
-5       0.05576     host ceph02
 1   hdd 0.01859         osd.1       up        0 1.00000
 4   hdd 0.01859         osd.4       up  1.00000 1.00000
 7   hdd 0.01859         osd.7       up  1.00000 1.00000
-7       0.05576     host ceph03
 2   hdd 0.01859         osd.2       up  1.00000 1.00000
 5   hdd 0.01859         osd.5       up  1.00000 1.00000
 8   hdd 0.01859         osd.8       up        0 1.00000
-9       0.05576     host ceph04
 9   hdd 0.01859         osd.9       up  1.00000 1.00000
10   hdd 0.01859         osd.10      up  1.00000 1.00000
11   hdd 0.01859         osd.11      up  1.00000 1.00000
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.9
marked out osd.9.
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.10
marked out osd.10.
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.11
marked out osd.11.
At this point, Ceph starts rebalancing the cluster by moving PGs off the marked-out OSDs onto other OSDs in the cluster. The cluster state will be unhealthy for a while, and depending on the number of OSDs being removed, cluster performance may drop somewhat until recovery completes.
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            159/1818 objects misplaced (8.746%)
            Degraded data redundancy: 302/1818 objects degraded (16.612%), 114 pgs degraded, 26 pgs undersized
            application not enabled on 1 pool(s)

  services:
    mon: 4 daemons, quorum ceph01,ceph02,ceph03,ceph04
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 12 osds: 10 up, 6 in; 32 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   18 GiB used, 153 GiB / 171 GiB avail
    pgs:     0.272% pgs not active
             302/1818 objects degraded (16.612%)
             159/1818 objects misplaced (8.746%)
             246 active+clean
             88  active+recovery_wait+degraded
             26  active+recovery_wait+undersized+degraded+remapped
             6   active+remapped+backfill_wait
             1   active+recovering
             1   activating

  io:
    recovery: 2.6 MiB/s, 1 objects/s
Although osd.9, osd.10, and osd.11 have been marked out of the cluster and no longer store data, their daemons are still running.
3.2 Stopping All OSDs on ceph04
[root@ceph04 ~]# systemctl stop ceph-osd.target
[cephadmin@ceph01 my-cluster]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       0.22302 root default
-3       0.05576     host ceph01
 0   hdd 0.01859         osd.0       up  1.00000 1.00000
 3   hdd 0.01859         osd.3       up        0 1.00000
 6   hdd 0.01859         osd.6       up  1.00000 1.00000
-5       0.05576     host ceph02
 1   hdd 0.01859         osd.1       up        0 1.00000
 4   hdd 0.01859         osd.4       up  1.00000 1.00000
 7   hdd 0.01859         osd.7       up  1.00000 1.00000
-7       0.05576     host ceph03
 2   hdd 0.01859         osd.2       up  1.00000 1.00000
 5   hdd 0.01859         osd.5       up  1.00000 1.00000
 8   hdd 0.01859         osd.8       up        0 1.00000
-9       0.05576     host ceph04
 9   hdd 0.01859         osd.9     down        0 1.00000
10   hdd 0.01859         osd.10    down        0 1.00000
11   hdd 0.01859         osd.11    down        0 1.00000
3.3 Removing the OSDs from the CRUSH Map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.9
removed item id 9 name 'osd.9' from crush map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.10
removed item id 10 name 'osd.10' from crush map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.11
removed item id 11 name 'osd.11' from crush map
3.4 Deleting the OSD Authentication Keys
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.9
updated
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.10
updated
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.11
updated
3.5 Removing the OSDs
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.9
removed osd.9
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.10
removed osd.10
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.11
removed osd.11
3.6 Removing All Traces of the Node from the CRUSH Map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove ceph04
removed item id -9 name 'ceph04' from crush map
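To confirm that ceph04 no longer appears anywhere in the CRUSH hierarchy, the tree can simply be listed again (output omitted here):

[cephadmin@ceph01 my-cluster]$ ceph osd tree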
3.7 Removing the Ceph MON
[cephadmin@ceph01 my-cluster]$ ceph mon stat
e2: 4 mons at {ceph01=192.168.5.91:6789/0,ceph02=192.168.5.92:6789/0,ceph03=192.168.5.93:6789/0,ceph04=192.168.5.94:6789/0}, election epoch 32, leader 0 ceph01, quorum 0,1,2,3 ceph01,ceph02,ceph03,ceph04

# Stop the mon service on ceph04
[cephadmin@ceph01 my-cluster]$ sudo systemctl -H ceph04 stop ceph-mon.target

# Remove the mon node from the cluster
[cephadmin@ceph01 my-cluster]$ ceph mon remove ceph04
removing mon.ceph04 at 192.168.5.94:6789/0, there will be 3 monitors

# Verify that the mon has been removed from the quorum
[cephadmin@ceph01 my-cluster]$ ceph quorum_status --format json-pretty
{
    "election_epoch": 42,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_names": [
        "ceph01",
        "ceph02",
        "ceph03"
    ],
    "quorum_leader_name": "ceph01",
    "monmap": {
        "epoch": 3,
        "fsid": "4d02981a-cd20-4cc9-8390-7013da54b161",
        "modified": "2020-02-17 14:20:57.664427",
        "created": "2020-02-02 21:00:45.936041",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "ceph01",
                "addr": "192.168.5.91:6789/0",
                "public_addr": "192.168.5.91:6789/0"
            },
            {
                "rank": 1,
                "name": "ceph02",
                "addr": "192.168.5.92:6789/0",
                "public_addr": "192.168.5.92:6789/0"
            },
            {
                "rank": 2,
                "name": "ceph03",
                "addr": "192.168.5.93:6789/0",
                "public_addr": "192.168.5.93:6789/0"
            }
        ]
    }
}

# Delete the mon data on ceph04 (back it up first if it is important)
[root@ceph04 ~]# rm -rf /var/lib/ceph/mon/ceph-ceph04

# Update and push the configuration file
[cephadmin@ceph01 my-cluster]$ cat ceph.conf
[global]
fsid = 4d02981a-cd20-4cc9-8390-7013da54b161
mon_initial_members = ceph01, ceph02, ceph03
mon_host = 192.168.5.91,192.168.5.92,192.168.5.93
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 192.168.5.0/24

[client.rgw.ceph01]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph02]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph03]
rgw_frontends = "civetweb port=80"

[cephadmin@ceph01 my-cluster]$ ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03 ceph04
4. Replacing Failed Disks
Failed disk information:
[cephadmin@ceph01 ~]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       0.16727 root default
-3       0.05576     host ceph01
 0   hdd 0.01859         osd.0       up  1.00000 1.00000
 3   hdd 0.01859         osd.3       up        0 1.00000
 6   hdd 0.01859         osd.6       up  1.00000 1.00000
-5       0.05576     host ceph02
 1   hdd 0.01859         osd.1     down        0 1.00000
 4   hdd 0.01859         osd.4       up  1.00000 1.00000
 7   hdd 0.01859         osd.7       up  1.00000 1.00000
-7       0.05576     host ceph03
 2   hdd 0.01859         osd.2       up  1.00000 1.00000
 5   hdd 0.01859         osd.5       up  1.00000 1.00000
 8   hdd 0.01859         osd.8     down        0 1.00000
4.1 Marking the Failed OSDs Out
[cephadmin@ceph01 ~]$ ceph osd out osd.1
[cephadmin@ceph01 ~]$ ceph osd out osd.8
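Marking the OSDs out triggers recovery onto the remaining OSDs; before continuing with the removal it is reasonable to watch the cluster until the PGs settle back to active+clean. A small sketch:

[cephadmin@ceph01 ~]$ watch ceph -s
[cephadmin@ceph01 ~]$ ceph pg stat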
4.2 Removing the Failed OSDs from the CRUSH Map
[cephadmin@ceph01 ~]$ ceph osd crush rm osd.1
[cephadmin@ceph01 ~]$ ceph osd crush rm osd.8
4.3 Deleting the OSDs' Ceph Authentication Keys
[cephadmin@ceph01 ~]$ ceph auth del osd.1
[cephadmin@ceph01 ~]$ ceph auth del osd.8
4.4 Removing the OSDs from the Cluster
[cephadmin@ceph01 ~]$ ceph osd rm osd.1
[cephadmin@ceph01 ~]$ ceph osd rm osd.8
4.5 Unmounting the Failed OSD's Mount Point
[root@ceph02 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G  2.6G   48G   6% /
devtmpfs        2.0G     0  2.0G   0% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           2.0G  190M  1.8G  10% /run
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-1
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-4
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-7
tmpfs           396M     0  396M   0% /run/user/0
[root@ceph02 ~]# umount /var/lib/ceph/osd/ceph-1
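Before unmounting, it is worth making sure the corresponding OSD daemon is no longer running (for a genuinely failed disk it is usually already dead). A minimal sketch for osd.1 on ceph02:

# Stop the daemon if it is still running, and optionally keep it from starting at boot
[root@ceph02 ~]# systemctl stop ceph-osd@1.service
[root@ceph02 ~]# systemctl disable ceph-osd@1.service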
4.6 If the disk itself has failed, remove the LVM mappings for ceph-osd.1 and ceph-osd.8 as shown below. If the failed disk has already been physically replaced, the new disk can be added straight back into the cluster and this step can be skipped.
[root@ceph03 ~]# ll /var/lib/ceph/osd/ceph-8/block
lrwxrwxrwx 1 ceph ceph 93 Feb 14 09:45 /var/lib/ceph/osd/ceph-8/block -> /dev/ceph-f0f390f2-d217-47cf-b882-e212afde9cd7/osd-block-ba913d26-e67f-4bba-8efc-6c351ccaf0f8
[root@ceph03 ~]# lsblk
NAME                                                                                                  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda                                                                                                   253:0    0  50G  0 disk
└─vda1                                                                                                253:1    0  50G  0 part /
vdb                                                                                                   253:16   0  20G  0 disk
└─ceph--fcfa2170--d24f--4525--99f9--b88ed12d1de5-osd--block--dab88638--d753--4e01--817b--283ba3f0666b 252:1    0  19G  0 lvm
vdc                                                                                                   253:32   0  20G  0 disk
└─ceph--7e0279e5--47bc--4940--a71c--2fd23f8f046c-osd--block--1a36b1a3--deee--40ab--868c--bd735c9b4e26 252:2    0  19G  0 lvm
vdd                                                                                                   253:48   0  20G  0 disk
└─ceph--f0f390f2--d217--47cf--b882--e212afde9cd7-osd--block--ba913d26--e67f--4bba--8efc--6c351ccaf0f8 252:0    0  19G  0 lvm
[root@ceph03 ~]# dmsetup remove ceph--f0f390f2--d217--47cf--b882--e212afde9cd7-osd--block--ba913d26--e67f--4bba--8efc--6c351ccaf0f8
[root@ceph03 ~]# lsblk
NAME                                                                                                  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda                                                                                                   253:0    0  50G  0 disk
└─vda1                                                                                                253:1    0  50G  0 part /
vdb                                                                                                   253:16   0  20G  0 disk
└─ceph--fcfa2170--d24f--4525--99f9--b88ed12d1de5-osd--block--dab88638--d753--4e01--817b--283ba3f0666b 252:1    0  19G  0 lvm
vdc                                                                                                   253:32   0  20G  0 disk
└─ceph--7e0279e5--47bc--4940--a71c--2fd23f8f046c-osd--block--1a36b1a3--deee--40ab--868c--bd735c9b4e26 252:2    0  19G  0 lvm
vdd                                                                                                   253:48   0  20G  0 disk
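As an alternative to removing the device-mapper mapping by hand, ceph-volume can tear down the LVM metadata on the device in one step. A sketch, assuming the disk is /dev/vdd on ceph03 (the --destroy flag also removes the underlying VG/LV):

[root@ceph03 ~]# ceph-volume lvm zap --destroy /dev/vdd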
4.7 Wiping the Superblock Signatures from the Disk
[root@ceph03 ~]# wipefs -af /dev/vdd
/dev/vdd: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
4.8 Zapping the Disk
[cephadmin@ceph01 my-cluster]$ ceph-deploy disk zap ceph03 /dev/vdd
4.9 Adding the Disk Back into the Cluster
[cephadmin@ceph01 my-cluster]$ ceph-deploy osd create ceph03 --data /dev/vdd
4.10 Checking the Status
[cephadmin@ceph01 ~]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       0.16727 root default
-3       0.05576     host ceph01
 0   hdd 0.01859         osd.0       up  1.00000 1.00000
 3   hdd 0.01859         osd.3       up        0 1.00000
 6   hdd 0.01859         osd.6       up  1.00000 1.00000
-5       0.05576     host ceph02
 1   hdd 0.01859         osd.1       up  1.00000 1.00000
 4   hdd 0.01859         osd.4       up  1.00000 1.00000
 7   hdd 0.01859         osd.7       up  1.00000 1.00000
-7       0.05576     host ceph03
 2   hdd 0.01859         osd.2       up  1.00000 1.00000
 5   hdd 0.01859         osd.5       up  1.00000 1.00000
 8   hdd 0.01859         osd.8       up  1.00000 1.00000
[cephadmin@ceph01 ~]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            61/1818 objects misplaced (3.355%)
            Degraded data redundancy: 94/1818 objects degraded (5.171%), 37 pgs degraded
            application not enabled on 1 pool(s)

  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 9 osds: 9 up, 8 in; 8 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   19 GiB used, 152 GiB / 171 GiB avail
    pgs:     94/1818 objects degraded (5.171%)
             61/1818 objects misplaced (3.355%)
             328 active+clean
             29  active+recovery_wait+degraded
             8   active+recovery_wait+undersized+degraded+remapped
             2   active+remapped+backfill_wait
             1   active+recovering

  io:
    recovery: 16 MiB/s, 6 objects/s
5. Ceph Cluster Maintenance
5.1 Ceph Flags
As a Ceph storage administrator, maintaining your Ceph cluster will be one of your top priorities. Ceph is a distributed system designed to scale from tens of OSDs to thousands, and one of the keys to maintaining it is managing its OSDs. To understand why these flag commands are needed, suppose you want to add a new node to a production Ceph cluster. One approach is to simply add the new node with its disks and let the cluster start backfilling and shuffling data onto it. That is fine for a test cluster.
For a production system, however, you should first set flags such as noin and nobackfill so that the cluster does not start the backfill process the moment the new node comes in. You can then unset these flags during off-peak hours and let the cluster take its time rebalancing (a worked sketch follows the flag reference below):
# Set a flag
ceph osd set <flag_name>
ceph osd set noout
ceph osd set nodown
ceph osd set norecover

# Unset a flag
ceph osd unset <flag_name>
ceph osd unset noout
ceph osd unset nodown
ceph osd unset norecover

Flag reference:
noup          # Prevents OSDs from being marked up; typically used when adding new OSDs.
nodown        # Prevents OSDs from being marked down; useful while inspecting or restarting OSD daemons, so a briefly unresponsive OSD does not get marked down and trigger data migration.
noout         # Prevents OSDs from being marked out. A down OSD is normally marked out automatically after 300 s, at which point data migration starts. With noout set, if an OSD goes down its PGs fail over to the replica OSDs, but no rebalancing is triggered.
noin          # Prevents OSDs from being marked in; used when newly added OSDs should not join the cluster immediately and trigger data migration.
nobackfill    # Prevents backfill operations; backfill is triggered when the cluster experiences failures.
norebalance   # Prevents rebalance operations; rebalancing is triggered when the cluster is expanded. Usually combined with nobackfill and norecover to keep data from migrating.
norecover     # Prevents recovery operations.
noscrub       # Prevents scrubbing; can be set together with nodeep-scrub during high load, recovery, backfilling, or rebalancing to protect cluster performance.
nodeep-scrub  # Prevents deep scrubbing, which blocks reads and writes and hurts performance. Do not leave it set for long; otherwise, once it is unset, a large number of PGs will start deep scrubbing at the same time.
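Putting the flags together, a minimal sketch of the production workflow described above might look like the following (the exact flag set you choose is up to you; these are the ones discussed in this section):

# Before adding the new node: keep new OSDs from triggering immediate data movement
ceph osd set noin
ceph osd set nobackfill
ceph osd set norebalance

# ... add the node and its OSDs with ceph-deploy as in section 2.3 ...

# During off-peak hours: let the cluster rebalance at its own pace
ceph osd unset noin
ceph osd unset nobackfill
ceph osd unset norebalance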
5.2 Throttling Backfill and Recovery
If you need to add new OSD nodes during production peak hours and want to minimize the impact on client IO, the following commands can be used to throttle backfill and recovery.
Set osd_max_backfills = 1 to limit the number of backfill threads. You can add it to the [osd] section of ceph.conf, or set it dynamically with:
ceph tell osd.* injectargs '--osd_max_backfills 1'
Set osd_recovery_max_active = 1 to limit the number of active recovery operations. You can add it to the [osd] section of ceph.conf, or set it dynamically with:
ceph tell osd.* injectargs '--osd_recovery_max_active 1'
Set osd_recovery_op_priority = 1 to lower the priority of recovery operations. You can add it to the [osd] section of ceph.conf, or set it dynamically with:
ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
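To make the throttling persist across daemon restarts, the same options can be written into the [osd] section of ceph.conf and pushed to the nodes (a sketch of the relevant fragment, using the conservative values from above; push it with ceph-deploy --overwrite-conf config push as in section 2.4):

[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1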
5.3 OSD and PG Repair
ceph osd repair <osd.id>     # Runs a repair on the specified OSD.
ceph pg repair <pg.id>       # Runs a repair on the specified PG. Use this command with caution; depending on your cluster state, it can affect user data if used carelessly.
ceph pg scrub <pg.id>        # Runs a scrub on the specified PG.
ceph pg deep-scrub <pg.id>   # Runs a deep scrub on the specified PG.
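As a usage sketch, the usual way to find which PG needs repairing is to start from the health detail, inspect the inconsistent objects, and only then issue the repair (the PG id 2.1a below is a placeholder):

# Lists PGs reported as inconsistent
[cephadmin@ceph01 ~]$ ceph health detail
# Shows which objects/shards are inconsistent in that PG
[cephadmin@ceph01 ~]$ rados list-inconsistent-obj 2.1a --format=json-pretty
# Repair the PG once you understand the inconsistency
[cephadmin@ceph01 ~]$ ceph pg repair 2.1a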
Source: https://www.cnblogs.com/cyleon/p/12318481.html