This article is based on Ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable).


The Ceph cluster is configured as follows:

Two OSD nodes, each with an Intel Xeon E5-2670 CPU and 32 GB of registered ECC memory.

OSDs 0-10 are 11 Seagate ST6000NM0034 6 TB SAS HDDs.

OSDs 13-20 are backed by two 800 GB Intel P3700 NVMe drives; each drive is split into four 200 GB LVM volumes to make full use of its performance.

The HBA is a Fujitsu PRAID CP400i, and the NIC is a Mellanox ConnectX-3 (56 Gb/s).


A check on the Ceph cluster today found that one OSD is close to full.

[root@storage02-ib ~]# ceph osd status
+----+----------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| id | host                       |  used | avail | wr ops | wr data | rd ops | rd data | state                  |
+----+----------------------------+-------+-------+--------+---------+--------+---------+------------------------+
| 0  | storage02-ib.lobj.eth6.org | 1599G | 3989G |    0   |     0   |    0   |     0   | exists,up              |
| 1  | storage02-ib.lobj.eth6.org | 1625G | 3963G |    0   |     0   |    0   |     0   | exists,up              |
| 2  | storage02-ib.lobj.eth6.org | 1666G | 3922G |    0   |     0   |    0   |     0   | exists,up              |
| 3  | storage02-ib.lobj.eth6.org | 1817G | 3771G |    0   |     0   |    0   |     0   | exists,up              |
| 4  | storage02-ib.lobj.eth6.org | 1868G | 3720G |    0   |     0   |    0   |     0   | exists,up              |
| 5  | storage03-ib.lobj.eth6.org | 1685G | 3903G |    0   |     0   |    0   |     0   | exists,up              |
| 6  | storage03-ib.lobj.eth6.org | 1686G | 3902G |    0   |     0   |    0   |     0   | exists,up              |
| 7  | storage03-ib.lobj.eth6.org | 1153G | 4435G |    0   |     0   |    1   |     0   | exists,up              |
| 8  | storage03-ib.lobj.eth6.org | 1374G | 4214G |    0   |     0   |    0   |     0   | exists,up              |
| 9  | storage03-ib.lobj.eth6.org | 2098G | 3490G |    0   |     0   |    0   |     0   | exists,up              |
| 10 | storage03-ib.lobj.eth6.org | 1715G | 3873G |    0   |     0   |    0   |     0   | exists,up              |
| 13 | storage02-ib.lobj.eth6.org |  172G | 13.7G |    0   |     0   |    0   |     0   | backfillfull,exists,up |
| 14 | storage02-ib.lobj.eth6.org | 43.9G |  142G |    0   |     0   |    0   |     0   | exists,up              |
| 15 | storage02-ib.lobj.eth6.org | 79.2G |  106G |    0   |     0   |    0   |     0   | exists,up              |
| 16 | storage02-ib.lobj.eth6.org | 10.1G |  176G |    0   |     0   |    0   |     0   | exists,up              |
| 17 | storage03-ib.lobj.eth6.org |  102G | 83.4G |    0   |     0   |    0   |     0   | exists,up              |
| 18 | storage03-ib.lobj.eth6.org | 10.2G |  176G |    0   |     0   |    0   |     0   | exists,up              |
| 19 | storage03-ib.lobj.eth6.org | 10.1G |  176G |    0   |     0   |    0   |     0   | exists,up              |
| 20 | storage03-ib.lobj.eth6.org |  137G | 49.1G |    0   |     0   |    0   |     0   | exists,up              |
+----+----------------------------+-------+-------+--------+---------+--------+---------+------------------------+

The ceph osd status output shows that OSD 13 is nearly full (it is in the backfillfull state).
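Besides ceph osd status, the per-OSD utilization and the exact health warning can be cross-checked; a minimal sketch (output omitted):

# Per-OSD size, use, available space and PG count; osd.13 should show a high %USE.
[root@storage02-ib ~]# ceph osd df

# Health detail names the offending OSD explicitly, e.g. "osd.13 is backfill full".
[root@storage02-ib ~]# ceph health detail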

[root@storage02-ib ~]# ceph -s
  cluster:
    id:     0f7be0a4-2a05-4658-8829-f3d2f62579d2
    health: HEALTH_WARN
            1 backfillfull osd(s)
            5 pool(s) backfillfull
            367931/4527742 objects misplaced (8.126%)

  services:
    mon: 3 daemons, quorum storage01-ib,storage02-ib,storage03-ib
    mgr: storage01-ib(active), standbys: storage03-ib, storage02-ib
    osd: 19 osds: 19 up, 18 in; 41 remapped pgs
    rgw: 2 daemons active

  data:
    pools:   5 pools, 288 pgs
    objects: 2.26 M objects, 8.6 TiB
    usage:   18 TiB used, 43 TiB / 61 TiB avail
    pgs:     367931/4527742 objects misplaced (8.126%)
             247 active+clean
             40  active+remapped+backfill_wait
             1   active+remapped+backfilling

  io:
    client:   1.0 MiB/s rd, 65 op/s rd, 0 op/s wr

The warning 1 backfillfull osd(s) means one OSD has crossed the backfillfull threshold. Because the pools are not mapped to specific OSDs through CRUSH rules, every pool may place data on these OSDs by default, which is why 5 pool(s) backfillfull is reported as well.
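If the goal were only to buy headroom until the backfill finishes, the thresholds stored in the OSD map can be inspected and, with care, raised temporarily; a hedged sketch (0.92 is only an example value, the default backfillfull ratio is 0.90):

# Show the current nearfull / backfillfull / full ratios from the OSD map.
[root@storage02-ib ~]# ceph osd dump | grep -i ratio

# Temporarily raise the backfillfull threshold; remember to lower it again afterwards.
[root@storage02-ib ~]# ceph osd set-backfillfull-ratio 0.92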

The objects misplaced warning is also shown because a hard drive was replaced during earlier maintenance and that data is still being recovered.


The nearly full OSD is on NVMe storage that has not yet been removed from the cluster. We want to use these NVMe drives for other purposes, so they will be removed this time.

When the NVMe-backed OSDs 13-20 are removed, Ceph automatically rebalances their data onto the remaining OSDs. So we only need to mark each OSD out first, then delete the OSD, and finally remove the hardware.


Before removing any OSDs, check that the remaining OSDs have enough capacity. If too many OSDs are removed at once, the remaining ones can fill up, which degrades the cluster's read/write performance or stops I/O entirely.
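A rough way to do that check, as a sketch (compare the AVAIL of the OSDs that will stay against the USE of the ones being removed):

# Cluster-wide raw and per-pool usage.
[root@storage02-ib ~]# ceph df

# Per-OSD usage laid out along the CRUSH tree; the HDD OSDs 0-10 must have enough
# free space to absorb the data currently held by the NVMe OSDs 13-20.
[root@storage02-ib ~]# ceph osd df tree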

Rebalancing also consumes disk I/O and network bandwidth, so if the cluster serves online workloads, plan the removal window to avoid impacting them.
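If client traffic matters more than rebalance speed, backfill concurrency can be throttled before marking the OSDs out; a hedged sketch (the values are deliberately conservative examples, not tuned recommendations):

# Limit concurrent backfills and recovery ops per OSD for the duration of the move.
[root@storage02-ib ~]# ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'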


First, let's remove OSDs 13 and 14 as a demonstration.

[root@storage02-ib ~]# ceph osd out 13 14
marked out osd.13. marked out osd.14.


[root@storage02-ib ~]# ceph -s
  cluster:
    id:     0f7be0a4-2a05-4658-8829-f3d2f62579d2
    health: HEALTH_WARN
            258661/4527742 objects misplaced (5.713%)
            Degraded data redundancy: 22131/4527742 objects degraded (0.489%), 3 pgs degraded

  services:
    mon: 3 daemons, quorum storage01-ib,storage02-ib,storage03-ib
    mgr: storage01-ib(active), standbys: storage03-ib, storage02-ib
    osd: 19 osds: 19 up, 11 in; 36 remapped pgs
    rgw: 2 daemons active

  data:
    pools:   5 pools, 288 pgs
    objects: 2.26 M objects, 8.6 TiB
    usage:   18 TiB used, 43 TiB / 61 TiB avail
    pgs:     22131/4527742 objects degraded (0.489%)
             258661/4527742 objects misplaced (5.713%)
             active+remapped+backfill_wait
             5 active+remapped+backfilling
             2 active+undersized+degraded+remapped+backfill_wait
             1 active+undersized+degraded+remapped+backfilling

  io:
    recovery: 90 MiB/s, 22 objects/s

The output now reports 258661/4527742 objects misplaced and 22131/4527742 objects degraded; this is the data being rebalanced automatically.
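The rebalance can be followed until both counters drop to zero, for example:

# One-line summary of PG states and recovery/misplaced object counts.
[root@storage02-ib ~]# ceph pg stat

# Or stream status and recovery progress continuously (Ctrl-C to stop).
[root@storage02-ib ~]# ceph -w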

After the rebalance is complete, we can mark out and remove the remaining OSDs in turn, as sketched below.
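A minimal sketch of that loop, assuming the remaining NVMe-backed IDs in this cluster are 15-20: mark one out, wait for the cluster health to settle, then move on to the next.

# Mark the next NVMe OSD out and wait for backfill to finish before continuing.
[root@storage02-ib ~]# ceph osd out 15
[root@storage02-ib ~]# ceph health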


Then, log in to the OSD server and stop the OSD daemon.

[root@storage02-ib ~]# systemctl stop ceph-osd@20
[root@storage02-ib ~]#
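Before going further, it is worth letting the cluster confirm that destroying the OSD will not reduce data durability; a hedged check (the command is available since Luminous):

# Reports whether osd.20 can be destroyed without losing redundancy.
[root@storage02-ib ~]# ceph osd safe-to-destroy 20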

After all daemons have stopped, run the purge command to remove the OSD from the cluster.

[root@storage02-ib ~]# ceph osd purge 20 --yes-i-really-mean-it
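For reference, purge wraps up what older releases required as three separate steps; the pre-Luminous equivalent would be roughly:

# Remove the OSD from the CRUSH map, delete its auth key, then delete the OSD itself.
[root@storage02-ib ~]# ceph osd crush remove osd.20
[root@storage02-ib ~]# ceph auth del osd.20
[root@storage02-ib ~]# ceph osd rm 20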


If the OSD is also listed in the ceph.conf file, remove its entry and redistribute the updated configuration to all nodes.
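For illustration, such an entry would look something like the snippet below (hypothetical; many deployments keep no per-OSD sections in ceph.conf at all):

# Hypothetical per-OSD section in /etc/ceph/ceph.conf that would need to be removed:
[osd.20]
    host = storage03-ib

After deleting it, copy the edited ceph.conf to the other nodes, for example with scp, or with ceph-deploy --overwrite-conf config push if the cluster is managed by ceph-deploy.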

Finally, just stop and unplug the hardware.
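If the NVMe drives are to be reused elsewhere, the leftover Ceph LVM metadata can be wiped first; a hedged sketch (the device path is an example, double-check it against the actual hardware):

# Destroy the Ceph LVs/VGs on the device and wipe it so it can be repurposed.
[root@storage02-ib ~]# ceph-volume lvm zap --destroy /dev/nvme0n1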


  • Reference

Docs.ceph.com/docs/master…