**Author: Xu Kai**
This article describes how to build a Ceph distributed storage cluster on three virtual machines. The steps are brief but have been verified to work. The environment-cleaning section resolves most failed-deployment problems, and the last section introduces common Ceph operations, which should help readers who have just set up their first cluster.
Three hosts run CentOS 7, and each host has three data disks (each virtual disk must be larger than 100 GB). The details are as follows:
[root@ceph-1 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@ceph-1 ~]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0   128G  0 disk
├─sda1            8:1    0   500M  0 part /boot
└─sda2            8:2    0 127.5G  0 part
  ├─centos-root 253:0    0    50G  0 lvm  /
  ├─centos-swap 253:1    0     2G  0 lvm  [SWAP]
  └─centos-home 253:2    0  75.5G  0 lvm  /home
sdb               8:16   0     2T  0 disk
sdc               8:32   0     2T  0 disk
sdd               8:48   0     2T  0 disk
sr0              11:0    1  1024M  0 rom
[root@ceph-1 ~]# cat /etc/hosts
..
192.168.57.222 ceph-1
192.168.57.223 ceph-2
192.168.57.224 ceph-3
The cluster configuration is as follows:
Host | IP | Function |
---|---|---|
ceph-1 | 192.168.57.222 | deploy, mon ×1, osd ×3 |
ceph-2 | 192.168.57.223 | mon ×1, osd ×3 |
ceph-3 | 192.168.57.224 | mon ×1, osd ×3 |
If a previous deployment failed, there is no need to remove the Ceph packages or rebuild the virtual machines. Simply run the following commands on every node to reset the environment to the state it was in right after the Ceph packages were installed. It is strongly recommended to clean the environment before rebuilding on top of an old cluster; otherwise all kinds of odd problems will appear.
ps aux | grep ceph | awk '{print $2}' | xargs kill -9
ps -ef | grep ceph
# Make sure every ceph process is gone at this point!!! If any remain, run the commands again.
umount /var/lib/ceph/osd/*
rm -rf /var/lib/ceph/osd/*
rm -rf /var/lib/ceph/mon/*
rm -rf /var/lib/ceph/mds/*
rm -rf /var/lib/ceph/bootstrap-mds/*
rm -rf /var/lib/ceph/bootstrap-osd/*
rm -rf /var/lib/ceph/bootstrap-rgw/*
rm -rf /var/lib/ceph/tmp/*
rm -rf /etc/ceph/*
rm -rf /var/run/ceph/*
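If you would rather not log in to every machine to run the cleanup by hand, the same steps can be driven from the deploy node over SSH. This is only a convenience sketch, assuming root SSH access to ceph-1/ceph-2/ceph-3 is already set up:

```bash
#!/bin/bash
# Run the cleanup steps on every node from the deploy node (ceph-1).
for node in ceph-1 ceph-2 ceph-3; do
    ssh root@${node} '
        # kill all ceph daemons ("[c]eph" keeps grep from matching itself)
        ps aux | grep "[c]eph" | awk "{print \$2}" | xargs -r kill -9
        umount /var/lib/ceph/osd/* 2>/dev/null
        rm -rf /var/lib/ceph/{osd,mon,mds,bootstrap-mds,bootstrap-osd,bootstrap-rgw,tmp}/*
        rm -rf /etc/ceph/* /var/run/ceph/*
    '
done
```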
The following commands need to be executed on every host:
yum clean all
rm -rf /etc/yum.repos.d/*.repo
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/CentOS-Base.repo
sed -i '/aliyuncs/d' /etc/yum.repos.d/epel.repo
sed -i 's/$releasever/7/g' /etc/yum.repos.d/CentOS-Base.repo
Add the Ceph yum repository:
vim /etc/yum.repos.d/ceph.repo
Add the following:
[ceph]
name=ceph
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/noarch/
gpgcheck=0
Install the ceph client:
yum makecache
yum install ceph ceph-radosgw rdate -y
Disable SELinux and firewalld:
sed -i 's/SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0
systemctl stop firewalld
systemctl disable firewalld
Synchronize the time of each node:
yum -y install rdate
rdate -s time-a.nist.gov
echo "rdate -s time-a.nist.gov" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
Install ceph-deploy on the deployment node (ceph-1). In the rest of this article, "deployment node" always refers to ceph-1:
[root@ceph-1 ~]# yum -y install ceph-deploy
[root@ceph-1 ~]# ceph-deploy --version
1.5.34
[root@ceph-1 ~]# ceph -v
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Create a deployment directory on the deployment node and start deployment:
[root@ceph-1 ~]# cd
[root@ceph-1 ~]# mkdir cluster
[root@ceph-1 ~]# cd cluster/
[root@ceph-1 cluster]# ceph-deploy new ceph-1 ceph-2 ceph-3
If you have not run ssh-copy-id to each node beforehand, you will be prompted for passwords. The log of the process looks like this:
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.34): /usr/bin/ceph-deploy new ceph-1 ceph-2 ceph-3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username       : None
[ceph_deploy.cli][INFO  ]  func           : <function new at 0x7f91781f96e0>
[ceph_deploy.cli][INFO  ]  verbose        : False
[ceph_deploy.cli][INFO  ]  overwrite_conf : False
[ceph_deploy.cli][INFO  ]  quiet          : False
[ceph_deploy.cli][INFO  ]  cd_conf        : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f917755ca28>
[ceph_deploy.cli][INFO  ]  cluster        : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey    : True
[ceph_deploy.cli][INFO  ]  mon            : ['ceph-1', 'ceph-2', 'ceph-3']
....
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO  ] will connect again with password prompt
The authenticity of host 'ceph-2 (192.168.57.223)' can't be established.
ECDSA key fingerprint is ef:e2:3e:38:fa:47:f4:61:b7:4d:d3:24:de:d4:7a:54.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-2,192.168.57.223' (ECDSA) to the list of known hosts.
root
root@ceph-2's password:
[ceph-2][DEBUG ] connected to host: ceph-2
....
[ceph_deploy.new][DEBUG ] Resolving host ceph-3
[ceph_deploy.new][DEBUG ] Monitor ceph-3 at 192.168.57.224
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-1', 'ceph-2', 'ceph-3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.57.222', '192.168.57.223', '192.168.57.224']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
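To avoid these password prompts, you can distribute the deploy node's SSH key to every host beforehand. A minimal sketch, using the default OpenSSH key path:

```bash
# On the deploy node (ceph-1): generate a key once (skip if one already exists),
# then copy it to every node; each root password is asked for only this one time.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for node in ceph-1 ceph-2 ceph-3; do
    ssh-copy-id root@${node}
done
```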
After `ceph-deploy new` completes, the deployment directory contains the following files:
[root@ceph-1 cluster]# ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
Add public_network to ceph.conf according to your own IP configuration, and slightly increase the allowed clock drift between mons (the default is 0.05 s; here it is raised to 2 s):
[root@ceph-1 cluster]# echo public_network=192.168.57.0/24 >> ceph.conf
[root@ceph-1 cluster]# echo mon_clock_drift_allowed = 2 >> ceph.conf
[root@ceph-1 cluster]# cat ceph.conf
[global]
fsid = 0248817a-b758-4d6b-a217-11248b098e10
mon_initial_members = ceph-1, ceph-2, ceph-3
mon_host =192.168.57.222,192.168.57.223,192.168.57.224
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network=192.168.57.0/24
mon_clock_drift_allowed = 2
Start deploying the monitors:
[root@ceph-1 cluster]# ceph-deploy mon create-initial
.... (some log output)
[root@ceph-1 cluster]# ls
ceph.bootstrap-mds.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log
**View the cluster status:**
[root@ceph-1 cluster]# ceph -s
    cluster 0248817a-b758-4d6b-a217-11248b098e10
     health HEALTH_ERR
            no osds
            Monitor clock skew detected
     monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
            election epoch 6, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e1: 0 osds: 0 up, 0 in
            flags sortbitwise
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating
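The `no osds` error is expected at this stage and disappears once the OSDs are deployed. The `Monitor clock skew detected` message usually means a node's clock has drifted again; a quick way to see which monitor is affected and re-sync it (the hostname ceph-2 below is only an example):

```bash
# Show the health problems in detail; the skewed monitor is named in the output.
ceph health detail
# Re-sync the clock on the offending node, substituting the real hostname.
ssh root@ceph-2 'rdate -s time-a.nist.gov'
```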
Start deploying the OSDs:
ceph-deploy --overwrite-conf osd prepare ceph-1:/dev/sdb ceph-1:/dev/sdc ceph-1:/dev/sdd ceph-2:/dev/sdb ceph-2:/dev/sdc ceph-2:/dev/sdd ceph-3:/dev/sdb ceph-3:/dev/sdc ceph-3:/dev/sdd --zap-disk
ceph-deploy --overwrite-conf osd activate ceph-1:/dev/sdb1 ceph-1:/dev/sdc1 ceph-1:/dev/sdd1 ceph-2:/dev/sdb1 ceph-2:/dev/sdc1 ceph-2:/dev/sdd1 ceph-3:/dev/sdb1 ceph-3:/dev/sdc1 ceph-3:/dev/sdd1
I hit a small problem during deployment: one OSD failed to deploy. Once all the other OSDs were up, it was fixed by simply redeploying that OSD (a sketch follows the status output below). If nothing goes wrong, the cluster status should look like this:
[root@ceph-1 cluster]# ceph -s
    cluster 0248817a-b758-4d6b-a217-11248b098e10
     health HEALTH_WARN
            too few PGs per OSD (21 < min 30)
     monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e45: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v82: 64 pgs, 1 pools, 0 bytes data, 0 objects
            273 MB used, 16335 GB / 16336 GB avail
                  64 active+clean
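For reference, a single failed OSD can be redeployed from the deploy node roughly as follows. The host and disk shown (ceph-2:/dev/sdc) are only examples; substitute the ones that actually failed:

```bash
# Wipe the failed disk, then prepare and activate it again (ceph-deploy 1.5.x host:disk syntax).
ceph-deploy disk zap ceph-2:/dev/sdc
ceph-deploy --overwrite-conf osd prepare ceph-2:/dev/sdc
ceph-deploy --overwrite-conf osd activate ceph-2:/dev/sdc1
```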
To clear the `too few PGs per OSD` warning, just increase the number of PGs in the rbd pool:
[root@ceph-1 cluster]# ceph osd pool set rbd pg_num 128
set pool 0 pg_num to 128
[root@ceph-1 cluster]# ceph osd pool set rbd pgp_num 128
set pool 0 pgp_num to 128
[root@ceph-1 cluster]# ceph -s
    cluster 0248817a-b758-4d6b-a217-11248b098e10
     health HEALTH_ERR
            19 pgs are stuck inactive for more than 300 seconds
            12 pgs peering
            19 pgs stuck inactive
     monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e49: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v96: 128 pgs, 1 pools, 0 bytes data, 0 objects
            308 MB used, 18377 GB / 18378 GB avail
                 103 active+clean
                  12 peering
                   9 creating
                   4 activating
After a short while the new PGs finish peering and become active+clean, and the cluster reaches HEALTH_OK:
[root@ceph-1 cluster]# ceph -s
    cluster 0248817a-b758-4d6b-a217-11248b098e10
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
            election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
     osdmap e49: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v99: 128 pgs, 1 pools, 0 bytes data, 0 objects
            310 MB used, 18377 GB / 18378 GB avail
                 128 active+clean
At this point, the cluster deployment is complete.
Please do not modify /etc/ceph/ceph.conf directly on individual nodes. Instead, edit the copy in the deployment directory on the deploy node (here ceph-1:/root/cluster/ceph.conf). With dozens of nodes it is impractical to edit them one by one, and pushing the file is fast and safe. After editing, push the conf file to every node with:
[root@ceph-1 cluster]# ceph-deploy --overwrite-conf config push ceph-1 ceph-2 ceph-3
At this point you need to restart the monitor service on each node; see the next section.
# monitor start/stop/restart
# ceph-1 is the hostname of the node the monitor runs on.
systemctl start ceph-mon@ceph-1
systemctl restart ceph-mon@ceph-1
systemctl stop ceph-mon@ceph-1
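After pushing a new ceph.conf, the monitor restart can be looped over all nodes from the deploy node; a small sketch, assuming passwordless SSH and the unit naming shown above:

```bash
# Run on the deploy node: restart the monitor on every node after a config push.
for node in ceph-1 ceph-2 ceph-3; do
    ssh root@${node} "systemctl restart ceph-mon@${node}"
done
```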
# OSD start/stop/restart
# 0 is the id of one of the node's OSDs; it can be looked up with `ceph osd tree`.
systemctl start/stop/restart ceph-osd@0
[root@ceph-1 cluster]# ceph osd tree
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 17.94685 root default
-2  5.98228     host ceph-1
 0  1.99409         osd.0        up  1.00000          1.00000
 1  1.99409         osd.1        up  1.00000          1.00000
 8  1.99409         osd.8        up  1.00000          1.00000
-3  5.98228     host ceph-2
 2  1.99409         osd.2        up  1.00000          1.00000
 3  1.99409         osd.3        up  1.00000          1.00000
 4  1.99409         osd.4        up  1.00000          1.00000
-4  5.98228     host ceph-3
 5  1.99409         osd.5        up  1.00000          1.00000
 6  1.99409         osd.6        up  1.00000          1.00000
 7  1.99409         osd.7        up  1.00000          1.00000
Originally published on the TStack WeChat public account.