Using Ceph: the command line - hiyachen - ChinaUnix blog

1. Ceph commands

1) Check the cluster status:

									ceph health
								
									ceph status
								
									ceph osd stat
								
									ceph osd dump
								
									ceph osd tree
								
									ceph mon dump
								
									ceph quorum_status
								
									ceph mds stat
								
									ceph mds dump

Try each of these commands in turn.
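These status commands are easy to script. As a minimal sketch, the helper below maps the first word of a `ceph health` line to an exit code. The function name `ceph_health_code` and the 0/1/2 convention are my own assumptions for illustration, not something Ceph defines.

```shell
# Map a `ceph health` status line to an exit code.
# Convention (an assumption of this sketch): 0 = OK, 1 = WARN, 2 = anything else.
ceph_health_code() {
    case "$1" in
        HEALTH_OK*)   return 0 ;;
        HEALTH_WARN*) return 1 ;;
        *)            return 2 ;;
    esac
}

# Usage on a node that can reach the cluster:
#   ceph_health_code "$(ceph health)" || echo "cluster needs attention"
```

Passing the status text as an argument keeps the function independent of the cluster, so it can be tried with sample strings first.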
2) Pools: a pool can be loosely thought of as a namespace.
List the existing pools:

									[root@test-2 ~]# ceph osd lspools
								
									0 data,1 metadata,2 rbd,

Show the pg_num attribute of the data pool:

									[root@test-1 ~]# ceph osd pool get data pg_num
								
									pg_num: 256

Show the pgp_num attribute of the data pool:

									[root@test-1 ~]# ceph osd pool get data pgp_num
								
									pgp_num: 256

Create a pool named 'test-pool':

									[root@test-1 ~]# ceph osd pool create test-pool 256 256
								
									pool 'test-pool' created
								
									[root@test-1 ~]# ceph osd lspools
								
									0 data,1 metadata,2 rbd,3 test-pool,
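The two numbers passed to `ceph osd pool create` are pg_num and pgp_num. A rule of thumb long quoted in the Ceph placement-group documentation is (number of OSDs x 100) / replica count, rounded up to a power of two, for the cluster as a whole. The sketch below only does that arithmetic; the function name is hypothetical, and the value 256 used above was the author's choice, not the output of this rule. Check the guidance for your release before relying on it.

```shell
# Suggest a total PG count: (osds * 100) / replicas, rounded up to a power of two.
# Arguments: number of OSDs, replica count.
suggest_pg_num() {
    spn_target=$(( $1 * 100 / $2 ))
    spn_pg=1
    while [ "$spn_pg" -lt "$spn_target" ]; do
        spn_pg=$(( spn_pg * 2 ))
    done
    echo "$spn_pg"
}

suggest_pg_num 3 3    # 3 OSDs, 3 replicas: prints 128
```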

Delete 'test-pool' (the pool name must be given twice):

									[root@test-1 ~]# ceph osd pool delete test-pool test-pool  --yes-i-really-really-mean-it
								
									pool 'test-pool' deleted
								
									[root@test-1 ~]# ceph osd lspools
								
									0 data,1 metadata,2 rbd,

3) CRUSH map
Fetch the CRUSH map from the running cluster:

									[root@test-1 ~]# ceph osd getcrushmap -o crush.map
								
									got crush map from osdmap epoch 734

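Decompile it into editable text and view the result:

									[root@test-1 ~]# crushtool -d crush.map -o crush.txt
								
									[root@test-1 ~]# cat crush.txt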
								
									# begin crush map
								
									# devices
								
									device 0 osd.0
								
									device 1 osd.1
								
									device 2 osd.2
								
									# types
								
									type 0 osd
								
									type 1 host
								
									type 2 rack
								
									type 3 row
								
									type 4 room
								
									type 5 datacenter
								
									type 6 root
								
									# buckets
								
									host test-1 {
								
									        id -2           # do not change unnecessarily
								
									        # weight 1.000
								
									        alg straw
								
									        hash 0  # rjenkins1
								
									        item osd.0 weight 1.000
								
									}
								
									host test-2 {
								
									        id -4           # do not change unnecessarily
								
									        # weight 1.000
								
									        alg straw
								
									        hash 0  # rjenkins1
								
									        item osd.1 weight 1.000
								
									}
								
									host test-3 {
								
									        id -5           # do not change unnecessarily
								
									        # weight 1.000
								
									        alg straw
								
									        hash 0  # rjenkins1
								
									        item osd.2 weight 1.000
								
									}
								
									rack unknownrack {
								
									        id -3           # do not change unnecessarily
								
									        # weight 3.000
								
									        alg straw
								
									        hash 0  # rjenkins1
								
									        item test-1 weight 1.000
								
									        item test-2 weight 1.000
								
									        item test-3 weight 1.000
								
									}
								
									root default {
								
									        id -1           # do not change unnecessarily
								
									        # weight 3.000
								
									        alg straw
								
									        hash 0  # rjenkins1
								
									        item unknownrack weight 3.000
								
									}
								
									# rules
								
									rule data {
								
									        ruleset 0
								
									        type replicated
								
									        min_size 1
								
									        max_size 10
								
									        step take default
								
									        step chooseleaf firstn 0 type host
								
									        step emit
								
									}
								
									rule metadata {
								
									        ruleset 1
								
									        type replicated
								
									        min_size 1
								
									        max_size 10
								
									        step take default
								
									        step chooseleaf firstn 0 type host
								
									        step emit
								
									}
								
									rule rbd {
								
									        ruleset 2
								
									        type replicated
								
									        min_size 1
								
									        max_size 10
								
									        step take default
								
									        step chooseleaf firstn 0 type host
								
									        step emit
								
									}
								
									# end crush map

Look closely at this output: do you notice anything interesting? See the official documentation for an explanation.
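One thing the output makes visible (my reading; the author does not say what they had in mind) is that the map is a tree: OSDs sit in hosts, hosts in a rack, the rack in the root, and each bucket's commented weight equals the sum of its items. A small awk sketch can check this on the decompiled text. The function name `crush_bucket_weights` is hypothetical.

```shell
# Print each bucket in a decompiled CRUSH map with the sum of its item weights.
# Reads the files given as arguments, or stdin if there are none.
crush_bucket_weights() {
    awk '
        # A bucket opens with "<type> <name> {"; rules open with "rule", so skip those.
        $3 == "{" && $1 != "rule" { bucket = $2; sum = 0; next }
        # Inside a bucket, add up "item <name> weight <w>" lines.
        bucket != "" && $1 == "item" { sum += $4 }
        # The closing brace ends the bucket: report and reset.
        bucket != "" && $1 == "}" { printf "%s %.3f\n", bucket, sum; bucket = "" }
    ' "$@"
}

# Usage: crush_bucket_weights crush.txt
```

For the map shown above this should report 1.000 for each host and 3.000 for unknownrack and default, matching the `# weight` comments.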
Once you have finished editing, compile the CRUSH map:

									crushtool -c crush.txt -o crush.map
								

Load the compiled CRUSH map into the cluster:

									ceph osd setcrushmap -i crush.map
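The whole round trip (fetch, decompile, edit, compile, load) can be wrapped in two small functions. This is a sketch only: the function names and the `crush.new` file name are my own choices. Writing the compiled result to a separate file keeps the originally fetched `crush.map` around in case the change has to be reverted.

```shell
# Fetch the current CRUSH map and decompile it to crush.txt for editing.
crush_fetch() {
    ceph osd getcrushmap -o crush.map &&
    crushtool -d crush.map -o crush.txt
}

# Compile the edited crush.txt and load it into the cluster.
# crush.new is an arbitrary name, chosen so crush.map survives as a backup.
crush_apply() {
    crushtool -c crush.txt -o crush.new &&
    ceph osd setcrushmap -i crush.new
}

# Usage:
#   crush_fetch && ${EDITOR:-vi} crush.txt && crush_apply
```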
								

2. Ceph block device commands

1) Basic operations
Create a block device image (the size is given in MB):

									[root@test-1 ~]# rbd create test-image --size 1024 --pool test-pool 
								
									[root@test-1 ~]# rbd ls test-pool
								
									test-image

Show the image's details:

									[root@test-1 ~]# rbd --image test-image info --pool test-pool
								
									rbd image 'test-image':
								
									        size 1024 MB in 256 objects
								
									        order 22 (4096 kB objects)
								
									        block_name_prefix: rb.0.1483.6b8b4567
								
									        format: 1
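The numbers in this output are related by simple arithmetic: `order 22` means each object is 2^22 bytes (4096 kB), and a 1024 MB image therefore spans 256 objects. A minimal sketch of that calculation, with hypothetical function names and assuming the image size is a whole multiple of the object size:

```shell
# Object size in kB for a given RBD "order" (objects are 2^order bytes).
rbd_object_kb() {
    echo $(( (1 << $1) / 1024 ))
}

# Number of objects backing an image. Arguments: size in MB, order.
rbd_object_count() {
    echo $(( $1 * 1024 / $(rbd_object_kb "$2") ))
}

rbd_object_kb 22            # prints 4096
rbd_object_count 1024 22    # prints 256
```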

Delete the image:

									[root@test-1 ~]# rbd rm test-image -p test-pool
								
									Removing image: 100% complete...done.

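2) Kernel modules
Sometimes we need to mount an image locally and modify its contents; this is what the map operation is for.
First load the rbd module into the kernel (make sure rbd support was selected when the kernel was upgraded earlier):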

									modprobe rbd
								

Map test-image (if you removed it in the previous step, create it again first):

									rbd map test-image --pool test-pool --id admin
								

List the mapped devices:

									[root@test-1 mycephfs]# rbd showmapped
								
									id pool      image      snap device   
								
									1  test-pool test-image -    /dev/rbd1
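In scripts it is safer to look the device up than to assume `/dev/rbd1`. The sketch below parses `rbd showmapped` output with the five-column layout shown above (id, pool, image, snap, device). The function name is hypothetical, and other releases may print different columns, so treat the column positions as an assumption.

```shell
# Print the device for a given pool and image.
# Reads `rbd showmapped` output on stdin; assumes columns: id pool image snap device.
rbd_device_for() {
    awk -v pool="$1" -v image="$2" \
        'NR > 1 && $2 == pool && $3 == image { print $5 }'
}

# Usage:
#   dev=$(rbd showmapped | rbd_device_for test-pool test-image)
```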

Let's look at the disk information for /dev/rbd1, create a filesystem on it with mkfs, mount it at /mnt/mycephfs, and create a file there containing the string 'hello':

									[root@test-1 ~]# fdisk -lu /dev/rbd1
								
									Disk /dev/rbd1: 1073 MB, 1073741824 bytes
								
									255 heads, 63 sectors/track, 130 cylinders, total 2097152 sectors
								
									Units = sectors of 1 * 512 = 512 bytes
								
									Sector size (logical/physical): 512 bytes / 512 bytes
								
									I/O size (minimum/optimal): 4194304 bytes / 4194304 bytes
								
									Disk identifier: 0x00000000
								
									[root@test-1 ~]# mkfs.ext4 /dev/rbd1
								
									mke2fs 1.41.12 (17-May-2010)
								
									Filesystem label=
								
									OS type: Linux
								
									Block size=4096 (log=2)
								
									Fragment size=4096 (log=2)
								
									Stride=1024 blocks, Stripe width=1024 blocks
								
									65536 inodes, 262144 blocks
								
									13107 blocks (5.00%) reserved for the super user
								
									First data block=0
								
									Maximum filesystem blocks=268435456
								
									8 block groups
								
									32768 blocks per group, 32768 fragments per group
								
									8192 inodes per group
								
									Superblock backups stored on blocks:
								
									        32768, 98304, 163840, 229376
								
									Writing inode tables: done                           
								
									Creating journal (8192 blocks): done
								
									Writing superblocks and filesystem accounting information: done
								
									This filesystem will be automatically checked every 33 mounts or
								
									180 days, whichever comes first.  Use tune2fs -c or -i to override.
								
									[root@test-1 ~]# mount /dev/rbd1 /mnt/mycephfs/       
								
									[root@test-1 ~]# ll /mnt/mycephfs/
								
									total 16
								
									drwx------ 2 root root 16384 Nov 27 13:40 lost+found
								
									[root@test-1 ~]# cd /mnt/mycephfs/
								
									[root@test-1 mycephfs]# ls
								
									lost+found
								
									[root@test-1 mycephfs]# echo 'hello' > hello.txt
								
									[root@test-1 mycephfs]# ls
								
									hello.txt  lost+found
								
									[root@test-1 mycephfs]# df -h /mnt/mycephfs/
								
									Filesystem            Size  Used Avail Use% Mounted on
								
									/dev/rbd1             976M  1.3M  908M   1% /mnt/mycephfs

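We can also change the size of the image. The first attempt below fails because --pool is omitted, so rbd looks for test-image in the default pool. Once the resize succeeds the block device grows immediately, but the filesystem keeps its old size until resize2fs is run: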

									[root@test-1 mycephfs]# rbd resize --size 2048 test-image
								
									rbd: error opening image test-image: (2) No such file or directory
								
									2013-11-27 13:48:24.290564 7fcf3b185760 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
								
									[root@test-1 mycephfs]# rbd resize --size 2048 test-image --pool test-pool
								
									Resizing image: 100% complete...done.
								
									[root@test-1 mycephfs]# df -h /mnt/mycephfs/            
								
									Filesystem            Size  Used Avail Use% Mounted on
								
									/dev/rbd1             976M  1.3M  908M   1% /mnt/mycephfs
								
									[root@test-1 mycephfs]# blockdev --getsize64 /dev/rbd1
								
									2147483648
								
									[root@test-1 mycephfs]# resize2fs /dev/rbd1
								
									resize2fs 1.41.12 (17-May-2010)
								
									Filesystem at /dev/rbd1 is mounted on /mnt/mycephfs; on-line resizing required
								
									old desc_blocks = 1, new_desc_blocks = 1
								
									Performing an on-line resize of /dev/rbd1 to 524288 (4k) blocks.
								
									The filesystem on /dev/rbd1 is now 524288 blocks long.
								
									[root@test-1 mycephfs]# df -h /mnt/mycephfs/
								
									Filesystem            Size  Used Avail Use% Mounted on
								
									/dev/rbd1             2.0G  1.6M  1.9G   1% /mnt/mycephfs
								
									[root@test-1 mycephfs]# ls
								
									hello.txt  lost+found
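The figures in this transcript can be cross-checked by hand: 2048 MB is 2147483648 bytes, which is what `blockdev --getsize64` reports, and at a 4 kB block size that is the 524288 blocks resize2fs grows the filesystem to. A small sketch of the arithmetic, with hypothetical function names:

```shell
# Convert a size in MB (as passed to `rbd resize --size`) to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Number of filesystem blocks that fit in a device.
# Arguments: device size in bytes, filesystem block size in bytes.
blocks_for() {
    echo $(( $1 / $2 ))
}

mb_to_bytes 2048              # prints 2147483648
blocks_for 2147483648 4096    # prints 524288
```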

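When we have finished modifying the image contents we can unmap it; you need to umount it first. The next time you map and mount the image, the hello.txt created earlier will still be there.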

									[root@test-1 mnt]# umount /dev/rbd1
								
									[root@test-1 mnt]# rbd unmap /dev/rbd1
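The order matters: unmapping a device that is still mounted should be avoided. A minimal sketch that enforces it, using a hypothetical helper name:

```shell
# Unmount a mapped RBD device, then unmap it.
# The unmap is skipped if the unmount fails (for example, because the
# filesystem is still busy).
rbd_release() {
    umount "$1" && rbd unmap "$1"
}

# Usage:
#   rbd_release /dev/rbd1
```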

3) Snapshots
Sometimes we need to snapshot an image so that it can later be restored to its state at that moment.
Let's take a snapshot of test-image in test-pool:

									[root@test-1 mnt]# rbd snap create test-pool/test-image@mysnap
								
									rbd: failed to create snapshot: (22) Invalid argument
								
									2013-11-27 14:56:53.109819 7f5bea81d760 -1 librbd: failed to create snap id: (22) Invalid argument

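The command fails with 'Invalid argument'. It took me quite a while to work out that the problem was the '-' in the names 'test-pool' and 'test-image'.
So let's create a new pool called 'mypool' and an image called 'myimage' in it: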

									[root@test-1 ceph]# ceph osd pool create mypool 256 256
								
									pool 'mypool' created
								
									[root@test-1 ceph]# rbd create myimage --size 1024 --pool mypool
								
									[root@test-1 ceph]# rbd --pool mypool ls
								
									myimage

Good. Now create a snapshot named 'snapimage':

									[root@test-1 ceph]# rbd snap create mypool/myimage@snapimage
								

List the snapshots of myimage:

									[root@test-1 ceph]# rbd snap ls mypool/myimage
								
									SNAPID NAME         SIZE
								
									     2 snapimage 1024 MB

Next, let's test a snapshot by rolling back to it:

									[root@test-1 ceph]# rbd snap create mypool/myimage@snapimage3
								
									[root@test-1 ceph]# rbd map mypool/myimage
								
									[root@test-1 ceph]# mount /dev/rbd1 /mnt/mycephfs/
								
									[root@test-1 ceph]# ls /mnt/mycephfs/
								
									hello.txt  lost+found
								
									[root@test-1 ceph]# echo  'welcome to zhengtianbao.com ' > /mnt/mycephfs/info.txt
								
									[root@test-1 ceph]# ls /mnt/mycephfs/
								
									hello.txt  info.txt  lost+found
								
									[root@test-1 ceph]# umount /dev/rbd1
								
									[root@test-1 ceph]# rbd unmap /dev/rbd1
								
									[root@test-1 ceph]# rbd snap rollback mypool/myimage@snapimage3
								
									Rolling back to snapshot: 100% complete...done.
								
									[root@test-1 ceph]# rbd map mypool/myimage
								
									[root@test-1 ceph]# mount /dev/rbd1 /mnt/mycephfs/
								
									[root@test-1 ceph]# ls /mnt/mycephfs/
								
									hello.txt  lost+found

As expected, myimage is back in the state it was in when snapimage3 was taken, and the info.txt created afterwards is gone.
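The rollback sequence used above (umount, unmap, rollback, map, mount) can be collected into one function. This is a sketch under assumptions: the function name is hypothetical, and it assumes the image maps back to the same device after the rollback, as it did in the transcript.

```shell
# Roll an image back to a snapshot and remount it.
# Arguments: pool image snapshot device mountpoint
# Each step runs only if the previous one succeeded.
rbd_rollback_remount() {
    umount "$5" &&
    rbd unmap "$4" &&
    rbd snap rollback "$1/$2@$3" &&
    rbd map "$1/$2" &&
    mount "$4" "$5"
}

# Usage:
#   rbd_rollback_remount mypool myimage snapimage3 /dev/rbd1 /mnt/mycephfs
```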
Delete a snapshot:

									[root@test-1 ceph]# rbd snap ls mypool/myimage
								
									SNAPID NAME          SIZE
								
									     2 snapimage  1024 MB
								
									     3 snapimage2 1024 MB
								
									     4 snapimage3 1024 MB
								
									[root@test-1 ceph]# rbd snap rm mypool/myimage@snapimage
								
									[root@test-1 ceph]# rbd snap ls mypool/myimage         
								
									SNAPID NAME          SIZE
								
									     3 snapimage2 1024 MB
								
									     4 snapimage3 1024 MB

Delete all snapshots of myimage:

									[root@test-1 ceph]# rbd snap purge mypool/myimage
								
									Removing all snapshots: 100% complete...done.

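4) libvirt
Ceph block devices can be used with libvirt: a device in a libvirt domain definition can be backed by a Ceph block device.
libvirt is essentially a middle layer; its relationship to rbd looks roughly like this: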

									libvirt-->qemu-->librbd-->librados-->osds
								
									                                |--->monitors

More on libvirt and qemu when I get the chance.
Also, make sure rbd was enabled when qemu was configured.
First we need a prepared disk image; here I use a CentOS 6 image:

									[root@test-1 ~]# file centos6
								
									centos6: x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, boot drive 0x80, 1st sector stage2 0x849d4, GRUB version 0.94; partition 1: ID=0x83, active, starthead 32, startsector 2048, 1024000 sectors; partition 2: ID=0x8e, starthead 221, startsector 1026048, 19945472 sectors, code offset 0x48

Use qemu-img convert to put this image into mypool under the name centos:

									[root@test-1 ceph]# qemu-img convert ~/centos6 rbd:mypool/centos
								
									[root@test-1 ceph]# rbd ls --pool mypool
								
									centos
								
									myimage
								
									[root@test-1 ceph]# rbd info centos --pool mypool
								
									rbd image 'centos':
								
									        size 10240 MB in 2560 objects
								
									        order 22 (4096 kB objects)
								
									        block_name_prefix: rb.0.14d4.6b8b4567
								
									        format: 1

Then create the domain XML file that libvirt needs; this is only a simple example.
test.xml

									
									  test-ceph
								
									  4194304
								
									  4194304
								
									  4
								
									    hvm
								
									  destroy
								
									  restart
								
									  restart
								
									    /usr/libexec/qemu-kvm
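The part of a domain definition that ties it to Ceph is a network disk using the rbd protocol. The helper below prints such an element as a sketch only: it is not the author's original XML, and the function name, the monitor address, the `vda` target and the virtio bus are all assumptions for illustration. If cephx authentication is enabled, libvirt also needs an auth element referring to a secret, which is left out here.

```shell
# Print a libvirt <disk> element backed by an RBD image.
# Arguments: pool/image, monitor host (both supplied by the caller).
# Target device "vda" and bus "virtio" are assumptions of this sketch.
rbd_disk_xml() {
    cat <<EOF
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='$1'>
    <host name='$2' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
EOF
}

# Usage (the monitor address is a placeholder):
#   rbd_disk_xml mypool/centos 192.168.0.1
```

The printed element belongs inside the devices section of the domain XML.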

Next, create the virtual machine with virsh and look up its VNC display:

									[root@test-1 ceph]# virsh define test.xml
								
									[root@test-1 ceph]# virsh list --all
								
									 Id    Name                           State
								
									----------------------------------------------------
								
									 -     test-ceph                      shut off
								
									[root@test-1 ceph]# virsh start test-ceph
								
									Domain test-ceph started
								
									[root@test-1 ceph]# virsh list
								
									 Id    Name                           State
								
									----------------------------------------------------
								
									 1     test-ceph                      running
								
									[root@test-1 ceph]# virsh vncdisplay 1
								
									:0

OK, now we can connect a VNC client to port 5900 on the host and work inside the virtual machine. You can also test Ceph's read/write performance from inside the VM.
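The port follows from the display number: VNC display :N listens on TCP port 5900 + N, so the `:0` reported by `virsh vncdisplay` means port 5900. A minimal sketch of the conversion, with a hypothetical function name:

```shell
# Convert a VNC display such as ":0" or "127.0.0.1:1" to its TCP port.
vnc_port() {
    # ${1#*:} strips everything up to and including the first colon.
    echo $(( 5900 + ${1#*:} ))
}

vnc_port ":0"    # prints 5900
```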

Some links:

[1] IBM's introduction to Ceph: http://www.ibm.com/developerworks/cn/linux/l-ceph/
[2] Ceph architecture: http://www.ustack.com/blog/ceph_infra/
[3] Ceph performance testing:
