Rook

Rook official site

Personal notes site: http://note.27ops.com

Some of the images may not be pullable from mainland China; pull them through a domestic mirror registry instead. For production, change the image timezone from the default UTC to CST.

Image share link

https://share.weiyun.com/MnHSOLNc

Run on the master node: docker load -i rook-img-master.tar
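After loading, it is worth confirming that the expected images are actually present on each node. A minimal sketch; the image names in the commented call are placeholders, not a verified list of the tar's contents:

```shell
# Print any required image (repo:tag) missing from the available-image list.
# $1: newline-separated list of available images; remaining args: required images.
missing_images() {
  local have="$1"; shift
  local img
  for img in "$@"; do
    grep -qxF "$img" <<<"$have" || echo "$img"
  done
}

# On a real node (commented out; the image names below are assumptions):
# missing_images "$(docker images --format '{{.Repository}}:{{.Tag}}')" \
#   rook/ceph:v1.6.2 quay.io/cephcsi/cephcsi:v3.3.1
```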

https://rook.io
https://rook.io/docs/rook/v1.6/ceph-quickstart.html

Server list

IP address      Hostname   Roles                                                 Disks
192.168.31.10   kmaster    deploy node, mons, mgrs, osds, rgws, mdss, clients    sda/sdb
192.168.31.11   knode01    mons, mgrs, osds, rgws, mdss, clients                 sda/sdb
192.168.31.12   knode02    mons, mgrs, osds, rgws, mdss, clients                 sda/sdb

OS version: Rocky Linux release 8.3

[root@kmaster prometheus]# kubectl get node -o wide
NAME      STATUS     ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE        KERNEL-VERSION               CONTAINER-RUNTIME
kmaster   Ready      control-plane,master   21d   v1.21.0   192.168.31.10   <none>        Rocky Linux 8   4.18.0-240.22.1.el8.x86_64   docker://20.10.6
knode01   Ready      <none>                 21d   v1.21.0   192.168.31.11   <none>        Rocky Linux 8   4.18.0-240.22.1.el8.x86_64   docker://20.10.6
knode02   Ready      <none>                 21d   v1.21.0   192.168.31.12   <none>        Rocky Linux 8   4.18.0-240.22.1.el8.x86_64   docker://20.10.6
[root@kmaster prometheus]# 

Disk list

[root@kmaster ceph]# lsblk 
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0  100G  0 disk 
├─sda1        8:1    0    1G  0 part /boot
└─sda2        8:2    0   99G  0 part 
  ├─rl-root 253:0    0 63.9G  0 lvm  /
  ├─rl-swap 253:1    0    4G  0 lvm  
  └─rl-home 253:2    0 31.2G  0 lvm  /home
sdb           8:16   0   50G  0 disk 
sr0          11:0    1  1.8G  0 rom  
[root@kmaster ceph]# 
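Rook will only provision an OSD on sdb if the device carries no partition table or filesystem. A small helper of my own (not a Rook tool) to check that, plus the usual cleanup commands, shown commented because they are destructive:

```shell
# Decide whether a device is empty enough for Rook to consume as an OSD.
# $1: output of `lsblk -n -o NAME,FSTYPE /dev/sdX` for a single device.
disk_is_clean() {
  local out="$1"
  # exactly one line (no child partitions) and no FSTYPE on it
  [ "$(wc -l <<<"$out")" -eq 1 ] && [ -z "$(awk '{print $2}' <<<"$out")" ]
}

# Real usage on a node (the wipe is destructive -- double-check the device!):
# disk_is_clean "$(lsblk -n -o NAME,FSTYPE /dev/sdb)" \
#   || { wipefs -a /dev/sdb; sgdisk --zap-all /dev/sdb; }
```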

Tsinghua Ceph mirror repository

[root@kmaster 3ingress]# cat ceph_tsinghua.repo 
[ceph_stable]
baseurl = https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-octopus/el8/$basearch
gpgcheck = 1
gpgkey = https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc
name = Ceph Stable $basearch repo
priority = 2

[ceph_stable_noarch]
baseurl = https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-octopus/el8/noarch
gpgcheck = 1
gpgkey = https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc
name = Ceph Stable noarch repo
priority = 2

[root@kmaster 3ingress]# 

Fetch the Rook source code

root@kmaster ~# git clone --single-branch --branch v1.6.2 https://github.com/rook/rook.git
Cloning into 'rook'...
remote: Enumerating objects: 63522, done.
remote: Counting objects: 100% (520/520), done.
remote: Compressing objects: 100% (283/283), done.
remote: Total 63522 (delta 277), reused 377 (delta 218), pack-reused 63002
Receiving objects: 100% (63522/63522), 37.91 MiB | 11.09 MiB/s, done.
Resolving deltas: 100% (44391/44391), done.
Note: switching to 'e8fd65f0886be00978cae66d56cdb3d32dc2de26'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
  git switch -c <new-branch-name>
Or undo this operation with:
  git switch -
Turn off this advice by setting config variable advice.detachedHead to false
root@kmaster ~#

Note: with a 3-node deployment, the master node must be made schedulable (remove its NoSchedule taint):

kubectl taint node kmaster node-role.kubernetes.io/master-

root@kmaster ceph# kubectl taint node kmaster node-role.kubernetes.io/master-
node/kmaster untainted
root@kmaster ceph# 

Apply the Rook YAML manifests

git clone --single-branch --branch v1.6.2 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml

Creation output

[root@kmaster ceph]# kubectl create -f crds.yaml -f common.yaml -f operator.yaml
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclients.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectrealms.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzonegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzones.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephrbdmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/objectbucketclaims.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/objectbuckets.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/volumereplicationclasses.replication.storage.openshift.io created
customresourcedefinition.apiextensions.k8s.io/volumereplications.replication.storage.openshift.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
namespace/rook-ceph created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-object-bucket created
serviceaccount/rook-ceph-admission-controller created
clusterrole.rbac.authorization.k8s.io/rook-ceph-admission-controller-role created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-admission-controller-rolebinding created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-object-bucket created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/00-rook-privileged created
clusterrole.rbac.authorization.k8s.io/psp:rook created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-default-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter-psp created
serviceaccount/rook-csi-cephfs-plugin-sa created
serviceaccount/rook-csi-cephfs-provisioner-sa created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
serviceaccount/rook-csi-rbd-plugin-sa created
serviceaccount/rook-csi-rbd-provisioner-sa created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created
[root@kmaster ceph]#

Configure cluster.yaml

vim cluster.yaml

nodes:
- name: "kmaster"
  devices:
  - name: "sdb"
- name: "knode01"
  devices:
  - name: "sdb"
- name: "knode02"
  devices:
  - name: "sdb"

kubectl create -f cluster.yaml
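Note that an explicit nodes: list only takes effect when storage.useAllNodes is set to false in cluster.yaml (and a per-node devices list likewise requires useAllDevices: false). Startup takes several minutes; a simple readiness poll can be sketched like this (the helper is my own, not part of Rook):

```shell
# Return success when every pod in the rook-ceph namespace is Running or
# Completed. $1: output of `kubectl -n rook-ceph get pod --no-headers`.
all_pods_settled() {
  ! awk '$3 != "Running" && $3 != "Completed" { bad = 1 } END { exit !bad }' <<<"$1"
}

# Real usage:
# until all_pods_settled "$(kubectl -n rook-ceph get pod --no-headers)"; do
#   sleep 10
# done
```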

The operator pod comes up quickly; the remaining pods take a while to start.

[root@kmaster ceph]# kubectl get pod -n rook-ceph
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-6459f5dc4b-wt4x9   1/1     Running   0          90m
[root@kmaster ceph]# 

Detailed progress can be followed in the operator's log:

[root@kmaster install-k8s]# kubectl logs -f -n rook-ceph rook-ceph-operator-6459f5dc4b-wt4x9
...
2021-08-21 10:43:46.956759 I | op-mgr: successful modules: balancer
2021-08-21 10:43:47.353528 I | op-mgr: setting ceph dashboard "admin" login creds
2021-08-21 10:43:48.576915 I | op-mgr: successfully set ceph dashboard creds
2021-08-21 10:43:48.661574 I | op-osd: updating OSD 0 on node "kmaster"
2021-08-21 10:43:48.770004 I | op-osd: OSD orchestration status for node knode01 is "orchestrating"
2021-08-21 10:43:48.770268 I | op-osd: OSD orchestration status for node kmaster is "completed"
2021-08-21 10:43:48.781108 I | op-osd: OSD orchestration status for node knode01 is "completed"
2021-08-21 10:43:48.813739 I | op-osd: OSD orchestration status for node knode02 is "orchestrating"
2021-08-21 10:43:50.571987 I | op-osd: updating OSD 1 on node "knode01"
2021-08-21 10:43:50.608450 I | op-osd: OSD orchestration status for node knode02 is "completed"
2021-08-21 10:43:52.451878 I | op-osd: updating OSD 2 on node "knode02"
2021-08-21 10:43:52.597494 I | op-mgr: dashboard config has changed. restarting the dashboard module
2021-08-21 10:43:52.597515 I | op-mgr: restarting the mgr module
2021-08-21 10:43:53.411680 I | cephclient: successfully disallowed pre-octopus osds and enabled all new octopus-only functionality
2021-08-21 10:43:53.411730 I | op-osd: finished running OSDs in namespace "rook-ceph"
2021-08-21 10:43:53.411841 I | ceph-cluster-controller: done reconciling ceph cluster in namespace "rook-ceph"
2021-08-21 10:43:54.241276 I | op-mgr: successful modules: dashboard

Once deployment finishes, check the pod status in the namespace:

[root@kmaster ceph]# kubectl get  pod -n rook-ceph 
NAME                                                READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-4wr2d                              3/3     Running     0          6m56s
csi-cephfsplugin-nztms                              3/3     Running     0          6m9s
csi-cephfsplugin-provisioner-6f75644874-jxb6n       6/6     Running     0          3m38s
csi-cephfsplugin-provisioner-6f75644874-shszs       6/6     Running     0          6m56s
csi-cephfsplugin-rbvfd                              3/3     Running     0          6m56s
csi-rbdplugin-6h8nb                                 3/3     Running     0          6m57s
csi-rbdplugin-htpnp                                 3/3     Running     0          6m9s
csi-rbdplugin-jzf6n                                 3/3     Running     0          6m57s
csi-rbdplugin-provisioner-67fb987799-fxj68          6/6     Running     0          6m56s
csi-rbdplugin-provisioner-67fb987799-zzm2x          6/6     Running     0          3m18s
rook-ceph-crashcollector-kmaster-7596c6f695-z784j   1/1     Running     0          5m5s
rook-ceph-crashcollector-knode01-5c75d4cbc8-z45lb   1/1     Running     0          6m3s
rook-ceph-crashcollector-knode02-67d58f7c55-m8rvj   1/1     Running     0          5m55s
rook-ceph-mgr-a-cfdb8d4b8-kxxr9                     1/1     Running     0          5m29s
rook-ceph-mon-a-77dbfbb9b6-wlzbw                    1/1     Running     0          6m7s
rook-ceph-mon-b-65d59f4667-hvdbt                    1/1     Running     0          5m55s
rook-ceph-mon-c-9c8b69b9c-4x86g                     1/1     Running     0          5m42s
rook-ceph-operator-6459f5dc4b-pq8gc                 1/1     Running     0          7m20s
rook-ceph-osd-0-6dd858b9b5-xqlc8                    1/1     Running     0          5m5s
rook-ceph-osd-1-8596cc946c-cldp5                    1/1     Running     0          5m2s
rook-ceph-osd-2-76c758bd77-gpwx9                    1/1     Running     0          5m
rook-ceph-osd-prepare-kmaster-nfg56                 0/1     Completed   0          4m41s
rook-ceph-osd-prepare-knode01-wb6s8                 0/1     Completed   0          4m39s
rook-ceph-osd-prepare-knode02-wgklb                 0/1     Completed   0          4m36s
rook-ceph-tools-7467d8bf8-x7scq                     1/1     Running     0          47s
[root@kmaster ceph]# 

Rook Toolbox

https://rook.io/docs/rook/v1.6/ceph-toolbox.html
cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f toolbox.yaml
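The toolbox deployment may take a moment to become ready; `kubectl -n rook-ceph rollout status deploy/rook-ceph-tools` blocks until it is. The same check written as a parse helper (my own sketch) over the deployment's READY column:

```shell
# Check the READY column of a deployment line, e.g. "rook-ceph-tools 1/1 1 1 47s".
# $1: output of `kubectl -n rook-ceph get deploy rook-ceph-tools --no-headers`.
toolbox_ready() {
  awk '{ split($2, r, "/"); exit !(r[1] == r[2] && r[1] > 0) }' <<<"$1"
}

# Real usage:
# until toolbox_ready "$(kubectl -n rook-ceph get deploy rook-ceph-tools --no-headers)"; do
#   sleep 5
# done
```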

Cluster status

Checking the Ceph cluster status

There are two ways to do this.

Method 1: kubectl exec -it <toolbox-pod-name> -n rook-ceph -- /bin/bash

Then run ceph commands inside the container to inspect the cluster.

The output below was captured after the cluster had been repaired.

[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[root@rook-ceph-tools-7467d8bf8-x7scq /]# ceph -s
  cluster:
    id:     0ad47b5f-e055-4448-b8b6-5ab5ccd57799
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 50m)
    mgr: a(active, since 49m)
    osd: 3 osds: 3 up (since 50m), 3 in (since 50m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 132 GiB / 135 GiB avail
    pgs:     1 active+clean

[root@rook-ceph-tools-7467d8bf8-x7scq /]# 

Method 2: pass the command directly: kubectl exec -it <toolbox-pod-name> -n rook-ceph -- <ceph command>. For example:

kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq -n rook-ceph -- ceph -s
kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq -n rook-ceph -- ceph osd tree
kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq -n rook-ceph -- ceph osd pool ls
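Typing the full exec incantation for every query gets tedious. A small wrapper function (my own convenience, not part of Rook) routes any ceph command through the toolbox deployment, which also avoids hard-coding the pod's hash suffix:

```shell
# Run an arbitrary ceph command inside the toolbox pod.
ceph_exec() {
  kubectl -n rook-ceph exec -i deploy/rook-ceph-tools -- ceph "$@"
}

# Usage:
# ceph_exec -s
# ceph_exec osd tree
# ceph_exec osd pool ls
```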
[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph -- ceph -s
  cluster:
    id:     0ad47b5f-e055-4448-b8b6-5ab5ccd57799
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum a,b,c (age 7m)
    mgr: a(active, since 6m)
    osd: 3 osds: 3 up (since 6m), 3 in (since 6m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 132 GiB / 135 GiB avail
    pgs:     1 active+clean

[root@kmaster ceph]#

Fix: disable insecure global_id reclaim on the mons

ceph config set mon auth_allow_insecure_global_id_reclaim false
[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph -- ceph config set mon auth_allow_insecure_global_id_reclaim false
[root@kmaster ceph]#
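To confirm the change took effect without scanning the whole status block, the health token can be extracted from `ceph -s` output (the helper is my own sketch):

```shell
# Pull the HEALTH_* token out of `ceph -s` text passed as $1.
ceph_health() {
  awk '/health:/ { print $2 }' <<<"$1"
}

# Real usage:
# ceph_health "$(kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph -s)"
```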

Check the cluster status again

[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph -- ceph -s
  cluster:
    id:     0ad47b5f-e055-4448-b8b6-5ab5ccd57799
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 8m)
    mgr: a(active, since 7m)
    osd: 3 osds: 3 up (since 7m), 3 in (since 7m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 132 GiB / 135 GiB avail
    pgs:     1 active+clean

[root@kmaster ceph]# 

Check the OSDs

[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph -- ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
-1         0.13170  root default                               
-3         0.04390      host kmaster                           
 0    hdd  0.04390          osd.0         up   1.00000  1.00000
-5         0.04390      host knode01                           
 1    hdd  0.04390          osd.1         up   1.00000  1.00000
-7         0.04390      host knode02                           
 2    hdd  0.04390          osd.2         up   1.00000  1.00000
[root@kmaster ceph]# 
[root@rook-ceph-tools-7467d8bf8-x7scq /]# ceph osd status
ID  HOST      USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE      
 0  kmaster  1027M  43.9G      0        0       0        0   exists,up  
 1  knode01  1027M  43.9G      0        0       0        0   exists,up  
 2  knode02  1027M  43.9G      0        0       0        0   exists,up  
[root@rook-ceph-tools-7467d8bf8-x7scq /]# 

Pools

[root@kmaster ceph]# kubectl exec -it rook-ceph-tools-7467d8bf8-x7scq /bin/bash -n rook-ceph -- ceph osd pool ls
device_health_metrics
[root@kmaster ceph]# 

Create a pool

[root@rook-ceph-tools-7467d8bf8-x7scq /]# ceph osd pool create k8s 128 128
pool 'k8s' created
[root@rook-ceph-tools-7467d8bf8-x7scq /]# 
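Two follow-ups worth knowing: Ceph warns about pools with no application tag, so tag the new pool inside the toolbox (e.g. `ceph osd pool application enable k8s rbd`); and pg_num is conventionally sized around (OSDs x 100) / replica count, taken down to a power of two. A tiny calculator for the latter (my own sketch):

```shell
# Largest power of two not exceeding (osds * 100 / replicas).
pg_count() {
  local osds=$1 replicas=$2
  local target=$(( osds * 100 / replicas ))
  local pg=1
  while [ $(( pg * 2 )) -le "$target" ]; do
    pg=$(( pg * 2 ))
  done
  echo "$pg"
}

# pg_count 3 3  -> 64, suggesting the 128 used above is on the high side
# for a 3-OSD cluster
```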

Dashboard

Dashboard docs: https://rook.io/docs/rook/v1.7/ceph-dashboard.html

[root@kmaster ceph]# kubectl apply -f dashboard-external-https.yaml 
service/rook-ceph-mgr-dashboard-external-https created
[root@kmaster ceph]#
[root@kmaster ceph]# kubectl get svc -n rook-ceph |grep mgr-dashboard
rook-ceph-mgr-dashboard                  ClusterIP   10.98.30.42      <none>        8443/TCP            36m
rook-ceph-mgr-dashboard-external-http    NodePort    10.104.249.254   <none>        7000:30109/TCP      2m45s
rook-ceph-mgr-dashboard-external-https   NodePort    10.109.252.219   <none>        8443:30653/TCP      2m49s
[root@kmaster ceph]# 
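The NodePort (30653 above) can also be read programmatically and combined with a node IP to form the URL; the jsonpath targets the standard Service spec:

```shell
# Build the dashboard URL from a node IP and the service's nodePort.
dashboard_url() {
  echo "https://$1:$2"
}

# Real lookup:
# port=$(kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-https \
#          -o jsonpath='{.spec.ports[0].nodePort}')
# dashboard_url 192.168.31.10 "$port"
```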

Open https://192.168.31.10:30653 (a node IP plus the NodePort) in a browser.

Dashboard login

The default username is admin. Retrieve the password with:

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
[root@kmaster ceph]# kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
[:@-m{^m8SPZHS<[xkJM
[root@kmaster ceph]# 

Resource status shown in the Ceph dashboard

OSD

Pool resource info, config, logs