1 Preparation

To deploy the cluster, see: RookCeph installation

1.1 Create a Ceph storage pool

ceph osd pool create rbd 128
ceph osd pool set-quota rbd max_bytes $((20 * 1024 * 1024 * 1024))   # 20 GiB quota for the pool
rbd pool init rbd
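
A quick sanity check on the new pool and its quota (these verification commands are an addition, not part of the original steps):

ceph osd pool ls detail | grep rbd
ceph osd pool get-quota rbd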

1.1.1 Check the cluster status

The output below is used in csi-config-map.yaml.

ceph -s
  cluster:
    id:     xxxxxxx-67a0f564e0d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 7d)
    mgr: ceph01.pjvndt(active, since 7d), standbys: ceph02.injlkl, ceph03.sulrio
    mds: 1/1 daemons up, 2 standby
    osd: 8 osds: 8 up (since 7d), 8 in (since 7d)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   10 pools, 265 pgs
    objects: 269 objects, 8.3 KiB
    usage:   20 GiB used, 400 GiB / 420 GiB avail
    pgs:     265 active+clean

1.1.2 Check the user key

The key below is used in csi-rbd-secret.yaml.

ceph auth get client.admin
[client.admin]
        key = xxxxxxxxMLkw==
        caps mds = "allow *"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
exported keyring for client.admin
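
If you only need the key itself for the userKey field later, ceph can print it without the surrounding caps (an added convenience, not in the original):

ceph auth get-key client.admin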

Alternatively, create your own pool, user, and user key:

ceph osd pool create kubernetes
rbd pool init kubernetes
ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
[client.kubernetes]
        key = xxxxxxxpmg==
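
To confirm the new user and its capabilities (an added check):

ceph auth get client.kubernetes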

1.1.3 Check the cluster information

The output below is used in csi-config-map.yaml.

ceph mon dump
epoch 3
fsid 2a8e37c8-a0ef-11ec-bfb9-67a0f564e0d6
last_changed 2022-03-11T03:57:44.615026+0000
created 2022-03-11T03:56:03.739853+0000
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.1.1.1:3300/0,v1:10.1.1.1:6789/0] mon.ceph01
1: [v2:10.1.1.2:3300/0,v1:10.1.1.2:6789/0] mon.ceph02
2: [v2:10.1.1.3:3300/0,v1:10.1.1.3:6789/0] mon.ceph03
dumped monmap epoch 3
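
If you only need the fsid for the clusterID field, it can also be printed directly (an added shortcut):

ceph fsid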

There are two approaches here:

  • Do everything with the rbd command inside pods (ceph-csi): ReadOnlyMany is only possible on snapshot and clone volumes, but deployment is straightforward.
  • Install ceph-common on each node and mount through it, with no such restriction, but deployment is less convenient.

2 Method 1 (more features, recommended)

2.1 Generate the ceph-csi Kubernetes ConfigMap

Download all the files in this directory.
You can also use the official upstream manifests as a reference, but you will need to change the image references yourself.

cat csi-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: "ceph-csi-config"
  namespace: rbd
data:
  config.json: |-
    [
      {
        "clusterID": "2a8e37c8-a0ef-11ec-bfb9-67a0f564e0d6",
        "monitors": [
          "10.1.1.1:6789",
          "10.1.1.2:6789",
          "10.1.1.3:6789"
        ]
      }
    ]

2.1.1 Change the kubelet path

This step is only needed when the kubelet on your nodes uses a non-default data directory (here /data/kubelet instead of /var/lib/kubelet).

grep plugins_registry *
csi-nodeplugin-psp.yaml: - pathPrefix: '/var/lib/kubelet/plugins_registry'
csi-rbdplugin.yaml: path: /var/lib/kubelet/plugins_registry/

sed -i "s/var\/lib/data/g" *

grep plugins_registry *
csi-nodeplugin-psp.yaml: - pathPrefix: '/data/kubelet/plugins_registry'
csi-rbdplugin.yaml: path: /data/kubelet/plugins_registry/
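
The rewrite above must match the kubelet data directory actually used on your nodes. One way to confirm it, assuming kubelet runs as a host process and passes the flag as --root-dir=... (this check is an addition):

ps -ef | grep kubelet | grep -o -e '--root-dir=[^ ]*'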

2.1.2 Generate the ceph-csi cephx Secret

cat <<EOF > csi-rbd-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: rbd
stringData:
  userID: admin
  userKey: AQBTyCpiHHhbARAAfHqI0X9iMd3rnzJQHaMLkw==
EOF

2.1.3 Deploy the CSI driver

kubectl apply -f . -n rbd
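
Note that all of the manifests target the rbd namespace, which must exist before the apply above; create it if needed, then confirm the provisioner and nodeplugin pods come up (an added check):

kubectl create namespace rbd
kubectl get pods -n rbd -w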

2.2 Using the Ceph block device

2.2.1 Create the StorageClass; note that the secret names must match the Secret created above

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: 2a8e37c8-a0ef-11ec-bfb9-67a0f564e0d6
  pool: rbd
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: rbd
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: rbd
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: rbd
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard
  • clusterID corresponds to the fsid from the earlier steps

  • imageFeatures determines which features the created RBD image is given

  • allowVolumeExpansion: true toggles online volume expansion

kubectl apply -f sc.yaml -n rbd

2.2.2 Check the StorageClass

kubectl get StorageClass -n rbd
NAME         PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-rbd-sc   rbd.csi.ceph.com   Delete          Immediate           true                   23m

2.2.3 Create a PVC

cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc

kubectl apply -f pvc.yaml -n rbd

2.2.4 Check the PVC

kubectl get pvc -n rbd
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-pvc   Bound    pvc-d98c1d23-6e67-4185-82c5-d862b9768267   1Gi        RWO            csi-rbd-sc     2d15h
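
On a Ceph node you can also confirm that a backing RBD image was created in the pool (an added check, run outside the Kubernetes cluster):

rbd ls rbd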

2.2.5 Test mounting into a pod and writing

cat <<EOF > pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: rbd-pod
  namespace: rbd
spec:
  containers:
    - name: rbd-pod
      image: nginx
      volumeMounts:
        - name: pvc
          mountPath: "/mnt"
  volumes:
    - name: pvc
      persistentVolumeClaim:
        claimName: rbd-pvc
EOF
kubectl apply -f pod.yaml -n rbd
kubectl exec -it rbd-pod -n rbd -- sh
echo "hello" > /mnt/1
cat /mnt/1
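
A simple way to confirm the data really persists on the RBD volume (an added test): delete the pod, recreate it, and read the file back.

kubectl delete pod rbd-pod -n rbd
kubectl apply -f pod.yaml -n rbd
kubectl exec -it rbd-pod -n rbd -- cat /mnt/1   # should still print "hello"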

2.3 Using snapshots

2.3.1 Install the snapshot driver

./install-snapshot.sh install

2.3.2 Create a VolumeSnapshotClass

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
driver: rbd.csi.ceph.com
parameters:
  clusterID: 2a8e37c8-a0ef-11ec-bfb9-67a0f564e0d6
  snapshotNamePrefix: "test-"   # prefix used to name the RBD snapshots
  csi.storage.k8s.io/snapshotter-secret-name: csi-rbd-secret
  csi.storage.k8s.io/snapshotter-secret-namespace: rbd
deletionPolicy: Delete

2.3.3 Create a VolumeSnapshot

It must be in the same namespace as the PVC.

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass   # VolumeSnapshotClass name
  source:
    persistentVolumeClaimName: rbd-pvc   # PVC name

2.3.4 Check the snapshot

kubectl describe volumesnapshot rbd-pvc-snapshot -n rbd
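
Before restoring, the snapshot's readyToUse flag should be true; it can be read directly (an added check):

kubectl get volumesnapshot rbd-pvc-snapshot -n rbd -o jsonpath='{.status.readyToUse}'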

2.3.5 Restore a snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc-restore
spec:
  storageClassName: csi-rbd-sc
  dataSource:
    name: rbd-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
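
Apply and verify the restored claim (the file name pvc-restore.yaml is an assumption; the original does not name the manifest above):

kubectl apply -f pvc-restore.yaml -n rbd
kubectl get pvc rbd-pvc-restore -n rbd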

3 Method 2

3.1 Install the Ceph command-line tools

Both must be installed on every node:

  • ceph-common: without it, mounting fails with a missing mount type error
  • ceph-fuse: without it, you get a "config file not found" error
rpm -ivh https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
yum install -y yum-utils && \
yum-config-manager --add-repo https://dl.fedoraproject.org/pub/epel/7/x86_64/ && \
yum install --nogpgcheck -y epel-release && \
rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 && \
rm -f /etc/yum.repos.d/dl.fedoraproject.org*
yum install -y ceph-common ceph-fuse
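
A quick check on each node that the tools actually landed (an added verification):

ceph --version
which rbd ceph-fuse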

3.2 Create the StorageClass

cat <<EOF > sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: intree-rbd
provisioner: kubernetes.io/rbd
parameters:
  monitors: 10.1.1.1:6789,10.1.1.2:6789,10.1.1.3:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: rbd
  userId: admin
  userSecretName: ceph-secret
  userSecretNamespace: kube-system
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
EOF
kubectl apply -f sc.yaml -n rbd
The in-tree provisioner looks up the Secret by the name and namespace referenced in the StorageClass above (ceph-secret in kube-system), and it expects the Ceph key under a field named key:

cat <<EOF > ceph-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
  namespace: kube-system
type: kubernetes.io/rbd
stringData:
  key: xxxx0X9iMd3rnzJQHaMLkw==
EOF
kubectl apply -f ceph-secret.yaml

3.3 Using the Ceph block device

3.3.1 Create a PVC

cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: intree-rbd

kubectl apply -f pvc.yaml -n rbd

3.3.2 Check the PVC

kubectl get pvc -n rbd
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-pvc   Bound    pvc-d98c1d23-6e67-4185-82c5-d862b9768267   1Gi        RWO            intree-rbd     2d15h

4 Troubleshooting

4.1 Check the logs

kubectl logs deployment.apps/csi-rbdplugin-provisioner -n rbd --all-containers=true --max-log-requests=7

4.2 Check the PVC status

kubectl describe pvc rbd-pvc -n rbd

4.3 Error: config file not found

 persistentvolume-controller  Failed to provision volume with StorageClass "intree-rbd": failed to create rbd image: exit status 2, command output: did not load config file, using default settings.
2022-04-24 15:12:36.652 7fa722bb6c80 -1 Errors while parsing config file!
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open /.ceph/ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.652 7fa722bb6c80 -1 Errors while parsing config file!
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open /.ceph/ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.652 7fa722bb6c80 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2022-04-24 15:12:36.680 7fa722bb6c80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
  • Check that the key and the monitor addresses are correct: ceph auth get client.admin, ceph mon dump
  • Check that the pool actually exists in Ceph: rados lspools
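
With this method the in-tree provisioner and the kubelet shell out to the rbd binary on the host, so /etc/ceph/ceph.conf and the admin keyring must exist on every node. A minimal sketch of copying them over, assuming ceph01 is a monitor node reachable over SSH:

scp ceph01:/etc/ceph/ceph.conf /etc/ceph/ceph.conf
scp ceph01:/etc/ceph/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring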