I. Backing up etcd

1 Backup script

cat /data/etcd_backup_dir/etcd_backup.sh
#!/bin/bash
date
CACERT="/etc/ssl/etcd/ssl/ca.pem"
# Certificate paths. The file names contain the host name, so it is filled in with $(hostname).
CERT="/etc/ssl/etcd/ssl/admin-$(hostname).pem"
KEY="/etc/ssl/etcd/ssl/admin-$(hostname)-key.pem"
IP=$(hostname -I | awk '{print $1}')
ENDPOINTS="${IP}:2379"
# Save the snapshot under /data/etcd_backup_dir/
ETCDCTL_API=3 etcdctl \
  --cacert="${CACERT}" --cert="${CERT}" --key="${KEY}" \
  --endpoints="${ENDPOINTS}" \
  snapshot save "/data/etcd_backup_dir/etcd-snapshot-$(date +%Y%m%d).db"
# Keep backups for 30 days (the pattern is quoted so the shell does not expand it)
find /data/etcd_backup_dir/ -name '*.db' -mtime +30 -exec rm -f {} \;
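
The retention step can be tried in isolation before trusting it with real snapshots. The sketch below wraps the same `find` call in a function; the name `prune_old_snapshots` and its two arguments are illustrative and not part of the original script.

```shell
# Hypothetical helper (not in the original script): delete *.db snapshots
# directly inside a directory that were last modified more than N days ago.
prune_old_snapshots() {
    local dir="$1" days="$2"
    # The pattern is quoted so the shell passes it to find unexpanded.
    # -maxdepth 1 keeps the deletion from descending into subdirectories.
    find "$dir" -maxdepth 1 -type f -name '*.db' -mtime +"$days" -exec rm -f {} \;
}

# Example:
# prune_old_snapshots /data/etcd_backup_dir 30
```

Running it against a scratch directory with a few files back-dated via `touch -d` is a cheap way to confirm that only old `.db` files are removed.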

2 Scheduled task

crontab -e
0 23 * * 6 sh /data/etcd_backup_dir/etcd_backup.sh
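
Output from a cron job is easy to lose, so a failed snapshot can go unnoticed. The sketch below is one way to keep a record: a small wrapper that appends the command's output and exit status to a log file. The name `run_logged` and the log path in the example are assumptions, not part of the original setup.

```shell
# Hypothetical wrapper: run a command, append its output and exit status
# to a log file, and return the command's own exit status.
run_logged() {
    local log="$1"
    shift
    local status=0
    printf '%s START %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log"
    "$@" >> "$log" 2>&1 || status=$?
    printf '%s END status=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >> "$log"
    return "$status"
}

# Example (the log path is an assumption):
# run_logged /var/log/etcd_backup.log sh /data/etcd_backup_dir/etcd_backup.sh
```

A lighter alternative is to append `>> /some/log 2>&1` to the crontab entry itself; the wrapper only adds timestamps and the exit status.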

II. Backing up with Velero

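Main advantages over an etcd snapshot
  • Partial restore instead of restoring the whole cluster
  • Migrating resources across clusters
  • More flexible management of backup policies
  • Selective backup of specific resources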

Velero has two parts: a server that runs in the cluster and a command-line client that runs locally.

1 Installing the client

Download the latest release.
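
A minimal sketch of the download step, assuming the tarballs are fetched from the project's GitHub releases page under `vmware-tanzu/velero` and follow the file naming used in the commands in this section. Both function names are illustrative.

```shell
# Map `uname -m` output to the architecture suffix used in release file names.
velero_arch() {
    case "$1" in
        x86_64) echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Build the tarball URL for a given version (e.g. v1.16.0) and architecture.
velero_tarball_url() {
    local version="$1" arch="$2"
    echo "https://github.com/vmware-tanzu/velero/releases/download/${version}/velero-${version}-linux-${arch}.tar.gz"
}

# Example:
# curl -fLO "$(velero_tarball_url v1.16.0 "$(velero_arch "$(uname -m)")")"
```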

tar -zxvf velero-v1.16.0-linux-amd64.tar.gz
mv velero-v1.16.0-linux-amd64/velero /usr/bin/
velero version
Client:
Version: v1.16.0
Git commit: 8f31599fe4af5453dee032beaf8a16bd75de91a5
<error getting server version: no matches for kind "ServerStatusRequest" in version "velero.io/v1">
# Shell completion
velero completion bash >/etc/profile.d/velero.sh

2 Installing the server

2.1 Preparing the images

docker pull ccr.ccs.tencentyun.com/ccops/all:velero-v1.16.0
docker pull ccr.ccs.tencentyun.com/ccops/all:velero-plugin-for-aws-v1.12.0

2.2 Preparing the storage

Any S3-compatible object store will do. The official release ships a YAML manifest that deploys MinIO to Kubernetes.

The bundled manifest does not persist data and uses a weak password, so modify it first.
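
One way to replace a weak default is to generate a random secret rather than choosing one by hand. A sketch, assuming `/dev/urandom`, `head`, `base64`, and `tr` are available on the machine; the function name `generate_secret` is hypothetical.

```shell
# Hypothetical helper: print a random alphanumeric secret derived from
# N random bytes (default 24). Characters that are awkward in YAML or
# URLs ('+', '/', '=') are stripped from the base64 output.
generate_secret() {
    local bytes="${1:-24}"
    head -c "$bytes" /dev/urandom | base64 | tr -d '\n=+/'
}

# Example:
# MINIO_SECRET=$(generate_secret)
```

Whatever value is generated has to be used consistently in the MinIO Deployment, the setup Job, and the Velero credentials file.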

cat velero-v1.16.0-linux-amd64/examples/minio/00-minio-deployment.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: velero

---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: velero
  name: minio
  labels:
    component: minio
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      component: minio
  template:
    metadata:
      labels:
        component: minio
    spec:
      nodeSelector:
        minio.storage.sys: "yes"
      volumes:
      - name: storage
        hostPath:
          path: /data/server/velero-minio/data
          type: DirectoryOrCreate
      - name: config
        hostPath:
          path: /data/server/velero-minio/config
          type: DirectoryOrCreate
      containers:
      - name: minio
        image: quay.io/minio/minio:latest
        imagePullPolicy: IfNotPresent
        args:
        - server
        - /storage
        - --config-dir=/config
        env:
        - name: MINIO_ACCESS_KEY
          value: "velero-minio"
        - name: MINIO_SECRET_KEY
          value: "1qaz@WSX"
        ports:
        - containerPort: 9000
        volumeMounts:
        - name: storage
          mountPath: "/storage"
        - name: config
          mountPath: "/config"

---
apiVersion: v1
kind: Service
metadata:
  namespace: velero
  name: minio
  labels:
    component: minio
spec:
  type: ClusterIP
  ports:
  - port: 9000
    targetPort: 9000
    protocol: TCP
  selector:
    component: minio

---
apiVersion: batch/v1
kind: Job
metadata:
  namespace: velero
  name: minio-setup
  labels:
    component: minio
spec:
  template:
    metadata:
      name: minio-setup
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        minio.storage.sys: "yes"
      volumes:
      - name: config
        hostPath:
          path: /data/server/velero-minio/config
          type: DirectoryOrCreate
      containers:
      - name: mc
        image: quay.io/minio/mc:latest
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - "mc --config-dir=/config config host add velero http://minio:9000 velero-minio 1qaz@WSX && mc --config-dir=/config mb -p velero/velero"
        volumeMounts:
        - name: config
          mountPath: "/config"
# Label the node that will host MinIO
kubectl label nodes <node-name> minio.storage.sys=yes

kubectl apply -f velero-v1.16.0-linux-amd64/examples/minio/00-minio-deployment.yaml
namespace/velero created
deployment.apps/minio created
service/minio created
job.batch/minio-setup created

kubectl get pod -n velero
NAME READY STATUS RESTARTS AGE
minio-cd47fcf59-sm2gn 1/1 Running 0 103s
minio-setup-smb9p 0/1 Completed 0 103s
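
Instead of re-running `kubectl get pod` until both objects settle, the wait can be scripted. A sketch using `kubectl rollout status` and `kubectl wait`; the function name `wait_for_minio` and the default timeout are assumptions.

```shell
# Hypothetical helper: block until the MinIO Deployment is rolled out and
# the bucket-creation Job has completed, or fail once the timeout expires.
wait_for_minio() {
    local ns="${1:-velero}" timeout="${2:-120s}"
    kubectl -n "$ns" rollout status deployment/minio --timeout="$timeout" &&
    kubectl -n "$ns" wait --for=condition=complete job/minio-setup --timeout="$timeout"
}

# Example:
# wait_for_minio velero 180s && echo "MinIO is ready"
```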

2.3 Installing the server

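Save the install command in a file

Most guides online run the install directly on the command line, which is hard to maintain when upgrading later. Keeping the command in a script file is recommended.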

cat velero-install.sh

#!/bin/bash
cat > credentials-velero << EOF
[default]
aws_access_key_id = velero-minio
aws_secret_access_key = 1qaz@WSX
EOF

velero install \
--image ccr.ccs.tencentyun.com/ccops/all:velero-v1.16.0 \
--plugins ccr.ccs.tencentyun.com/ccops/all:velero-plugin-for-aws-v1.12.0 \
--provider aws \
--bucket velero \
--namespace velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero:9000

sh velero-install.sh
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource
CustomResourceDefinition/backuprepositories.velero.io: attempting to create resource client
CustomResourceDefinition/backuprepositories.velero.io: created
......
Deployment/velero: attempting to create resource client
Deployment/velero: created
Velero is installed! Use 'kubectl logs deployment/velero -n velero' to view the status.
# Seeing the line above means the installation succeeded
kubectl get pod -n velero
NAME READY STATUS RESTARTS AGE
minio-cd47fcf59-sm2gn 1/1 Running 0 36m
minio-setup-smb9p 0/1 Completed 0 36m
velero-76454754b9-fqqnw 1/1 Running 0 86s
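
The `credentials-velero` file written by the install script holds the object-store secret in plain text, so it is worth restricting its permissions. A sketch of one way to do that; the function name `write_credentials` is hypothetical, and the key and secret are passed in as arguments rather than hard-coded.

```shell
# Hypothetical helper: write an AWS-style credentials file that only its
# owner can read. Arguments: file path, access key, secret key.
write_credentials() {
    local file="$1" key="$2" secret="$3"
    (
        umask 077   # files created in this subshell get mode 600
        cat > "$file" <<EOF
[default]
aws_access_key_id = ${key}
aws_secret_access_key = ${secret}
EOF
    )
    # Also tighten the mode if the file already existed before the write.
    chmod 600 "$file"
}

# Example:
# write_credentials ./credentials-velero velero-minio "$MINIO_SECRET"
```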

3 Testing

3.1 Deploying a test nginx

The image referenced by the official nginx example cannot be pulled here, so change the image in the manifest by hand before applying it.

kubectl apply -f velero-v1.16.0-linux-amd64/examples/nginx-app/base.yaml
kubectl get pod -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-7f796bc6dc-hxqqd 1/1 Running 0 37s
nginx-deployment-7f796bc6dc-pjzxp 1/1 Running 0 21s

3.2 Backup

velero backup create nginx-backup --selector app=nginx
Backup request "nginx-backup" submitted successfully.
Run `velero backup describe nginx-backup` or `velero backup logs nginx-backup` for more details.
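
The backup request is asynchronous, so "submitted successfully" does not yet mean the backup finished. The sketch below polls the Backup object's `status.phase` through `kubectl`; the function name `wait_for_backup`, the retry count, and the 5-second interval are assumptions.

```shell
# Hypothetical helper: poll a Velero Backup object until it reaches a
# final phase. Succeeds only if the phase is "Completed".
wait_for_backup() {
    local name="$1" ns="${2:-velero}" tries="${3:-60}" phase=""
    while [ "$tries" -gt 0 ]; do
        phase=$(kubectl -n "$ns" get backups.velero.io "$name" -o jsonpath='{.status.phase}')
        case "$phase" in
            Completed) echo "$phase"; return 0 ;;
            Failed|PartiallyFailed|FailedValidation) echo "$phase"; return 1 ;;
        esac
        tries=$((tries - 1))
        sleep 5
    done
    echo "timed out (last phase: ${phase:-unknown})"
    return 1
}

# Example:
# wait_for_backup nginx-backup && velero backup describe nginx-backup
```

For interactive use, passing `--wait` to `velero backup create` blocks until the backup finishes and avoids the polling loop.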

3.3 Restore

3.3.1 Simulating lost resources

kubectl delete namespace nginx-example
kubectl get ns nginx-example
Error from server (NotFound): namespaces "nginx-example" not found

3.3.2 Restoring

velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20250423165832" submitted successfully.
Run `velero restore describe nginx-backup-20250423165832` or `velero restore logs nginx-backup-20250423165832` for more details.
# List restores
velero restore get
NAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTOR
nginx-backup-20250423165832 nginx-backup Completed 2025-04-23 16:58:32 +0800 CST 2025-04-23 16:58:33 +0800 CST 0 1 2025-04-23 16:58:32 +0800 CST <none>
# Check the restored pods
kubectl get pod -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-7f796bc6dc-hxqqd 1/1 Running 0 23s
nginx-deployment-7f796bc6dc-pjzxp 1/1 Running 0 23s
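
A restore can also be rehearsed without deleting anything by mapping the backed-up namespace to a new one with `--namespace-mappings`. A sketch; the function name `restore_into_namespace` and the target namespace in the example are hypothetical.

```shell
# Hypothetical helper: restore a backup into a different namespace so the
# result can be inspected without touching the original namespace.
restore_into_namespace() {
    local backup="$1" src_ns="$2" dst_ns="$3"
    velero restore create --from-backup "$backup" \
        --namespace-mappings "${src_ns}:${dst_ns}"
}

# Example:
# restore_into_namespace nginx-backup nginx-example nginx-restored
```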

4 Advanced backup features

# Back up the resources in the nginx-example namespace every day at 01:00
velero schedule create nginx-daily --schedule="0 1 * * *" --include-namespaces nginx-example
# Back up all resources except those labeled backup=ignore
velero backup create nginx-backup --selector 'backup notin (ignore)'
# Back up all cluster resources every Sunday at 00:00 and keep each backup for 2160 hours (90 days)
velero create schedule cluster-all --schedule="0 0 * * 0" --ttl 2160h
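
The `--ttl` value is written in hours, which is easy to get wrong when thinking in days. A small sketch of the conversion; the function name `ttl_from_days` is hypothetical.

```shell
# Hypothetical helper: convert a retention period in days to the
# hour-based duration string passed to --ttl.
ttl_from_days() {
    echo "$(( $1 * 24 ))h"
}

# Example:
# velero create schedule cluster-all --schedule="0 0 * * 0" --ttl "$(ttl_from_days 90)"
```

With this helper, 90 days yields `2160h`, the value used in the schedule above.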