I. Installation Requirements

This article deploys a highly available, stacked-etcd native Kubernetes v1.26.5 cluster on CentOS 7, using containerd and Docker (via cri-dockerd) as the container runtimes and Calico v3.25.1 as the CNI.

  • OS: CentOS 7.x, kernel 5.4.x
  • Hardware: 2 GB RAM or more, 2 CPUs or more, 30 GB of disk or more
  • Full network connectivity between all machines in the cluster
  • Outbound internet access, needed to pull images
  • Swap disabled
IP            Hostname             CRI
10.1.1.1      10-1-1-1.k8s.host    docker
10.1.1.2      10-1-1-2.k8s.host    docker
10.1.1.3      10-1-1-3.k8s.host    containerd
10.1.1.4      10-1-1-4.k8s.host    containerd

podSubnet:     172.1.0.0/16
serviceSubnet: 172.0.0.0/16

II. System Initialization

1 Configure hosts

cat >> /etc/hosts <<EOF
# k8s-tst-cluster is the control-plane endpoint name used later by kubeadm (resolved here to 10.1.1.1)
10.1.1.1 10-1-1-1.k8s.host k8s-tst-cluster
10.1.1.2 10-1-1-2.k8s.host
10.1.1.3 10-1-1-3.k8s.host
10.1.1.4 10-1-1-4.k8s.host
EOF

2 Disable the firewall

systemctl stop firewalld && systemctl disable firewalld

3 Disable SELinux

setenforce 0   # take effect immediately; the config change below persists across reboots
sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config

4 Disable swap

swapoff -a
sed -i '/swap / s/^\(.*\)$/#\1/g' /etc/fstab
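
Verify that swap is now off (the Swap line should report 0 total):

free -m | grep -i Swap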

5 Set kernel parameters

Configure the kernel to load br_netfilter and let iptables see bridged IPv4/IPv6 traffic, so that containers in the cluster can communicate properly.

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

# Load the module now; modules-load.d takes care of loading it at boot
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
net.bridge.bridge-nf-call-iptables = 1
EOF


sudo sysctl --system
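
Verify that the parameters took effect:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward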

6 Enable IPVS support

# Before using ipvs mode, make sure ipset and ipvsadm are installed
sudo yum install ipset ipvsadm -y

# Manually load the ipvs-related kernel modules
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack

# Load the ipvs-related modules automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF


# Verify the modules are loaded
$ lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_ipv6 18935 1
nf_defrag_ipv6 35104 1 nf_conntrack_ipv6
nf_conntrack_netlink 40492 0
nf_conntrack_ipv4 19149 3
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145458 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 143360 10 ip_vs,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_nat_masquerade_ipv6,nf_conntrack_netlink,nf_conntrack_ipv4,nf_conntrack_ipv6
libcrc32c 12644 4 xfs,ip_vs,nf_nat,nf_conntrack

$ cut -f1 -d " " /proc/modules | grep -e ip_vs -e nf_conntrack
nf_conntrack_ipv6
nf_conntrack_netlink
nf_conntrack_ipv4
ip_vs_sh
ip_vs_wrr
ip_vs_rr
ip_vs
nf_conntrack

7 Time synchronization

yum -y install ntpdate
ntpdate time.windows.com

8 Install cni-plugins

Download the release matching your system architecture and extract it into /opt/cni/bin.
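
The archives are published on the containernetworking/plugins GitHub releases page; for example, for amd64:

wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz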

mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz

III. Installing the Cluster

1 CRI

Generally a cluster should use a single CRI, but to make this document cover more ground this cluster uses two; in practice containerd is the recommended choice.
The default socket paths are:

Container runtime                  Default socket path
containerd                         unix:///var/run/containerd/containerd.sock
CRI-O                              unix:///var/run/crio/crio.sock
Docker Engine (via cri-dockerd)    unix:///var/run/cri-dockerd.sock

1.1 Install Docker

1.1.1 Configure the Aliyun Docker yum repository

sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

1.1.2 Install Docker (pinned version)

List the available versions and install a pinned release:

yum list docker-ce --showduplicates | sort -r
sudo yum install -y docker-ce-20.10.24-3.el7

1.1.3 Configure Docker

mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": [
    "https://mirror.ccs.tencentyun.com",
    "http://registry.docker-cn.com",
    "http://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ],
  "storage-driver": "overlay2",
  "log-opts": {
    "max-file": "2",
    "max-size": "256m"
  },
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true,
  "log-level": "info",
  "metrics-addr": "127.0.0.1:9100",
  "experimental": true,
  "data-root": "/data/docker"
}
EOF

1.1.4 Start the Docker service

systemctl enable docker
systemctl start docker
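
Verify that Docker is running and is using the systemd cgroup driver configured above:

docker info | grep -i 'cgroup driver'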

1.1.5 Install cri-dockerd

Kubernetes removed dockershim support in v1.24, and Docker Engine does not natively implement the CRI, so the two can no longer be integrated directly.
To bridge this gap, Mirantis and Docker jointly created the cri-dockerd project, a shim that exposes a CRI-compliant interface for Docker Engine so that Kubernetes can still drive Docker through the CRI.

The download may fail from within mainland China; if so, fetch the RPM offline (for example through a proxy) and pick the package matching your system.

yum -y install https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.2/cri-dockerd-0.3.2-3.el7.x86_64.rpm

1.1.6 Specify the pause image

# Edit the ExecStart line of the unit file so cri-dockerd uses the Aliyun pause image
cat /usr/lib/systemd/system/cri-docker.service
...
ExecStart=/usr/bin/cri-dockerd --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 --container-runtime-endpoint fd://
...

systemctl daemon-reload
systemctl start cri-docker
systemctl enable cri-docker
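
As a quick sanity check, confirm the shim is up; crictl ships with the cri-tools package that is installed alongside kubeadm later, so the second command only works after that step:

systemctl status cri-docker --no-pager
crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock version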

1.2 containerd

1.2.1 Configure kernel modules

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

1.2.2 Install containerd

# Add Docker's official yum repository (it also provides containerd.io)
sudo yum install -y yum-utils device-mapper-persistent-data lvm2

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# List the containerd.io versions available in the repo
yum list containerd.io --showduplicates | sort -r

# Install the latest containerd.io
yum install containerd.io -y

sudo systemctl enable --now containerd

1.2.3 Configure the cgroup driver

CentOS 7 uses systemd to initialize the system and manage processes. The init process creates and uses a root control group (cgroup) and acts as the cgroup manager; systemd is tightly integrated with cgroups and assigns one cgroup to every systemd unit. We could instead configure the container runtime and kubelet to use cgroupfs, but running cgroupfs alongside systemd means two different cgroup managers coexist on the same host, which tends to make the system unstable. It is therefore best to configure both the container runtime and the kubelet to use systemd as the cgroup driver. For containerd, this is done by setting the SystemdCgroup parameter in /etc/containerd/config.toml.

See the official Kubernetes documentation:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd-systemd

containerd config default > /etc/containerd/config.toml
# Set SystemdCgroup to true and restart containerd
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd
# After the restart, confirm that SystemdCgroup is now enabled
containerd config dump | grep SystemdCgroup
SystemdCgroup = true
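
Optionally (this mirrors what we did for cri-dockerd above and is an extra step, not strictly required), containerd's pause/sandbox image can also be pointed at the Aliyun mirror so the containerd nodes do not have to reach registry.k8s.io; check the current value first, since the default tag varies between containerd versions:

grep sandbox_image /etc/containerd/config.toml
# adjust the tag below to match whatever the grep above shows
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"#' /etc/containerd/config.toml
systemctl restart containerd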

2 Install Kubernetes components

2.1 Add the Aliyun Kubernetes yum repository

cat >> /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

2.2 kubeadm, kubelet, kubectl

Versions change frequently, so pin the version when installing:
yum -y install kubelet-1.26.5 kubeadm-1.26.5 kubectl-1.26.5
systemctl enable kubelet

IV. Master Node Initialization

Configuration notes

  • kubernetesVersion specifies the Kubernetes version to install
  • localAPIEndpoint must be set to this master node's IP and port; after initialization this is the apiserver address of that node
  • podSubnet, serviceSubnet, and dnsDomain can be left at their defaults; here I changed them to match my own requirements
  • The name field under nodeRegistration should be the hostname of the corresponding master node
  • controlPlaneEndpoint is the highly available apiserver address configured earlier (the k8s-tst-cluster entry in /etc/hosts)
  • criSocket: check which socket to use with systemctl cat docker | grep run (see the runtime/socket table above)
  • An extra KubeProxyConfiguration block enables ipvs mode; see the official documentation for details

If you run into a problem you cannot resolve, wipe the attempt and start over with: kubeadm reset -f ; ipvsadm --clear

cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.1.1.3
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: 10-1-1-3.k8s.host.lenovo.com
  taints: null
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  timeoutForControlPlane: 4m0s
  extraArgs:
    audit-log-path: /var/log/k8s-audit.log
certificatesDir: /etc/kubernetes/pki
clusterName: tst-kubernetes
controlPlaneEndpoint: k8s-tst-cluster:6443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /data/server/etcd
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.26.5
networking:
  dnsDomain: tst-cluster.local
  podSubnet: 172.1.0.0/16
  serviceSubnet: 172.0.0.0/16
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
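
Optionally, pre-pull the control-plane images defined in the config so the init step itself goes faster:

kubeadm config images pull --config kubeadm-config.yaml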

kubeadm init --config kubeadm-config.yaml --upload-certs
# Save the following output; it is needed later when joining nodes
You can now join any number of the control-plane node running the following command on each as root:

kubeadm join k8s-tst-cluster:6443 --token 0j24t3.rqaulkmomt6u10ar \
--discovery-token-ca-cert-hash sha256:5e35746495fcedb50c2db881604d43b46e5857d0493c1f9d0ad4acc35eadb8cf \
--control-plane --certificate-key 7c2695c6791ee290ff70acca2e45e2a154f9b3728658dc513ffbdad14e7feead

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join k8s-tst-cluster:6443 --token 0j24t3.rqaulkmomt6u10ar \
--discovery-token-ca-cert-hash sha256:5e35746495fcedb50c2db881604d43b46e5857d0493c1f9d0ad4acc35eadb8cf

1 Configure kubectl

mkdir -p $HOME/.kube
sudo ln -s /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

1.1 Shell completion

yum install -y bash-completion
echo "source /usr/share/bash-completion/bash_completion" >> ~/.bashrc
echo 'source <(kubectl completion bash)' >>~/.bashrc
source ~/.bashrc
# Define a shorthand alias
cat >>/root/.bashrc<<EOF
alias k=kubectl
complete -F __start_kubectl k
EOF
source /root/.bashrc

2 Deploy Calico

For the CNI we follow the official guide for self-managed Kubernetes. The docs describe two ways to deploy and manage Calico: the Calico operator and Calico manifests. With the operator approach, a Calico operator is deployed into the cluster as a Deployment and then manages the installation, upgrades and the rest of Calico's lifecycle. With the manifests approach, everything is managed directly through YAML files, which is more cumbersome, but has some advantages for highly customized clusters.

Note that in its default configuration Calico runs in what is called "Node-to-Node Mesh" mode: the BGP client on every host peers with the BGP clients on all other nodes to exchange routing information. As the node count N grows, the number of connections grows roughly as N², which puts considerable pressure on the cluster network.
Node-to-Node Mesh is therefore generally recommended only for clusters with fewer than about 100 nodes; larger clusters should use the Route Reflector mode instead. Reference: 如何优雅的使用Calico路由反射模式 (on using Calico's route reflection mode gracefully).
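
For reference only (not needed for this four-node cluster), a route-reflector setup roughly follows the sketch below, based on the Calico BGP documentation; it assumes calicoctl is installed and that 10-1-1-1.k8s.host.lenovo.com is (arbitrarily) chosen as the reflector:

# Give the chosen node a route reflector cluster ID and a label to select it by
calicoctl patch node 10-1-1-1.k8s.host.lenovo.com -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
kubectl label node 10-1-1-1.k8s.host.lenovo.com route-reflector=true

# Peer every node with the route reflector(s) instead of with each other
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
EOF

# Finally, disable the full node-to-node mesh
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  asNumber: 64512
EOF
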

Here we deploy Calico using the operator method.

Use kubectl create here, not kubectl apply: the tigera-operator manifest contains very large CRDs that exceed the size limit of the last-applied-configuration annotation that apply maintains.

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/tigera-operator.yaml
curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O

2.1 Modify the configuration

The main change is setting cidr to the podSubnet configured in kubeadm-config.yaml.

cat custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 172.1.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---

# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}

kubectl create -f custom-resources.yaml
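
Watch until all of the Calico pods report Running before moving on:

watch kubectl get pods -n calico-system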

V. Node Initialization

1 Control-plane nodes

Next, run the join command printed above on the remaining two master nodes. Be sure to use the variant that carries the --control-plane and --certificate-key flags: --control-plane marks the node as a control-plane node, and --certificate-key downloads the certificates that were uploaded to the cluster via --upload-certs during initialization.

On hosts with multiple network interfaces it is recommended to specify --apiserver-advertise-address explicitly.

kubeadm join k8s-tst-cluster:6443 --token zjoy9h.62mwhybt502sye7f \
--discovery-token-ca-cert-hash sha256:1640dcfc5e996864c5ebcfffd8051e8302f8cf455d09fd3cea6de4a299fea770 \
--control-plane --certificate-key 9d0da8e47ae6c1216ed5d464dabd18f1f3913ee6610a7bb20eabb7d5e5f5fb64 --cri-socket=unix:///var/run/cri-dockerd.sock --apiserver-advertise-address=10.1.1.1
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

2 Worker nodes

kubeadm join k8s-tst-cluster:6443 --token zjoy9h.62mwhybt502sye7f \
--discovery-token-ca-cert-hash sha256:1640dcfc5e996864c5ebcfffd8051e8302f8cf455d09fd3cea6de4a299fea770
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

3 Check cluster status

kubectl get node
NAME STATUS ROLES AGE VERSION
10-1-1-1.k8s.host.lenovo.com Ready control-plane 38m v1.26.5
10-1-1-2.k8s.host.lenovo.com Ready control-plane 30m v1.26.5
10-1-1-3.k8s.host.lenovo.com Ready control-plane 44m v1.26.5
10-1-1-4.k8s.host.lenovo.com Ready <none> 48s v1.26.5
[root@10-1-1-3 calico]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-85f9866f46-k8wjt 1/1 Running 0 14m
calico-system calico-node-5wrpz 1/1 Running 0 54s
calico-system calico-node-bht57 1/1 Running 0 28m
calico-system calico-node-plqj7 1/1 Running 0 28m
calico-system calico-node-ww7bk 1/1 Running 0 28m
calico-system calico-typha-767bbcc4f4-twbd7 1/1 Running 0 28m
calico-system calico-typha-767bbcc4f4-zmfbb 1/1 Running 0 53s
calico-system csi-node-driver-7mjrt 2/2 Running 0 28m
calico-system csi-node-driver-gqgtr 2/2 Running 0 28m
calico-system csi-node-driver-q2gh2 2/2 Running 0 28m
calico-system csi-node-driver-xp9g6 2/2 Running 0 54s
kube-system coredns-5bbd96d687-lw5bk 1/1 Running 0 44m
kube-system coredns-5bbd96d687-rtjrd 1/1 Running 0 44m
kube-system etcd-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system etcd-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system etcd-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-apiserver-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 34m
kube-system kube-apiserver-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system kube-apiserver-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-controller-manager-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system kube-controller-manager-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 29m
kube-system kube-controller-manager-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-proxy-78wrg 1/1 Running 0 38m
kube-system kube-proxy-jvfnd 1/1 Running 0 54s
kube-system kube-proxy-pbv2m 1/1 Running 0 44m
kube-system kube-proxy-ztx74 1/1 Running 0 30m
kube-system kube-scheduler-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system kube-scheduler-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system kube-scheduler-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
tigera-operator tigera-operator-5d6845b496-zvjcm 1/1 Running 0 29m
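
Since kube-proxy was configured for ipvs mode, you can also confirm on any node that IPVS virtual servers were created for the service network:

ipvsadm -Ln | head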