I. Installation Requirements

This article deploys a highly available, stacked-etcd native Kubernetes v1.26.5 cluster on CentOS 7, using containerd and Docker (via cri-dockerd) as the container runtimes and Calico v3.25.1 as the CNI.

  • OS: CentOS 7.x, kernel 5.4.x
  • Hardware: 2 GB RAM or more, 2 CPUs or more, 30 GB of disk or more
  • Full network connectivity between all machines in the cluster
  • Outbound internet access, needed to pull images
  • Swap disabled
IP            Hostname             CRI
10.1.1.1      10-1-1-1.k8s.host    docker
10.1.1.2      10-1-1-2.k8s.host    docker
10.1.1.3      10-1-1-3.k8s.host    containerd
10.1.1.4      10-1-1-4.k8s.host    containerd

podSubnet:     172.1.0.0/16
serviceSubnet: 172.0.0.0/16

II. System Initialization

1 Configure hosts

cat >> /etc/hosts <<EOF
# k8s-tst-cluster is the control-plane endpoint name used later by kubeadm (resolved here to 10.1.1.1)
10.1.1.1 10-1-1-1.k8s.host k8s-tst-cluster
10.1.1.2 10-1-1-2.k8s.host
10.1.1.3 10-1-1-3.k8s.host
10.1.1.4 10-1-1-4.k8s.host
EOF

2 Disable the firewall

systemctl stop firewalld && systemctl disable firewalld

3 Disable SELinux

setenforce 0   # take effect immediately; the config change below persists across reboots
sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config

4 Disable swap

swapoff -a
sed -i '/swap / s/^\(.*\)$/#\1/g' /etc/fstab
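
Verify that swap is now off (the Swap line should report 0 total):

free -m | grep -i Swap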

5 Set kernel parameters

Configure the kernel to load br_netfilter and let iptables see bridged IPv4/IPv6 traffic, so that containers in the cluster can communicate properly.

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

# Load the module now; modules-load.d takes care of loading it at boot
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
net.bridge.bridge-nf-call-iptables = 1
EOF


sudo sysctl --system
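
Verify that the parameters took effect:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward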

6 Enable IPVS support

# Before using ipvs mode, make sure ipset and ipvsadm are installed
sudo yum install ipset ipvsadm -y

# Manually load the ipvs-related kernel modules
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack

# Load the ipvs-related modules automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF


# Verify the modules are loaded
$ lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_ipv6 18935 1
nf_defrag_ipv6 35104 1 nf_conntrack_ipv6
nf_conntrack_netlink 40492 0
nf_conntrack_ipv4 19149 3
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145458 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 143360 10 ip_vs,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_nat_masquerade_ipv6,nf_conntrack_netlink,nf_conntrack_ipv4,nf_conntrack_ipv6
libcrc32c 12644 4 xfs,ip_vs,nf_nat,nf_conntrack

$ cut -f1 -d " " /proc/modules | grep -e ip_vs -e nf_conntrack
nf_conntrack_ipv6
nf_conntrack_netlink
nf_conntrack_ipv4
ip_vs_sh
ip_vs_wrr
ip_vs_rr
ip_vs
nf_conntrack

7 Time synchronization

yum -y install ntpdate
ntpdate time.windows.com

8 Install cni-plugins

Download the release matching your system architecture and extract it into /opt/cni/bin.
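
The archives are published on the containernetworking/plugins GitHub releases page; for example, for amd64:

wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz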

mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz

III. Installing the Cluster

1 CRI

Generally a cluster should use a single CRI, but to make this document cover more ground this cluster uses two; in practice containerd is the recommended choice.
The default socket paths are:

Container runtime                  Default socket path
containerd                         unix:///var/run/containerd/containerd.sock
CRI-O                              unix:///var/run/crio/crio.sock
Docker Engine (via cri-dockerd)    unix:///var/run/cri-dockerd.sock

1.1 Install Docker

1.1.1 Configure the Aliyun Docker yum repository

sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

1.1.2 Install Docker (pinned version)

List the available versions and install a pinned release:

yum list docker-ce --showduplicates | sort -r
sudo yum install -y docker-ce-20.10.24-3.el7

1.1.3 Configure Docker

mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": [
    "https://mirror.ccs.tencentyun.com",
    "http://registry.docker-cn.com",
    "http://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ],
  "storage-driver": "overlay2",
  "log-opts": {
    "max-file": "2",
    "max-size": "256m"
  },
  "exec-opts": ["native.cgroupdriver=systemd"],
  "live-restore": true,
  "log-level": "info",
  "metrics-addr": "127.0.0.1:9100",
  "experimental": true,
  "data-root": "/data/docker"
}
EOF

1.1.4 Start the Docker service

systemctl enable docker
systemctl start docker
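
Verify that Docker is running and is using the systemd cgroup driver configured above:

docker info | grep -i 'cgroup driver'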

1.1.5 Install cri-dockerd

Kubernetes removed dockershim support in v1.24, and Docker Engine does not natively implement the CRI, so the two can no longer be integrated directly.
To bridge this gap, Mirantis and Docker jointly created the cri-dockerd project, a shim that exposes a CRI-compliant interface for Docker Engine so that Kubernetes can still drive Docker through the CRI.

The download may fail from within mainland China; if so, fetch the RPM offline (for example through a proxy) and pick the package matching your system.

yum -y install https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.2/cri-dockerd-0.3.2-3.el7.x86_64.rpm

1.1.6 Specify the pause image

# Edit the ExecStart line of the unit file so cri-dockerd uses the Aliyun pause image
cat /usr/lib/systemd/system/cri-docker.service
...
ExecStart=/usr/bin/cri-dockerd --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 --container-runtime-endpoint fd://
...

systemctl daemon-reload
systemctl start cri-docker
systemctl enable cri-docker
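
As a quick sanity check, confirm the shim is up; crictl ships with the cri-tools package that is installed alongside kubeadm later, so the second command only works after that step:

systemctl status cri-docker --no-pager
crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock version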

1.2 containerd

1.2.1 Configure kernel modules

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

1.2.2 Install containerd

# Add Docker's official yum repository (it also provides containerd.io)
sudo yum install -y yum-utils device-mapper-persistent-data lvm2

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# List the containerd.io versions available in the repo
yum list containerd.io --showduplicates | sort -r

# Install the latest containerd.io
yum install containerd.io -y

sudo systemctl enable --now containerd

1.2.3 Configure the cgroup driver

CentOS 7 uses systemd to initialize the system and manage processes. The init process creates and uses a root control group (cgroup) and acts as the cgroup manager; systemd is tightly integrated with cgroups and assigns one cgroup to every systemd unit. We could instead configure the container runtime and kubelet to use cgroupfs, but running cgroupfs alongside systemd means two different cgroup managers coexist on the same host, which tends to make the system unstable. It is therefore best to configure both the container runtime and the kubelet to use systemd as the cgroup driver. For containerd, this is done by setting the SystemdCgroup parameter in /etc/containerd/config.toml.

See the official Kubernetes documentation:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd-systemd

containerd config default > /etc/containerd/config.toml
# Set SystemdCgroup to true and restart containerd
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd
# After the restart, confirm that SystemdCgroup is now enabled
containerd config dump | grep SystemdCgroup
SystemdCgroup = true
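
Optionally (this mirrors what we did for cri-dockerd above and is an extra step, not strictly required), containerd's pause/sandbox image can also be pointed at the Aliyun mirror so the containerd nodes do not have to reach registry.k8s.io; check the current value first, since the default tag varies between containerd versions:

grep sandbox_image /etc/containerd/config.toml
# adjust the tag below to match whatever the grep above shows
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"#' /etc/containerd/config.toml
systemctl restart containerd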

2 Install Kubernetes components

2.1 Add the Aliyun Kubernetes yum repository

cat >> /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

2.2 kubeadm, kubelet, kubectl

Versions change frequently, so pin the version when installing:
yum -y install kubelet-1.26.5 kubeadm-1.26.5 kubectl-1.26.5
systemctl enable kubelet

IV. Master Node Initialization

Configuration notes

  • kubernetesVersion specifies the Kubernetes version to install
  • localAPIEndpoint must be set to this master node's IP and port; after initialization this is the apiserver address of that node
  • podSubnet, serviceSubnet, and dnsDomain can be left at their defaults; here I changed them to match my own requirements
  • The name field under nodeRegistration should be the hostname of the corresponding master node
  • controlPlaneEndpoint is the highly available apiserver address configured earlier (the k8s-tst-cluster entry in /etc/hosts)
  • criSocket: check which socket to use with systemctl cat docker | grep run (see the runtime/socket table above)
  • An extra KubeProxyConfiguration block enables ipvs mode; see the official documentation for details

If you run into a problem you cannot resolve, wipe the attempt and start over with: kubeadm reset -f ; ipvsadm --clear

cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.1.1.3
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: 10-1-1-3.k8s.host.lenovo.com
  taints: null
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  timeoutForControlPlane: 4m0s
  extraArgs:
    audit-log-path: /var/log/k8s-audit.log
certificatesDir: /etc/kubernetes/pki
clusterName: tst-kubernetes
controlPlaneEndpoint: k8s-tst-cluster:6443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /data/server/etcd
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.26.5
networking:
  dnsDomain: tst-cluster.local
  podSubnet: 172.1.0.0/16
  serviceSubnet: 172.0.0.0/16
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
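
Optionally, pre-pull the control-plane images defined in the config so the init step itself goes faster:

kubeadm config images pull --config kubeadm-config.yaml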

kubeadm init --config kubeadm-config.yaml --upload-certs
# Save the following output; it is needed later when joining nodes
You can now join any number of the control-plane node running the following command on each as root:

kubeadm join k8s-tst-cluster:6443 --token 0j24t3.rqaulkmomt6u10ar \
--discovery-token-ca-cert-hash sha256:5e35746495fcedb50c2db881604d43b46e5857d0493c1f9d0ad4acc35eadb8cf \
--control-plane --certificate-key 7c2695c6791ee290ff70acca2e45e2a154f9b3728658dc513ffbdad14e7feead

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join k8s-tst-cluster:6443 --token 0j24t3.rqaulkmomt6u10ar \
--discovery-token-ca-cert-hash sha256:5e35746495fcedb50c2db881604d43b46e5857d0493c1f9d0ad4acc35eadb8cf

1 Configure kubectl

mkdir -p $HOME/.kube
sudo ln -s /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

1.1 Shell completion

yum install -y bash-completion
echo "source /usr/share/bash-completion/bash_completion" >> ~/.bashrc
echo 'source <(kubectl completion bash)' >>~/.bashrc
source ~/.bashrc
# Define a shorthand alias
cat >>/root/.bashrc<<EOF
alias k=kubectl
complete -F __start_kubectl k
EOF
source /root/.bashrc

2 Deploy Calico

For the CNI we follow the official guide for self-managed Kubernetes. The docs describe two ways to deploy and manage Calico: the Calico operator and Calico manifests. With the operator approach, a Calico operator is deployed into the cluster as a Deployment and then manages the installation, upgrades and the rest of Calico's lifecycle. With the manifests approach, everything is managed directly through YAML files, which is more cumbersome, but has some advantages for highly customized clusters.

Note that in its default configuration Calico runs in what is called "Node-to-Node Mesh" mode: the BGP client on every host peers with the BGP clients on all other nodes to exchange routing information. As the node count N grows, the number of connections grows roughly as N², which puts considerable pressure on the cluster network.
Node-to-Node Mesh is therefore generally recommended only for clusters with fewer than about 100 nodes; larger clusters should use the Route Reflector mode instead. Reference: 如何优雅的使用Calico路由反射模式 (on using Calico's route reflection mode gracefully).
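
For reference only (not needed for this four-node cluster), a route-reflector setup roughly follows the sketch below, based on the Calico BGP documentation; it assumes calicoctl is installed and that 10-1-1-1.k8s.host.lenovo.com is (arbitrarily) chosen as the reflector:

# Give the chosen node a route reflector cluster ID and a label to select it by
calicoctl patch node 10-1-1-1.k8s.host.lenovo.com -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
kubectl label node 10-1-1-1.k8s.host.lenovo.com route-reflector=true

# Peer every node with the route reflector(s) instead of with each other
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
EOF

# Finally, disable the full node-to-node mesh
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  asNumber: 64512
EOF
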

Here we deploy Calico using the operator method.

Use kubectl create here, not kubectl apply: the tigera-operator manifest contains very large CRDs that exceed the size limit of the last-applied-configuration annotation that apply maintains.

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/tigera-operator.yaml
curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O

2.1 Modify the configuration

The main change is setting cidr to the podSubnet configured in kubeadm-config.yaml.

cat custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 172.1.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---

# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}

kubectl create -f custom-resources.yaml
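
Watch until all of the Calico pods report Running before moving on:

watch kubectl get pods -n calico-system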

V. Node Initialization

1 Control-plane nodes

Next, run the join command printed above on the remaining two master nodes. Be sure to use the variant that carries the --control-plane and --certificate-key flags: --control-plane marks the node as a control-plane node, and --certificate-key downloads the certificates that were uploaded to the cluster via --upload-certs during initialization.

On hosts with multiple network interfaces it is recommended to specify --apiserver-advertise-address explicitly.

kubeadm join k8s-tst-cluster:6443 --token zjoy9h.62mwhybt502sye7f \
--discovery-token-ca-cert-hash sha256:1640dcfc5e996864c5ebcfffd8051e8302f8cf455d09fd3cea6de4a299fea770 \
--control-plane --certificate-key 9d0da8e47ae6c1216ed5d464dabd18f1f3913ee6610a7bb20eabb7d5e5f5fb64 --cri-socket=unix:///var/run/cri-dockerd.sock --apiserver-advertise-address=10.1.1.1
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

2 Worker nodes

kubeadm join k8s-tst-cluster:6443 --token zjoy9h.62mwhybt502sye7f \
--discovery-token-ca-cert-hash sha256:1640dcfc5e996864c5ebcfffd8051e8302f8cf455d09fd3cea6de4a299fea770
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

3 Check cluster status

kubectl get node
NAME STATUS ROLES AGE VERSION
10-1-1-1.k8s.host.lenovo.com Ready control-plane 38m v1.26.5
10-1-1-2.k8s.host.lenovo.com Ready control-plane 30m v1.26.5
10-1-1-3.k8s.host.lenovo.com Ready control-plane 44m v1.26.5
10-1-1-4.k8s.host.lenovo.com Ready <none> 48s v1.26.5
[root@10-1-1-3 calico]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-85f9866f46-k8wjt 1/1 Running 0 14m
calico-system calico-node-5wrpz 1/1 Running 0 54s
calico-system calico-node-bht57 1/1 Running 0 28m
calico-system calico-node-plqj7 1/1 Running 0 28m
calico-system calico-node-ww7bk 1/1 Running 0 28m
calico-system calico-typha-767bbcc4f4-twbd7 1/1 Running 0 28m
calico-system calico-typha-767bbcc4f4-zmfbb 1/1 Running 0 53s
calico-system csi-node-driver-7mjrt 2/2 Running 0 28m
calico-system csi-node-driver-gqgtr 2/2 Running 0 28m
calico-system csi-node-driver-q2gh2 2/2 Running 0 28m
calico-system csi-node-driver-xp9g6 2/2 Running 0 54s
kube-system coredns-5bbd96d687-lw5bk 1/1 Running 0 44m
kube-system coredns-5bbd96d687-rtjrd 1/1 Running 0 44m
kube-system etcd-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system etcd-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system etcd-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-apiserver-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 34m
kube-system kube-apiserver-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system kube-apiserver-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-controller-manager-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system kube-controller-manager-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 29m
kube-system kube-controller-manager-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
kube-system kube-proxy-78wrg 1/1 Running 0 38m
kube-system kube-proxy-jvfnd 1/1 Running 0 54s
kube-system kube-proxy-pbv2m 1/1 Running 0 44m
kube-system kube-proxy-ztx74 1/1 Running 0 30m
kube-system kube-scheduler-10-1-1-1.k8s.host.lenovo.com 1/1 Running 0 33m
kube-system kube-scheduler-10-1-1-2.k8s.host.lenovo.com 1/1 Running 0 30m
kube-system kube-scheduler-10-1-1-3.k8s.host.lenovo.com 1/1 Running 0 44m
tigera-operator tigera-operator-5d6845b496-zvjcm 1/1 Running 0 29m
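
Since kube-proxy was configured for ipvs mode, you can also confirm on any node that IPVS virtual servers were created for the service network:

ipvsadm -Ln | head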