如何更新 kubernetes 证书

由于 k8s API server 的证书会在部署一年后过期,并将导致 OpenPAI 的集群不可访问,因此需要在即将到期的时候对其进行更新。 请参考 使用 kubeadm 进行证书管理,以获取更详细的信息。

证书过期的提醒

如果管理员配置好了cert expiration checker, 预设的收件人将会在证书即将过期的时候,受到邮件提醒。 如果证书已经过期,用户则会在OpenPAI的网页上看到如下错误提示: cert expired

创建新的证书和令牌

在master节点上创建新的证书

在master节点上执行以下指令,以创建新的证书:

# On master - See https://kubernetes.io/docs/setup/certificates/#all-certificates
sudo kubeadm alpha certs renew apiserver
sudo kubeadm alpha certs renew apiserver-etcd-client
sudo kubeadm alpha certs renew apiserver-kubelet-client
sudo kubeadm alpha certs renew front-proxy-client

创建新的kube-configs

在master节点上执行以下指令,以创建新的kube-configs:

sudo kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin  > admin.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > controller-manager.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-scheduler > scheduler.conf

# chown and chmod so they match existing files
# please replace <user> to your current user name (e.g. root, core)
sudo chown <user> {admin,controller-manager,kubelet,scheduler}.conf
sudo chmod 600 {admin,controller-manager,kubelet,scheduler}.conf

# Move to replace existing kubeconfigs
sudo mv admin.conf /etc/kubernetes/
sudo mv controller-manager.conf /etc/kubernetes/
sudo mv kubelet.conf /etc/kubernetes/
sudo mv scheduler.conf /etc/kubernetes/

# Restart the master components
sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)

# Verify master component certificates - should all be 1 year in the future
# Cert from api-server
echo -n | openssl s_client -connect localhost:6443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from controller manager
echo -n | openssl s_client -connect localhost:10257 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from scheduler
echo -n | openssl s_client -connect localhost:10259 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

创建新的kubelet证书

在master节点上执行以下指令,以创建新的kubelet.conf文件:

sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
# please replace <user> to your current user name (e.g. root, core)
sudo chown <user> kubelet.conf
sudo chmod 600 kubelet.conf

# Stop kubelet
sudo systemctl stop kubelet
# Delete files
sudo rm /var/lib/kubelet/pki/*
# Copy file
sudo mv kubelet.conf /etc/kubernetes/
# Restart
sudo systemctl start kubelet
# Uncordon
kubectl uncordon $(hostname)

# Check kubelet
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not

为worker节点创建新的令牌

在master节点上执行以下指令,以创建新的令牌:

sudo kubeadm token create

更新worker节点上的证书

使用playbook来对worker节点们进行批量更新操作,根据以下内容创建一个名为renew-worker-certs.yaml的文件,并用上一步生成的令牌替换<The generated token in above step>:

---
- hosts: all
  tasks:
    - name: join k8s
      shell: |
        systemctl stop kubelet
        rm /etc/kubernetes/kubelet.conf
        rm /var/lib/kubelet/pki/*
        sed -i "s/token: .*/token: <The generated token in above step>/" /etc/kubernetes/bootstrap-kubelet.conf
        systemctl start kubelet

如果你没有保存集群的 hosts.yml 文件,请在OpenPAI源代码中执行以下指令,来生成一个新的:

contrib/kubespray/script/k8s_generator.py -l layout.yaml -c config.yaml -o <output_folder>

然后执行以下指令,来更新worker节点的证书:

ansible-playbook -i hosts.yml --limit '!master-node' --become --become-user root renew-worker-cert.yaml

更新结束后删除令牌

在master节点上删除令牌,如果不执行这一步,令牌依然会在24小时后失效。

# On master node
sudo kubeadm token delete TOKEN-FROM-CREATION-ON-MASTER