Legacy version of this post: https://egonlin.com/?p=6618
I. The Kubernetes package yum repositories
Aliyun's mirror documentation for Kubernetes: https://developer.aliyun.com/mirror/kubernetes/
Overview
Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications. It groups the containers that make up an application into logical units for easy management and discovery. Because upstream Kubernetes changed the repository layout and usage, use the new-style configuration below for version 1.28 and later. Download: https://mirrors.aliyun.com/kubernetes/ New-style download: https://mirrors.aliyun.com/kubernetes-new/
Configuration: new-style repository
The new-style kubernetes repository works somewhat differently from the old one; configure it exactly as shown below. The new repository is split into one repo per minor version; the examples below configure v1.28. For any other version, substitute the version string in the corresponding places (e.g. for 1.29, replace v1.28 with v1.29 in the configs below). The mirror currently carries v1.24 - v1.29; later versions will be added over time.
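The version swap described above is a simple substitution. As a sketch, rehearsed against a scratch copy so the real file under /etc/yum.repos.d/ is untouched (the file path here is illustrative):

```shell
# Sketch: switch an Aliyun kubernetes-new repo file from v1.28 to v1.29.
# REPO_FILE is a scratch path; point it at your real repo file on a node.
REPO_FILE=/tmp/kubernetes.repo
cat > "$REPO_FILE" << 'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/repodata/repomd.xml.key
EOF
# Replace every v1.28 occurrence; baseurl and gpgkey must change together
sed -i 's/v1\.28/v1.29/g' "$REPO_FILE"
grep v1.29 "$REPO_FILE"
```

After the swap, both the baseurl and the gpgkey point at the new minor version, which is exactly what the Aliyun docs mean by "replace the string in the corresponding places".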
Debian / Ubuntu
apt-get update && apt-get install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet kubeadm kubectl
CentOS / RHEL / Fedora
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/repodata/repomd.xml.key
EOF
setenforce 0
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
Note: upstream does not expose a sync mechanism, so GPG verification of the repo index may occasionally fail. If it does, install with: yum install -y --nogpgcheck kubelet kubeadm kubectl
Configuration: legacy repository
Because upstream Kubernetes changed the repository layout and usage, the legacy repository only receives updates for part of the 1.28 series;
for later versions, use the new-style configuration above.
Debian / Ubuntu
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
CentOS / RHEL / Fedora
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
setenforce 0
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
Note: upstream does not expose a sync mechanism, so GPG verification of the repo index may occasionally fail. If it does, install with: yum install -y --nogpgcheck kubelet kubeadm kubectl
Related links
Official site: https://kubernetes.io/
II. Preparation
0. Prepare three machines
Each machine needs at least 2 GB of RAM.
1. Set hostnames and hosts entries (all three nodes)
# 1. Set the hostname (run the matching line on each machine)
hostnamectl set-hostname k8s-master-01
hostnamectl set-hostname k8s-node-01
hostnamectl set-hostname k8s-node-02
# 2. Add hosts entries on all three machines
cat >> /etc/hosts << "EOF"
192.168.71.12 k8s-master-01 m1
192.168.71.13 k8s-node-01 n1
192.168.71.14 k8s-node-02 n2
EOF
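Re-running the hosts step above appends duplicate entries to /etc/hosts. A guarded append keeps it idempotent; a sketch against a scratch file (the add_host helper is illustrative, not a standard tool):

```shell
# Sketch: append a cluster entry only when its hostname is not already present.
# HOSTS is a scratch file here; on a real node it would be /etc/hosts.
HOSTS=/tmp/hosts.demo
echo "127.0.0.1 localhost" > "$HOSTS"
add_host() {   # add_host <ip> <hostname> [aliases...] -- hypothetical helper
    grep -q "$2" "$HOSTS" || echo "$*" >> "$HOSTS"
}
add_host 192.168.71.12 k8s-master-01 m1
add_host 192.168.71.13 k8s-node-01 n1
add_host 192.168.71.14 k8s-node-02 n2
add_host 192.168.71.12 k8s-master-01 m1   # second call is a no-op
```

The duplicate call at the end leaves the file unchanged, so the whole preparation script can safely be run more than once.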
2. Disable some services (all three nodes)
# 1. Disable SELinux
sed -i 's#enforcing#disabled#g' /etc/selinux/config
setenforce 0
# 2. Disable firewalld, NetworkManager, and postfix
systemctl disable --now firewalld NetworkManager postfix
# 3. Turn off swap
swapoff -a
# Remove the swap entry from fstab
cp /etc/fstab /etc/fstab_bak
sed -i '/swap/d' /etc/fstab
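The fstab edit above can be rehearsed safely on a copy before touching the real file; a sketch (paths are scratch paths, the sample fstab content is made up for illustration):

```shell
# Sketch: delete swap entries from a scratch copy of fstab.
FSTAB=/tmp/fstab.demo
cat > "$FSTAB" << 'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
EOF
cp "$FSTAB" "$FSTAB.bak"      # keep a backup, as the main text does
sed -i '/swap/d' "$FSTAB"     # drop every line mentioning swap
grep -c swap "$FSTAB.bak"     # the backup still holds the swap line
```

After the sed, only the root filesystem line survives; on a real node this is what keeps swap from coming back after a reboot.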
3. Tune the sshd service
# 1. Speed up SSH logins
sed -ri 's@^#UseDNS yes@UseDNS no@g' /etc/ssh/sshd_config
sed -ri 's#^GSSAPIAuthentication yes#GSSAPIAuthentication no#g' /etc/ssh/sshd_config
grep ^UseDNS /etc/ssh/sshd_config
grep ^GSSAPIAuthentication /etc/ssh/sshd_config
systemctl restart sshd
# 2. Key-based login (master node only): makes the remote copies later on easier
ssh-keygen
ssh-copy-id -i root@k8s-master-01
ssh-copy-id -i root@k8s-node-01
ssh-copy-id -i root@k8s-node-02
4. Raise the open-file limit (log out and back in for it to take effect)
cat > /etc/security/limits.d/k8s.conf <<'EOF'
* soft nofile 65535
* hard nofile 131070
EOF
ulimit -Sn
ulimit -Hn
5. Configure automatic module loading on all nodes (if you skip this, kubeadm init fails outright!)
modprobe br_netfilter
modprobe ip_conntrack
cat >>/etc/rc.sysinit<<"EOF"
#!/bin/bash
for file in /etc/sysconfig/modules/*.modules ; do
[ -x $file ] && $file
done
EOF
echo "modprobe br_netfilter" >/etc/sysconfig/modules/br_netfilter.modules
echo "modprobe ip_conntrack" >/etc/sysconfig/modules/ip_conntrack.modules
chmod 755 /etc/sysconfig/modules/br_netfilter.modules
chmod 755 /etc/sysconfig/modules/ip_conntrack.modules
lsmod | grep br_netfilter
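The rc.sysinit snippet above just executes every executable *.modules script it finds. The mechanism can be sketched in a scratch directory, with stub scripts that echo instead of calling modprobe (all paths here are illustrative):

```shell
# Sketch: the same for-loop, run against stub module scripts under /tmp.
MODDIR=/tmp/modules.demo
mkdir -p "$MODDIR"
printf '#!/bin/sh\necho loaded-%s\n' br_netfilter > "$MODDIR/br_netfilter.modules"
printf '#!/bin/sh\necho loaded-%s\n' ip_conntrack > "$MODDIR/ip_conntrack.modules"
chmod 755 "$MODDIR"/*.modules
# Same pattern as /etc/rc.sysinit: run every executable .modules file
for file in "$MODDIR"/*.modules ; do
    [ -x "$file" ] && "$file"
done > /tmp/modules.out
cat /tmp/modules.out
```

On a real node the stubs are replaced by the modprobe one-liners written in the step above, and the loop re-loads them on every boot.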
6. Synchronize cluster time
# =====================> chrony server: you can run your own server or use a public NTP server directly, so deploying the server side is optional
# 1. Install
yum -y install chrony
# 2. Edit the config
mv /etc/chrony.conf /etc/chrony.conf.bak
cat > /etc/chrony.conf << EOF
server ntp1.aliyun.com iburst minpoll 4 maxpoll 10
server ntp2.aliyun.com iburst minpoll 4 maxpoll 10
server ntp3.aliyun.com iburst minpoll 4 maxpoll 10
server ntp4.aliyun.com iburst minpoll 4 maxpoll 10
server ntp5.aliyun.com iburst minpoll 4 maxpoll 10
server ntp6.aliyun.com iburst minpoll 4 maxpoll 10
server ntp7.aliyun.com iburst minpoll 4 maxpoll 10
driftfile /var/lib/chrony/drift
makestep 10 3
rtcsync
allow 0.0.0.0/0
local stratum 10
keyfile /etc/chrony.keys
logdir /var/log/chrony
stratumweight 0.05
noclientlog
logchange 0.5
EOF
# 3. Start chronyd
systemctl restart chronyd.service # restart rather than start, so the new config is loaded whether or not it was already running
systemctl enable chronyd.service
systemctl status chronyd.service
# =====================> chrony client: install on every machine that needs to sync time; once started it syncs against the server you configured
# The following can be pasted into each client in one go
# 1. Install chrony
yum -y install chrony
# 2. Edit the client config
mv /etc/chrony.conf /etc/chrony.conf.bak
cat > /etc/chrony.conf << EOF
server <server IP or resolvable hostname> iburst
driftfile /var/lib/chrony/drift
makestep 10 3
rtcsync
local stratum 10
keyfile /etc/chrony.keys
logdir /var/log/chrony
stratumweight 0.05
noclientlog
logchange 0.5
EOF
# 3. Start chronyd
systemctl restart chronyd.service
systemctl enable chronyd.service
systemctl status chronyd.service
# 4. Verify
chronyc sources -v
7. Refresh the base yum repos (all three machines)
# 1. Clean out the old repos
rm -rf /etc/yum.repos.d/*
yum remove epel-release -y
rm -rf /var/cache/yum/x86_64/6/epel/
# 2. Install Aliyun's base and epel repos
curl -s -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -s -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum clean all
yum makecache
# Huawei's mirrors work too:
# curl -o /etc/yum.repos.d/CentOS-Base.repo https://repo.huaweicloud.com/repository/conf/CentOS-7-reg.repo
# yum install -y https://repo.huaweicloud.com/epel/epel-release-latest-7.noarch.rpm
8. Update system packages (excluding the kernel)
yum update -y --exclude=kernel*
9. Install common base packages
yum -y install expect wget jq psmisc vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git ntpdate chrony bind-utils rsync unzip
10. Upgrade the kernel (Docker is demanding about the kernel version; 4.4+ is best). Run on the master node
wget https://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-5.4.274-1.el7.elrepo.x86_64.rpm
wget https://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-devel-5.4.274-1.el7.elrepo.x86_64.rpm
for i in n1 n2 m1 ; do scp kernel-lt-* $i:/opt; done
# If the download is slow, grab the files from the netdisk instead
# Link: https://pan.baidu.com/s/1gVyeBQsJPZjc336E8zGjyQ  Code: Egon
Run on all three nodes
# Install
yum localinstall -y /opt/kernel-lt*
# Make the new kernel the default boot entry
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
# Check the current default kernel
grubby --default-kernel
# Reboot
reboot
11. Install IPVS on all three nodes
# 1. Install ipvsadm and related tools
yum -y install ipvsadm ipset sysstat conntrack libseccomp
# 2. Configure module loading
cat > /etc/sysconfig/modules/ipvs.modules <<"EOF"
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack"
for kernel_module in ${ipvs_modules}; do
    /sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        /sbin/modprobe ${kernel_module}
    fi
done
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep ip_vs
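Since a typo in the generated ipvs.modules script only bites at the next boot, it is worth at least syntax-checking it after writing. A sketch, using a scratch copy with an abridged module list (the real script lists all the ip_vs schedulers):

```shell
# Sketch: write an abridged copy of the module-loading script to /tmp
# and syntax-check it with bash -n (parses without executing, so no
# kernel modules are touched).
cat > /tmp/ipvs.modules << 'EOF'
#!/bin/bash
ipvs_modules="ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack"
for kernel_module in ${ipvs_modules}; do
    /sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        /sbin/modprobe ${kernel_module}
    fi
done
EOF
bash -n /tmp/ipvs.modules && echo "syntax OK"
```

On a real node you would run the same `bash -n` against /etc/sysconfig/modules/ipvs.modules before trusting it, then execute it and confirm with `lsmod | grep ip_vs` as the step above does.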
12. Tune kernel parameters on all three machines
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.netfilter.nf_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
# Apply immediately
sysctl --system
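Mistyped sysctl keys (for example net.ipv4.tcp.max_orphans instead of net.ipv4.tcp_max_orphans) are only reported, not fixed, by `sysctl --system`, so a quick lint against /proc/sys catches them early. A sketch, assuming a Linux host and using a scratch config (the check_sysctl_keys helper is illustrative):

```shell
# Sketch: flag sysctl keys the running kernel does not expose under /proc/sys.
# Keys whose module is not loaded yet (e.g. net.bridge.*) are also reported,
# so treat the output as a review list, not a hard failure.
check_sysctl_keys() {
    awk -F= '/=/ {gsub(/ /,"",$1); print $1}' "$1" | while read -r key; do
        path="/proc/sys/$(echo "$key" | tr . /)"   # dots become path slashes
        [ -e "$path" ] || echo "unknown key: $key"
    done
}
cat > /tmp/k8s-sysctl.conf << 'EOF'
net.ipv4.ip_forward = 1
net.ipv4.tcp.max_orphans = 327680
EOF
check_sysctl_keys /tmp/k8s-sysctl.conf > /tmp/sysctl-lint.out
cat /tmp/sysctl-lint.out
```

The deliberately broken second key is flagged, while the valid ip_forward key passes silently; on a real node you would point the function at /etc/sysctl.d/k8s.conf.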
III. Install containerd (all three nodes)
Since Kubernetes 1.24, K8s no longer supports Docker natively.
containerd originated inside Docker and was later donated by Docker to the Cloud Native Computing Foundation (installing Docker also installs containerd alongside it).
1. CentOS 7 ships libseccomp 2.3.1, which does not meet containerd's requirements; you need 2.4 or newer. Version 2.5.1 is deployed here.
# 1. If you do not upgrade libseccomp, starting a container fails with:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/ed17cbdc31099314dc8fd609d52b0dfbd6fdf772b78aa26fbc9149ab089c6807/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown
# 2. Upgrade
rpm -e libseccomp-2.3.1-4.el7.x86_64 --nodeps
# wget http://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/libseccomp-2.5.1-1.el8.x86_64.rpm
wget https://mirrors.aliyun.com/centos/8/BaseOS/x86_64/os/Packages/libseccomp-2.5.1-1.el8.x86_64.rpm
rpm -ivh libseccomp-2.5.1-1.el8.x86_64.rpm # the upstream CentOS repo is gone and no longer updated; use the Aliyun mirror
rpm -qa | grep libseccomp
Install method 1 (from the Aliyun repo) — recommended
# 1. Remove any earlier install
yum remove docker docker-ce containerd docker-common docker-selinux docker-engine -y
# 2. Set up the repo
cd /etc/yum.repos.d/
wget http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 3. Install
yum install containerd* -y
For reference — install method 2: grab the latest release from upstream (the newer 1.7.18 tarball does not bundle the runc runtime, which must be installed separately; the older 1.6.4 bundle includes runc, so no extra install is needed)
# 1. Download containerd: https://github.com/containerd/containerd/releases/
wget https://github.com/containerd/containerd/releases/download/v1.7.18/containerd-1.7.18-linux-amd64.tar.gz
# Mirror for downloading inside China: https://gitee.com/egonlin/containerd-1.7.18
# Note: the newer 1.7.18 tarball lacks the runc runtime, which must be installed separately
# whereas wget https://github.com/containerd/containerd/releases/download/v1.6.4/cri-containerd-cni-1.6.4-linux-amd64.tar.gz bundles containerd plus the CRI tools, runc, and friends
# 2. Just unpack it
tar zxvf containerd-1.7.18-linux-amd64.tar.gz -C /usr # the binaries land in /usr/bin, so no PATH changes are needed
# 3. Add a systemd unit yourself
cat > /usr/lib/systemd/system/containerd.service << "EOF"
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment out TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
# 4. Reload systemd
systemctl daemon-reload
# 5. Note: the 1.7.18 tarball lacks the runc runtime, which must be installed separately; see the appendix
Configuration
# 1. Generate a config
mkdir -pv /etc/containerd
containerd config default > /etc/containerd/config.toml # generate containerd's default config file
# 2. Replace the default pause image address — this step is extremely important:
# from inside China the default registry address is often unreachable, the pull fails,
# and kubeadm ultimately fails because of it!
grep sandbox_image /etc/containerd/config.toml
sed -i 's/registry.k8s.io/registry.cn-hangzhou.aliyuncs.com\/google_containers/' /etc/containerd/config.toml
grep sandbox_image /etc/containerd/config.toml
# Make sure the new address actually works: sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"
# 3. Use systemd as the container cgroup driver
grep SystemdCgroup /etc/containerd/config.toml
sed -i 's/SystemdCgroup \= false/SystemdCgroup \= true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml
# 4. Configure registry mirrors (required; otherwise the CNI network plugin images cannot be pulled from docker.io later)
# Reference: https://github.com/containerd/containerd/blob/main/docs/cri/config.md#registry-configuration
# Set config_path = "/etc/containerd/certs.d"
sed -i 's/config_path\ =.*/config_path = \"\/etc\/containerd\/certs.d\"/g' /etc/containerd/config.toml
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://docker.io"
[host."https://dockerproxy.com"]
  capabilities = ["pull", "resolve"]
[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
[host."https://reg-mirror.qiniu.com"]
  capabilities = ["pull", "resolve"]
[host."https://registry.docker-cn.com"]
  capabilities = ["pull", "resolve"]
[host."http://hub-mirror.c.163.com"]
  capabilities = ["pull", "resolve"]
EOF
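The three config.toml edits above (sandbox image, SystemdCgroup, config_path) can be rehearsed on a miniature stand-in before touching the real file; a sketch (the sample file and pause tag are illustrative, the real /etc/containerd/config.toml is far larger):

```shell
# Sketch: the same three sed edits, applied to a minimal stand-in
# for /etc/containerd/config.toml.
CFG=/tmp/config.toml
cat > "$CFG" << 'EOF'
sandbox_image = "registry.k8s.io/pause:3.9"
SystemdCgroup = false
config_path = ""
EOF
sed -i 's/registry.k8s.io/registry.cn-hangzhou.aliyuncs.com\/google_containers/' "$CFG"
sed -i 's/SystemdCgroup \= false/SystemdCgroup \= true/' "$CFG"
sed -i 's/config_path\ =.*/config_path = \"\/etc\/containerd\/certs.d\"/g' "$CFG"
cat "$CFG"
```

Grepping for sandbox_image, SystemdCgroup, and config_path afterwards, as the main text does, is the cheapest way to confirm all three edits actually landed before restarting containerd.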
# 5. Enable containerd at boot
# 5.1 Start containerd and enable it at boot
systemctl daemon-reload && systemctl restart containerd
systemctl enable --now containerd
# 5.2 Check containerd's status
systemctl status containerd
# 5.3 Check containerd's version
ctr version
------------------------- Docker configuration (for reference only — not needed here, since k8s 1.30 talks to containerd directly)
# 1. Configure docker
# The cgroup driver must match kubelet's, otherwise kubelet will fail to start later
cat > /etc/docker/daemon.json << "EOF"
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors":["https://reg-mirror.qiniu.com/"]
}
EOF
# 2. Restart docker
systemctl restart docker.service
systemctl enable docker.service
# 3. Verify
[root@k8s-master-01 ~]# docker info |grep -i cgroup
 Cgroup Driver: systemd
 Cgroup Version: 1
IV. Install Kubernetes
Docs: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-init/
1. Configure the k8s repo on all three machines
cat > /etc/yum.repos.d/kubernetes.repo << "EOF"
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/rpm/repodata/repomd.xml.key
EOF
# Reference: https://developer.aliyun.com/mirror/kubernetes/
setenforce 0
yum install -y kubelet-1.30* kubeadm-1.30* kubectl-1.30*
systemctl enable kubelet && systemctl start kubelet && systemctl status kubelet
2. Master-node steps (do not run these on the worker nodes)
Initialize the master node (master only):
# You can list the required images first
[root@k8s-master-01 ~]# kubeadm config images list
registry.k8s.io/kube-apiserver:v1.30.0
registry.k8s.io/kube-controller-manager:v1.30.0
registry.k8s.io/kube-scheduler:v1.30.0
registry.k8s.io/kube-proxy:v1.30.0
registry.k8s.io/coredns/coredns:v1.11.1
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.12-0
Deployment method 1: generate a config file, edit it, then deploy (recommended — advanced settings can only be expressed through the config file; a bare kubeadm init as in method 2 cannot set them, e.g. enabling ipvs mode)
kubeadm config print init-defaults > kubeadm.yaml # generate the config file first; the content and the edits are shown below
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.71.12 # control-plane node IP
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock # use the containerd runtime
  imagePullPolicy: IfNotPresent
  name: k8s-master-01
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers # switch to the Aliyun mirror repository
kind: ClusterConfiguration
kubernetesVersion: 1.30.0 # the k8s version
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12 # the Service CIDR
  podSubnet: 10.244.0.0/16 # added line: the pod CIDR
scheduler: {}
# Append the following at the end of the file (include the --- separators when copying):
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy runs in ipvs mode; without this it defaults to iptables, which is less efficient, so ipvs is recommended in production — Aliyun's and Huawei Cloud's managed K8s also offer ipvs mode
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
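Before running kubeadm init it is cheap to confirm the customizations actually landed in the file. A sketch, run against a scratch file containing just the lines of interest (on a real node you would point YAML at your edited kubeadm.yaml instead):

```shell
# Sketch: verify the four customizations are present in kubeadm.yaml.
YAML=/tmp/kubeadm.yaml
cat > "$YAML" << 'EOF'
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
  podSubnet: 10.244.0.0/16
mode: ipvs
cgroupDriver: systemd
EOF
for want in 'imageRepository: registry.cn-hangzhou' 'podSubnet: 10.244.0.0/16' 'mode: ipvs' 'cgroupDriver: systemd'; do
    grep -q "$want" "$YAML" && echo "ok: $want" || echo "MISSING: $want"
done > /tmp/kubeadm.check
cat /tmp/kubeadm.check
```

Any MISSING line means an edit was forgotten; catching that here is much cheaper than a failed kubeadm init followed by a reset.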
Deploy
[root@k8s-master-01 ~]# kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification --ignore-preflight-errors=Swap
Deployment method 2: pass everything on the command line (the proxy mode cannot be chosen this way; it stays at the default, iptables)
# Initialize
kubeadm init \
  --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --kubernetes-version=v1.30.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
# --image-repository=registry.cn-hangzhou.aliyuncs.com/k8sos also works for older versions, but not for newer ones
# Optional: --apiserver-advertise-address=192.168.71.12 # for an HA deployment, point this at the VIP instead
Result
......
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
  export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.71.12:6443 --token 9hovhy.vxm1l7zs16zr53ve \
    --discovery-token-ca-cert-hash sha256:3b210d53b7f26a43ccf251cfb9f809f280048ab70bf5c1458c69586ed0eb9905
Check the nodes: NotReady at first is normal — the network add-on is not deployed yet
[root@k8s-master-01 ~]# kubectl get nodes
NAME            STATUS     ROLES           AGE     VERSION
k8s-master-01   NotReady   control-plane   4m26s   v1.30.0
[root@k8s-master-01 ~]# kubectl -n kube-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-7c445c467-mfls7                 0/1     Pending   0          6m30s
coredns-7c445c467-zvkkw                 0/1     Pending   0          6m30s
etcd-k8s-master-01                      1/1     Running   0          6m44s
kube-apiserver-k8s-master-01            1/1     Running   0          6m44s
kube-controller-manager-k8s-master-01   1/1     Running   0          6m44s
kube-proxy-jhxrd                        1/1     Running   0          109s
kube-proxy-nh7tj                        1/1     Running   0          33s
kube-proxy-q92mx                        1/1     Running   0          6m30s
kube-scheduler-k8s-master-01            1/1     Running   0          6m44s
3. Join the worker nodes
Run this on the other two worker nodes:
kubeadm join 192.168.71.12:6443 --token 9hovhy.vxm1l7zs16zr53ve \
    --discovery-token-ca-cert-hash sha256:3b210d53b7f26a43ccf251cfb9f809f280048ab70bf5c1458c69586ed0eb9905
4. Deploy the network add-on
Download the network add-on manifest
wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
[root@master01 flannel]# vim kube-flannel.yml
apiVersion: v1
data:
  ...
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",    # must match --pod-network-cidr
      "Backend": {
        "Type": "vxlan"
      }
    }
[root@k8s-master-01 ~]# cat kube-flannel.yml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-ds
  namespace: kube-flannel
spec:
  selector:
    matchLabels:
      app: flannel
      k8s-app: flannel
  template:
    metadata:
      labels:
        app: flannel
        k8s-app: flannel
        tier: node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - args:
        - --ip-masq
        - --kube-subnet-mgr
        command:
        - /opt/bin/flanneld
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        image: docker.io/flannel/flannel:v0.25.1
        name: kube-flannel
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
        volumeMounts:
        - mountPath: /run/flannel
          name: run
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
        - mountPath: /run/xtables.lock
          name: xtables-lock
      hostNetwork: true
      initContainers:
      - args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        command:
        - cp
        image: docker.io/flannel/flannel-cni-plugin:v1.4.0-flannel1
        name: install-cni-plugin
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-plugin
      - args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        command:
        - cp
        image: docker.io/flannel/flannel:v0.25.1
        name: install-cni
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
      priorityClassName: system-node-critical
      serviceAccountName: flannel
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /run/flannel
        name: run
      - hostPath:
          path: /opt/cni/bin
        name: cni-plugin
      - hostPath:
          path: /etc/cni/net.d
        name: cni
      - configMap:
          name: kube-flannel-cfg
        name: flannel-cfg
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
Deploy (on master01 only)
kubectl apply -f kube-flannel.yml
kubectl -n kube-flannel get pods
kubectl -n kube-flannel get pods -w
[root@k8s-master-01 ~]# kubectl get nodes # all Ready
[root@k8s-master-01 ~]# kubectl -n kube-system get pods # both coredns pods are Ready too
5. Set up kubectl command completion (run on all nodes)
yum install bash-completion* -y
kubectl completion bash > ~/.kube/completion.bash.inc
echo "source '$HOME/.kube/completion.bash.inc'" >> $HOME/.bash_profile
source $HOME/.bash_profile
6. Miscellaneous issues
(1) Missing default route
route add default gw xxx.xxx.xxx.xxx dev <interface>
(2) Warnings about a proxy; these interfere with the install and show up as Forbidden errors in components such as the scheduler. Silence the warning like this:
# Step 1
vim /etc/profile
export no_proxy=127.0.0.1,<this machine's IP>
# Step 2
source /etc/profile
(3) kubeadm pulls images from k8s.gcr.io by default; point it at a reachable registry instead
(4) cgroup driver mismatch (systemd)
Step 1: edit the Docker config and restart the service
Edit /etc/docker/daemon.json to add the option, then restart docker:
{
  ......
  "exec-opts": ["native.cgroupdriver=systemd"],
  ......
}
Caution!!! Only restart docker if daemon.json has live-restore: true; otherwise the restart takes the running containers down with it.
Step 2: edit the kubelet config
[root@jsswx191 ~]# vi /var/lib/kubelet/config.yaml
......
cgroupDriver: systemd
[root@jsswx191 ~]# vi /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1"
Step 3: restart docker and kubelet
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet
Step 4: check that swap is off
swapoff -a
Step 5: final checks
[root@jsswx191 ~]# docker info|grep "Cgroup Driver"
# should print: Cgroup Driver: systemd
[root@xxx ~]# ps aux |grep /usr/bin/kubelet |grep -v grep
root     581806 17.6  0.0 5633952 131056 ?      Ssl  14:27   9:05 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --fail-swap-on=false
(5) kubeadm init times out and pause containers are endlessly re-Created: possibly Docker's host network is broken. To debug:
docker container ls -a |grep pause
docker inspect <container ID>
If Docker's host network turns out to be missing, the pause container cannot be created and none of the component containers come up afterwards; rebuild the Docker network by hand.
(6) kubectl not usable
cat /etc/profile
ls /etc/profile.d/
There is a series of scripts under /etc/profile.d/; one of them may have aliased kubectl. The fix is shown in the screenshot in the original post.
(7) After kubeadm init, some pods stay Pending
kubectl describe pod may show:
  3 node(s) had taints that the pod didn't tolerate.
By default Kubernetes refuses to schedule pods on the master node for safety reasons; remove the master taint with:
kubectl taint nodes --all node-role.kubernetes.io/master-
(8) The etcd container dies, which then takes the apiserver down: start with docker inspect <container ID>. One possible cause:
the apiserver is listening on the wrong address; check that the address configured in /etc/hosts is pingable, then kubeadm reset -f and re-run kubeadm init ...
(9) Out of disk space
systemctl status kubelet # error: must evict pod(s) to reclaim ephemeral-storage
df -Th # check the remaining disk space
If /etc/docker/daemon.json has live-restore: true, you can safely do the steps below and then re-run kubeadm init ...; otherwise restarting docker kills the containers, so check with your team before proceeding.
1. Attach a new disk mounted at /data; its filesystem must match the original Docker image directory /var/lib/docker
2. systemctl stop docker
3. mv /var/lib/docker /data/
4. ln -s /data/docker /var/lib/docker
5. systemctl start docker
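The mv-plus-symlink migration in steps 3-4 can be rehearsed on scratch directories first; a sketch (all paths here are throwaway stand-ins for /var/lib/docker and /data):

```shell
# Sketch: relocate a data directory and leave a symlink behind,
# the same pattern as moving /var/lib/docker to /data/docker.
rm -rf /tmp/var-lib-docker /tmp/data      # clean scratch state
OLD=/tmp/var-lib-docker
NEW=/tmp/data
mkdir -p "$OLD" "$NEW"
echo "image-layer" > "$OLD/layer1"
mv "$OLD" "$NEW/"                         # /tmp/data now holds the moved directory
ln -s "$NEW/var-lib-docker" "$OLD"        # the old path becomes a symlink
readlink "$OLD"
cat "$OLD/layer1"                         # the data is still reachable through the old path
```

Because the old path resolves through the symlink, Docker (stopped during the move) starts back up with its data dir apparently unchanged while the bytes live on the new disk.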
(10) An error complaining about the port range
# Changing the NodePort range with kubeadm
The default NodePort range in kubernetes is 30000-32767; change it if you need something outside that interval.
1. The static pod manifests live under /etc/kubernetes/manifests/
2. Find the file named kube-apiserver.yaml (it may also be in JSON format)
3. Add the setting service-node-port-range=1024-65535
(11) kubectl get pods fails, pointing at an unknown IP
# The error looks like this
[xxx@xxx ~]$ kubectl get pods
The connection to the server 172.111.66.53:6443 was refused - did you specify the right host or port?
# A proxy may have been set in the environment; check with
env |grep -i proxy
# then
unset http_proxy
unset https_proxy
# finally find where the variables are set and fix it there
vi /etc/profile
...
source /etc/profile
(12) Troubleshooting command roundup:
systemctl status kubelet
systemctl status docker
docker container ls | grep <k8s component> # each component container is paired with a pause container
tail -f /var/log/messages # watch the docker/kubelet service logs
journalctl -u docker --no-pager # all logs
journalctl -u docker -n 200 # last 200 lines (paged)
journalctl -u docker -n 200 --no-pager # last 200 lines (unpaged)
docker container ls -a | grep pause
docker inspect <container ID>
docker logs <container ID>
(13) Clean up the cluster, then re-run kubeadm init ...
kubeadm reset -f
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
# Optional:
yum clean all
yum remove kube*
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
V. Tear the k8s environment down and redeploy
If a deployment goes wrong, you can raze the whole k8s environment and redeploy. The preparation done before Chapter III is left alone; the commands below only clean up what Chapter IV set up.
# ==============================> Tear down
# On the master node
kubeadm reset -f
# On every node, master included, run the following
cd /tmp # a file in the current directory can share a name with a package being removed and break the uninstall, so switch directories first
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
rm -rf /run/flannel
rm -rf /etc/cni/net.d
rm -rf /run/xtables.lock
systemctl stop kubelet
yum remove kube* -y
for i in `df |grep kubelet |awk '{print $NF}'`;do umount -l $i ;done # unmount all kubelet mounts first, or the next command cannot delete the directory
rm -rf /var/lib/kubelet
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
iptables -F
reboot # reboot and start over
# ==============================> Then redeploy
# Step 1: on all nodes
yum install -y kubelet-1.30* kubeadm-1.30* kubectl-1.30*
systemctl enable kubelet && systemctl start kubelet && systemctl status kubelet
# Step 2: on the master node only
[root@k8s-master-01 ~]# kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification --ignore-preflight-errors=Swap
# Step 3: deploy the network add-on
kubectl apply -f kube-flannel.yml