1. Affinity Configuration in Detail¶
1.1 Node Affinity Configuration in Detail¶
1. Example YAML file
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - az-2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: nginx   # the original omits the image; nginx is assumed here
The Pod above can only be scheduled onto nodes that carry the label key kubernetes.io/e2e-az-name with value e2e-az1 or az-2. Because a soft affinity is also configured, among the nodes that satisfy that requirement the scheduler will try to pick one labeled another-node-label-key=another-node-label-value. That preference is not mandatory: nodes without the label can still host the Pod.
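For the Pod above to be schedulable at all, at least one node must carry the required label. A minimal sketch, assuming the node is named k8s-node01:
kubectl label node k8s-node01 kubernetes.io/e2e-az-name=e2e-az1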
The parameters are explained as follows:
(1) requiredDuringSchedulingIgnoredDuringExecution: hard affinity configuration
- nodeSelectorTerms: node selector terms; multiple matchExpressions entries may be configured (satisfying any one is enough), each matchExpressions may contain multiple key/value selectors (all of which must be satisfied), and values may list multiple entries (satisfying any one is enough)
(2) preferredDuringSchedulingIgnoredDuringExecution: soft affinity configuration
- weight: the weight of the soft affinity term; the higher the weight, the higher the priority, range 1-100
- preference: a soft affinity term, at the same level as weight; multiple terms may be configured, and matchExpressions works the same as in hard affinity
(3) operator: how labels are matched (see the sketch after this list)
- In: equivalent to key = value
- NotIn: equivalent to key != value
- Exists: the node has a label with the given key, whatever its value; the values field must not be set
- DoesNotExist: the node does not have a label with the given key; the values field must not be set
- Gt: the label value is greater than the specified value (compared as integers)
- Lt: the label value is less than the specified value (compared as integers)
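A minimal sketch of Exists and Gt inside nodeSelectorTerms (the labels disktype and cpu-count are assumptions made up for illustration):
nodeSelectorTerms:
- matchExpressions:
  - key: disktype        # matches any node that has a disktype label, regardless of its value
    operator: Exists
  - key: cpu-count       # matches nodes whose cpu-count label, read as an integer, exceeds 8
    operator: Gt
    values:
    - "8"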
Note: if both nodeSelector and nodeAffinity are specified, both must be satisfied for the Pod to be scheduled; if multiple nodeSelectorTerms are configured, satisfying any one of them is enough to schedule onto a node; if multiple matchExpressions are configured, all of them must be satisfied; and if a label is later removed from the node a Pod was scheduled to, the Pod is not deleted. In other words, affinity only takes effect at scheduling time.
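A minimal sketch of nodeSelector combined with nodeAffinity, to make the "both must be satisfied" rule concrete (the disktype=ssd label is an assumption):
spec:
  nodeSelector:
    disktype: ssd        # requirement 1: the node must carry disktype=ssd
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name   # requirement 2: the node must also be in one of these zones
            operator: In
            values:
            - e2e-az1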
1.2 Pod Affinity and Anti-Affinity Configuration in Detail¶
NodeAffinity schedules based on node labels: it lets a Pod be placed on, or kept off, nodes carrying particular labels. Pod affinity and anti-affinity instead match against the labels of other Pods. For example, if the Pods of service A must not run on the same node as Pods labeled service=b, Pod anti-affinity can express that; the scheduling decision is based on Pod labels rather than node labels.
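A minimal sketch of that example (the service label is taken from the text; everything else is assumed). With topologyKey set to kubernetes.io/hostname, "the same domain" means "the same node":
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: service
          operator: In
          values:
          - b
      topologyKey: kubernetes.io/hostname   # one domain per node: never share a node with service=b Pods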
1. Example YAML file
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          namespaces:
          - default
          topologyKey: failure-domain.beta.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: nginx
The parameters are explained as follows:
- labelSelector: Pod selector configuration; multiple selectors may be configured
- matchExpressions: same as in node affinity
- operator: same as in node affinity, except that Gt and Lt are not supported
- topologyKey: the key of the topology domain to match, i.e. a label key on the nodes; nodes with the same key and value belong to the same domain, which can be used to mark different machine rooms or regions
- namespaces: which namespaces' Pods to match against; if empty, the Pod's own namespace is used
The Pod affinity above is a hard affinity: the Pod must be placed in the same domain as Pods labeled security=S1, where the domain is defined by failure-domain.beta.kubernetes.io/zone. A Pod anti-affinity is configured at the same time, preferring not to share a domain with Pods labeled security=S2. So within the failure-domain.beta.kubernetes.io/zone domains, this Pod must run alongside security=S1 Pods, and will avoid security=S2 Pods where possible.
Because Pod labels are part of the Pod itself, they are namespaced just like Pods are. By default, Pod affinity and anti-affinity therefore only match Pods in the same namespace; to match the labels of Pods in other namespaces, the namespaces field must be set explicitly.
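A minimal sketch of matching Pods in another namespace (the prod namespace is an assumption):
podAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: security
        operator: In
        values:
        - S1
    namespaces:          # without this field, only the Pod's own namespace is searched
    - prod
    topologyKey: failure-domain.beta.kubernetes.io/zone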
Note: Pod affinity and anti-affinity require a large amount of computation and slow down scheduling in large clusters, so heavy use of Pod affinity is not recommended once a cluster exceeds several hundred nodes.
2. Topology Domains (TopologyKey) in Detail¶
A topology domain applies mainly to the hosts: it divides them into regions. Membership is decided by labels; nodes whose label key and value both match belong to the same topology domain, while different keys or different values mean different domains.
The following demonstrates deploying one application across multiple regions:
1. Label the nodes by region: k8s-master01 and k8s-master02 belong to the daxing region, k8s-master03 and k8s-node01 to the chaoyang region, and k8s-node02 to the xx region
[root@k8s-master01 study]# kubectl label node k8s-master01 k8s-master02 region=daxing
[root@k8s-master01 study]# kubectl label node k8s-master03 k8s-node01 region=chaoyang
[root@k8s-master01 study]# kubectl label node k8s-node02 region=xx
2. Verify that the labels have been applied
[root@k8s-master01 study]# kubectl get node -lregion --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-master01 Ready control-plane,master 8d v1.23.14 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,region=daxing,ssd=true
k8s-master02 Ready control-plane,master 8d v1.23.14 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master02,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,region=daxing
k8s-master03 Ready control-plane,master 8d v1.23.14 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master03,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,region=chaoyang
k8s-node01 Ready <none> 8d v1.23.14 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,region=chaoyang
k8s-node02 Ready <none> 8d v1.23.14 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node02,kubernetes.io/os=linux,region=xx,type=physical
3. Define a YAML file
[root@k8s-master01 study]# vim podAntiAffinity03.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: must-be-diff-zone
  name: must-be-diff-zone
  namespace: kube-public
spec:
  replicas: 3
  selector:
    matchLabels:
      app: must-be-diff-zone
  template:
    metadata:
      labels:
        app: must-be-diff-zone
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - must-be-diff-zone
            topologyKey: region
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/zq-demo/nginx:1.14.2
        imagePullPolicy: IfNotPresent
        name: must-be-diff-zone
4. Deploy
[root@k8s-master01 study]# kubectl create -f podAntiAffinity03.yaml
5. Check the Pods' scheduling status
[root@k8s-master01 study]# kubectl get po -n kube-public -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
must-be-diff-zone-79dfd48799-2nlz5 1/1 Running 0 44s 172.17.125.4 k8s-node01 <none> <none>
must-be-diff-zone-79dfd48799-bbkk2 0/1 Pending 0 44s <none> <none> <none> <none>
must-be-diff-zone-79dfd48799-qf4vg 1/1 Running 0 44s 172.27.14.199 k8s-node02 <none> <none>
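The third replica stays Pending: the anti-affinity rule forbids a second must-be-diff-zone Pod in the chaoyang and xx regions, which leaves only daxing, and both daxing nodes are tainted masters. This can be confirmed from the Pod's events (command only; the exact output varies by cluster):
kubectl describe po -n kube-public must-be-diff-zone-79dfd48799-bbkk2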
6. Check the taints
[root@k8s-master01 study]# kubectl describe node | grep Taint
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: <none>
Taints: <none>
7. Remove the taints
[root@k8s-master01 study]# kubectl taint node -l node-role.kubernetes.io/master node-role.kubernetes.io/master:NoSchedule-
node/k8s-master01 untainted
node/k8s-master02 untainted
node/k8s-master03 untainted
[root@k8s-master01 study]# kubectl describe node | grep Taint
Taints: <none>
Taints: <none>
Taints: <none>
Taints: <none>
Taints: <none>
8. Check the Pods again; the application is now deployed across multiple regions
[root@k8s-master01 study]# kubectl get po -n kube-public -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
must-be-diff-zone-79dfd48799-2nlz5 1/1 Running 0 37m 172.17.125.4 k8s-node01 <none> <none>
must-be-diff-zone-79dfd48799-bbkk2 1/1 Running 0 37m 172.25.244.193 k8s-master01 <none> <none>
must-be-diff-zone-79dfd48799-qf4vg 1/1 Running 0 37m 172.27.14.199 k8s-node02 <none> <none>
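To restore the cluster to its previous state after the demo, the master taint can be re-applied with the same label selector used in step 7 (a sketch; adjust if your masters use a different taint):
kubectl taint node -l node-role.kubernetes.io/master node-role.kubernetes.io/master:NoSchedule
If replicas must never be left Pending when the domains run out, preferredDuringSchedulingIgnoredDuringExecution can be used instead of requiredDuringSchedulingIgnoredDuringExecution, turning the multi-region spread into a preference rather than a hard rule.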