1. Parameter tuning

The following two kube-apiserver flags limit the number of requests in flight:

--max-mutating-requests-inflight int
    The maximum number of mutating requests in flight at a given time.
    When the server exceeds this, it rejects requests.
    Zero for no limit. (default 200)

--max-requests-inflight int
    The maximum number of non-mutating requests in flight at a given time.
    When the server exceeds this, it rejects requests.
    Zero for no limit. (default 400)
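
To make the "rejects requests" and "zero for no limit" wording concrete, here is a toy model of the admission decision. It is only a sketch and not how kube-apiserver is implemented: the names try_admit, release, MAX_INFLIGHT and inflight are hypothetical, and a plain counter stands in for the server's bookkeeping. A request that is turned away would be answered with HTTP 429 (Too Many Requests).

```shell
#!/bin/sh
# Toy model of an in-flight limit (hypothetical names, not apiserver code).
# MAX_INFLIGHT=0 means "no limit", mirroring the flag help text.
MAX_INFLIGHT="${MAX_INFLIGHT:-2}"
inflight=0

# try_admit: returns 0 and counts the request if there is room,
# returns 1 (the caller would answer 429) if the limit is already reached.
try_admit() {
    if [ "$MAX_INFLIGHT" -ne 0 ] && [ "$inflight" -ge "$MAX_INFLIGHT" ]; then
        return 1
    fi
    inflight=$((inflight + 1))
}

# release: called when a request finishes, freeing one slot.
release() {
    if [ "$inflight" -gt 0 ]; then
        inflight=$((inflight - 1))
    fi
}
```

With MAX_INFLIGHT=2, two calls to try_admit succeed, a third fails until release is called, and with MAX_INFLIGHT=0 every call succeeds.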

For clusters with 1000 to 3000 nodes, the recommended values are:

--max-requests-inflight=1500
--max-mutating-requests-inflight=500

For clusters with more than 3000 nodes, the recommended values are:

--max-requests-inflight=3000
--max-mutating-requests-inflight=1000
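
The sizing guidance above can be captured in a small helper. This is a minimal sketch, not part of kube-apiserver: the function name inflight_flags is hypothetical, and for clusters below 1000 nodes, which the guidance does not cover, it assumes the defaults from the flag help text (400 and 200) are kept.

```shell
#!/bin/sh
# inflight_flags NODE_COUNT
# Prints the in-flight limit flags suggested above for a given node count.
# Assumption: below 1000 nodes the documented defaults (400/200) are kept.
inflight_flags() {
    nodes="$1"
    case "$nodes" in
        ''|*[!0-9]*)
            echo "usage: inflight_flags NODE_COUNT" >&2
            return 1
            ;;
    esac
    if [ "$nodes" -gt 3000 ]; then
        reads=3000
        writes=1000
    elif [ "$nodes" -ge 1000 ]; then
        reads=1500
        writes=500
    else
        reads=400
        writes=200
    fi
    echo "--max-requests-inflight=$reads --max-mutating-requests-inflight=$writes"
}
```

For example, inflight_flags 2500 prints --max-requests-inflight=1500 --max-mutating-requests-inflight=500. The node count could be fed in from something like kubectl get nodes --no-headers | wc -l, assuming kubectl is configured for the cluster.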

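When the cluster has a very large number of nodes and pods, the watch cache can also be enlarged somewhat:

--watch-cache-sizes: increases the watch cache size for individual resources; the default is 100. Entries take the form resource#size, use the plural resource name, and are separated by commas without spaces, for example: --watch-cache-sizes=nodes#1000,pods#5000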
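
The value of this flag is easy to get subtly wrong: it is a comma-separated list of resource#size entries with no spaces, where the resource is the lowercase plural name (with a .group suffix for resources outside the core API). The sketch below assembles the flag from individual entries and rejects malformed ones. The function name watch_cache_flag is hypothetical, and the check is purely syntactic; it does not verify that the resource exists in the cluster.

```shell
#!/bin/sh
# watch_cache_flag ENTRY...
# Joins resource#size entries into one --watch-cache-sizes flag, rejecting
# entries that contain spaces, lack a '#', or have a non-numeric size.
watch_cache_flag() {
    joined=""
    for entry in "$@"; do
        case "$entry" in
            *'#'*) ;;
            *) echo "missing '#': $entry" >&2; return 1 ;;
        esac
        resource="${entry%%#*}"   # text before the first '#'
        size="${entry#*#}"        # text after the first '#'
        case "$resource" in
            ''|*[!a-z0-9.-]*) echo "bad resource name: $entry" >&2; return 1 ;;
        esac
        case "$size" in
            ''|*[!0-9]*) echo "bad size: $entry" >&2; return 1 ;;
        esac
        joined="${joined:+$joined,}$entry"
    done
    [ -n "$joined" ] || return 1
    echo "--watch-cache-sizes=$joined"
}
```

For example, watch_cache_flag 'nodes#1000' 'pods#5000' prints --watch-cache-sizes=nodes#1000,pods#5000, while an entry such as 'node #1000' is rejected because of the space.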

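2. Load balancing the apiserver
Option 1: run multiple kube-apiserver instances behind an external load balancer (LB).
Option 2: set --apiserver-count and --endpoint-reconciler-type so that multiple kube-apiserver instances are added to the endpoints of the kubernetes Service, which provides high availability.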
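
For the second option, the flags might look like the fragment below. It is a sketch under assumptions rather than a recommended configuration: three instances are assumed, and master-count is chosen as the reconciler type because that mode is the one that relies on --apiserver-count. The accepted reconciler types and the default differ between Kubernetes versions, so check kube-apiserver --help for the release in use.

```shell
# Passed to every kube-apiserver instance (all other flags omitted).
# Assumption: 3 instances; set the count to the real number of replicas.
--apiserver-count=3
--endpoint-reconciler-type=master-count
```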

3. Performance analysis with pprof

pprof is one of Go's most powerful tools; source-level performance analysis requires it.

// Install the required package
$ brew install graphviz

// Start pprof
$ go tool pprof http://localhost:8001/debug/pprof/profile
File: kube-apiserver
Type: cpu
Time: Oct 11, 2019 at 11:39am (CST)
Duration: 30s, Total samples = 620ms ( 2.07%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) web // the web command generates an SVG file

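The graph and the interactive interface show CPU time, goroutine blocking, and similar information. Because the apiserver handles a large number of objects, serialization consumes a very large share of the time.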
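
Besides the CPU profile used in the session above, the same /debug/pprof path prefix serves other standard Go profiles such as heap and goroutine. The helper below is a small sketch that builds the URL for each kind. The name pprof_url and the PPROF_BASE variable are hypothetical, and it assumes the apiserver's debug endpoints are reachable at http://localhost:8001 as in the session above (for example through kubectl proxy, which is an assumption; the original does not say how the port was exposed).

```shell
#!/bin/sh
# pprof_url KIND [SECONDS]
# Prints the profile URL for KIND (profile | heap | goroutine).
# SECONDS is the sampling window and only applies to the CPU profile.
pprof_url() {
    kind="$1"
    seconds="${2:-30}"
    base="${PPROF_BASE:-http://localhost:8001}"   # assumed address
    case "$kind" in
        profile)
            echo "$base/debug/pprof/profile?seconds=$seconds"
            ;;
        heap|goroutine)
            echo "$base/debug/pprof/$kind"
            ;;
        *)
            echo "unknown profile kind: $kind" >&2
            return 1
            ;;
    esac
}
```

A non-interactive run could then look like go tool pprof -svg -output cpu.svg "$(pprof_url profile 30)", which writes the graph to a file instead of opening the interactive prompt.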
