Fixing the "Metrics API not available" Error from kubectl top on a Kubernetes Cluster

Error message

$ kubectl top node
W0129 20:33:30.417876   22198 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
error: Metrics API not available

As shown above, the error is "Metrics API not available". It occurs because the metrics-server component is not installed in this Kubernetes environment.
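
Before installing anything, it can help to confirm that the Metrics API really is missing. The following is a minimal sketch, assuming the API would be registered under the APIService name v1beta1.metrics.k8s.io (the name that appears in the apply output later in this article). The function name check_metrics_api is hypothetical and only used for illustration.

```shell
# Hypothetical helper: report whether the Metrics API is registered with the API server.
check_metrics_api() {
  # `kubectl get apiservice NAME` exits non-zero when the APIService does not exist.
  if kubectl get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1; then
    echo "Metrics API is registered"
  else
    echo "Metrics API is not registered; metrics-server is probably not installed"
    return 1
  fi
}
# Usage: check_metrics_api
```

A registered APIService does not guarantee that metrics-server is healthy, so this check only rules out the "not installed" case.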

Installing metrics-server

Official documentation: https://github.com/kubernetes-sigs/metrics-server

Installation requirements: https://github.com/kubernetes-sigs/metrics-server#requirements

# Download the YAML manifest
$ wget -O metrics-server.yaml https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.0/components.yaml
# Change the image address in metrics-server.yaml; the image in the official manifest is hosted overseas
$ vim metrics-server.yaml
---
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.0

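Note: the kubelet certificate must be signed by the cluster certificate authority. If it is not, disable certificate verification by passing the --kubelet-insecure-tls startup flag to Metrics Server.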
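
The manual edit in vim can also be scripted. The sketch below is one possible approach, not part of the official instructions: it assumes the downloaded manifest has an `image:` line ending in `/metrics-server:<tag>` and a `- --metric-resolution=...` argument, as in the snippet above. The function name patch_manifest is hypothetical.

```shell
# Hypothetical helper: point the metrics-server image at the mirror used in this
# article and add --kubelet-insecure-tls if the manifest does not already have it.
patch_manifest() {
  manifest="$1"
  mirror="registry.cn-hangzhou.aliyuncs.com/google_containers"

  # Only add the flag when it is missing, so the function is safe to run twice.
  if grep -q -- '--kubelet-insecure-tls' "$manifest"; then add=0; else add=1; fi

  awk -v mirror="$mirror" -v add="$add" '
    # After the --metric-resolution argument, append the flag with the same indentation.
    /^[ \t]*- --metric-resolution=/ {
      print
      if (add == 1) {
        line = $0
        sub(/- --metric-resolution=.*/, "- --kubelet-insecure-tls", line)
        print line
      }
      next
    }
    # Rewrite the image registry and keep the original tag.
    /^[ \t]*image:.*\/metrics-server:/ {
      sub(/image:.*\/metrics-server:/, "image: " mirror "/metrics-server:")
      print
      next
    }
    { print }
  ' "$manifest" > "$manifest.tmp" && mv "$manifest.tmp" "$manifest"
}
# Usage: patch_manifest metrics-server.yaml
```

Reviewing the result with `diff` against the downloaded file before applying it is a reasonable precaution, since --kubelet-insecure-tls turns off kubelet certificate verification.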

$ kubectl apply -f metrics-server.yaml 
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

$ kubectl get pod -n kube-system | grep metrics
metrics-server-694b645fc9-w5f98                1/1     Running   0          2m11s

# View the metrics-server pod logs
$ kubectl logs -f $(kubectl get pod -n kube-system | grep metrics | awk '{print $1}') -n kube-system
I0129 13:22:27.579559       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0129 13:22:28.357723       1 secure_serving.go:266] Serving securely on [::]:4443
I0129 13:22:28.358219       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0129 13:22:28.358255       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0129 13:22:28.358540       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
W0129 13:22:28.359040       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0129 13:22:28.359351       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0129 13:22:28.359990       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0129 13:22:28.360029       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0129 13:22:28.360417       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0129 13:22:28.360453       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0129 13:22:28.459221       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0129 13:22:28.460822       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0129 13:22:28.460828       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
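
As an alternative to looking up the pod name with grep and awk, kubectl can address the Deployment directly. This is a minimal sketch, assuming the Deployment is named metrics-server in kube-system (as the apply output above shows). The function name wait_for_metrics_server and the 120s timeout are arbitrary choices for illustration.

```shell
# Hypothetical helper: wait for metrics-server to become ready, then print recent logs.
wait_for_metrics_server() {
  ns="kube-system"
  # Blocks until the Deployment rollout finishes or the timeout expires.
  kubectl -n "$ns" rollout status deployment/metrics-server --timeout=120s || return 1
  # `logs deployment/NAME` lets kubectl pick a pod, so no pod-name lookup is needed.
  kubectl -n "$ns" logs deployment/metrics-server --tail=20
}
# Usage: wait_for_metrics_server
```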

Verifying the top output

$ kubectl top node
W0129 21:27:18.391044   19223 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
master   349m         21%    1427Mi          102%      
node1    126m         7%     981Mi           70%       
node2    127m         21%    772Mi           55% 
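
The node table can also be filtered in a script, for example to spot nodes under memory pressure such as master above. The following sketch assumes the column layout shown in the output (MEMORY% in the fifth column) and reads the table from standard input. The function name nodes_over_memory and the default threshold of 90 are hypothetical.

```shell
# Hypothetical helper: read `kubectl top node --no-headers` output on stdin and
# print the nodes whose MEMORY% (5th column) is at or above the given threshold.
nodes_over_memory() {
  threshold="${1:-90}"
  awk -v limit="$threshold" '
    {
      pct = $5
      sub(/%$/, "", pct)   # strip the trailing percent sign
      if (pct + 0 >= limit + 0) print $1, $5
    }
  '
}
# Usage: kubectl top node --no-headers | nodes_over_memory 90
```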

$ kubectl top pod -n kube-system --sort-by=cpu # Show usage for all pods in the kube-system namespace, sorted by CPU usage in descending order
W0129 21:27:50.043323   20459 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                                           CPU(cores)   MEMORY(bytes)   
kube-apiserver-master                          83m          637Mi           
calico-node-vzttj                              35m          115Mi           
calico-node-9r9n7                              28m          119Mi           
calico-node-nt7zm                              28m          119Mi           
kube-controller-manager-master                 20m          82Mi            
kube-proxy-k625g                               18m          25Mi            
kube-proxy-cplkm                               9m           35Mi            
kube-proxy-xkm7h                               5m           20Mi            
nodelocaldns-vwkv4                             4m           11Mi            
calico-kube-controllers-75ddb95444-jzl54       3m           17Mi            
coredns-5495dd7c88-st4wl                       3m           10Mi            
kube-scheduler-master                          3m           31Mi            
metrics-server-694b645fc9-w5f98                3m           14Mi            
coredns-5495dd7c88-j6x94                       3m           10Mi            
nodelocaldns-fdxhv                             3m           24Mi            
nodelocaldns-78bw5                             2m           15Mi            
openebs-localpv-provisioner-6c9dcb5c54-wdqd2   2m           11Mi  
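
The same data that kubectl top displays can be read from the Metrics API directly, which is sometimes useful when debugging. This is a small sketch, assuming the API is served under /apis/metrics.k8s.io/v1beta1 (matching the APIService created earlier). The function name raw_metrics is hypothetical.

```shell
# Hypothetical helper: fetch raw JSON from the Metrics API.
# $1 is the resource to query: "nodes" (default) or "pods".
raw_metrics() {
  kubectl get --raw "/apis/metrics.k8s.io/v1beta1/${1:-nodes}"
}
# Usage: raw_metrics nodes
```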

http://www.qiqios.cn/2022/01/29/kubernetes-集群top查看资源报错解决方案/
Author: 一亩三分地
Published: January 29, 2022