k8s hpa can't get the cpu information


Solution 1

With Kubernetes 1.18 and Metrics Server v0.3.7, edit the metrics-server deployment so that the container has the following arguments:

args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  - --cert-dir=/tmp
  - --secure-port=4443
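
One way to add such arguments without hand-editing the manifest is a JSON patch. The following is only a sketch: it assumes the deployment is named metrics-server in kube-system, that the metrics-server container is the first one in the pod spec, and that it already has an args list. The helper name build_args_patch is hypothetical. Keep in mind that --kubelet-insecure-tls turns off verification of the kubelet serving certificate, so it fits test clusters better than production.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSON patch that appends each given flag
# to the existing args list of the first container in the pod template.
build_args_patch() {
  local sep="" flag
  printf '['
  for flag in "$@"; do
    printf '%s{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}' "$sep" "$flag"
    sep=","
  done
  printf ']\n'
}

# Usage (needs cluster access, so it is left commented out):
# kubectl -n kube-system patch deployment metrics-server --type=json \
#   -p "$(build_args_patch --kubelet-insecure-tls --kubelet-preferred-address-types=InternalIP)"
# kubectl -n kube-system rollout status deployment/metrics-server
```

If the container has no args list yet, the append path does not exist and the patch would be rejected; in that case editing the deployment directly is simpler.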

Solution 2

Thank you, weibeld and EAT_Py. I have resolved this problem. The debugging process was:

sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
sudo kubectl -n kube-system logs metrics-server-795b774c76-t2rj7
sudo kubectl top node nandoc-94   # could not get metrics
sudo kubectl top pod k8s-pod-e7-build-32-7bb5bc7c6-s2zsr   # could not get metrics

The metrics-server logs contained this error:

kubelet_summary:nandoc-93: unable to fetch metrics from Kubelet nandoc-93 (nandoc-93): Get https://nandoc-93:10250/stats/summary?only_cpu_and_memory=true: x509: certificate signed by unknown authority]
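
To see which nodes are affected by this certificate error, the log can be filtered. This is a rough sketch that assumes the log lines keep the format shown above; the helper name nodes_with_x509_error is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read metrics-server log lines on stdin and print the
# unique names of nodes whose kubelet certificate was rejected.
nodes_with_x509_error() {
  grep 'x509: certificate signed by unknown authority' \
    | sed -n 's/.*unable to fetch metrics from Kubelet \([^ ]*\) .*/\1/p' \
    | sort -u
}

# Usage (needs cluster access, so it is left commented out):
# kubectl -n kube-system logs deployment/metrics-server | nodes_with_x509_error
```

An empty result suggests the missing metrics have a different cause than kubelet TLS verification.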

Then, following https://github.com/kubernetes-sigs/metrics-server/issues/146, I edited metrics-server/deploy/1.8+/metrics-server-deployment.yaml and added a command with the --kubelet-insecure-tls flag:

  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.6
    command:
    - /metrics-server
    - --kubelet-insecure-tls

kubectl apply -f metrics-server-deployment.yaml

After that, kubectl top pod works, and the HPA works now. Thank you again.

sudo kubectl get hpa --all-namespaces
NAMESPACE   NAME              REFERENCE                        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
k8s-demo    hpa-e7-build-32   Deployment/k8s-pod-e7-build-32   0%/10%    1         2         1          19h
k8s-demo    hpa-e7-build-64   Deployment/k8s-pod-e7-build-64   0%/10%    1         2         1          19h
Author by

clara

Updated on August 17, 2022

Comments

  • clara
    clara over 1 year

    I created an HPA with this command:

    sudo kubectl autoscale deployment e7-build-64 --cpu-percent=50 --min=1 --max=2 -n k8s-demo
    

    sudo kubectl get hpa -n k8s-demo

    NAME              REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
    e7-build-64       Deployment/e7-build-64   <unknown>/50%   1         2         1          15m
    

    sudo kubectl describe hpa e7-build-64 -n k8s-demo

    Name:                                                  e7-build-64
    Namespace:                                             k8s-demo
    Labels:                                                <none>
    Annotations:                                           <none>
    CreationTimestamp:                                     Tue, 10 Dec 2019 15:34:24 +0800
    Reference:                                             Deployment/e7-build-64
    Metrics:                                               ( current / target )
      resource cpu on pods  (as a percentage of request):  <unknown> / 50%
    Min replicas:                                          1
    Max replicas:                                          2
    Deployment pods:                                       1 current / 0 desired
    Conditions:
      Type           Status  Reason                   Message
      ----           ------  ------                   -------
      AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
      ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
    Events:
      Type     Reason                        Age                 From                       Message
      ----     ------                        ----                ----                       -------
      Warning  FailedComputeMetricsReplicas  13m (x12 over 16m)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
      Warning  FailedGetResourceMetric       74s (x61 over 16m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
    

    In deployment.yaml I had already added the resource requests and limits:

    resources:
      limits:
        memory: "16Gi"
        cpu: "4000m"
      requests: 
        memory: "4Gi"
        cpu: "2000m"
    

    kubectl version

    Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
    

    Then I tried to define the HPA with a YAML manifest:

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: hpa-e7-build-64
      namespace: k8s-demo
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: e7-build-64
      minReplicas: 1
      maxReplicas: 2
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 10
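
For reference, CPU utilization in an HPA is measured as a percentage of the container's CPU request, so with the 2000m request shown earlier a 10% target corresponds to 200m of average usage per pod. The sketch below works through that arithmetic and the documented replica formula, ceil(currentReplicas * currentUtilization / targetUtilization). It deliberately ignores the controller's tolerance and stabilization behaviour, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# CPU usage in millicores that corresponds to a utilization target.
# args: request_millicores target_percent
target_millicores() {
  echo $(( $1 * $2 / 100 ))
}

# Desired replicas: ceil(current * currentPercent / targetPercent),
# clamped to the [min, max] range of the HPA.
# args: current_replicas current_percent target_percent min max
desired_replicas() {
  local n=$(( ($1 * $2 + $3 - 1) / $3 ))
  [ "$n" -lt "$4" ] && n=$4
  [ "$n" -gt "$5" ] && n=$5
  echo "$n"
}

target_millicores 2000 10     # request 2000m, target 10%
desired_replicas 1 25 10 1 2  # 1 replica at 25% against a 10% target, max 2
```

None of this can run while the current value is <unknown>: without metrics the controller has nothing to put into the formula, which is why ScalingActive is False.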
    

    It still reports errors:

    sudo kubectl describe hpa hpa-e7-build-64 -n k8s-demo

    Name:                                                  hpa-e7-build-64
    Namespace:                                             k8s-demo
    Labels:                                                <none>
    Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                             {"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-e7-build-64","namespace":"k8...
    CreationTimestamp:                                     Tue, 10 Dec 2019 14:24:07 +0800
    Reference:                                             Deployment/e7-build-64
    Metrics:                                               ( current / target )
      resource cpu on pods  (as a percentage of request):  <unknown> / 10%
    Min replicas:                                          1
    Max replicas:                                          2
    Deployment pods:                                       1 current / 0 desired
    Conditions:
      Type           Status  Reason                   Message
      ----           ------  ------                   -------
      AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
      ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
    Events:
      Type     Reason                        Age                    From                       Message
      ----     ------                        ----                   ----                       -------
      Warning  FailedGetResourceMetric       59m (x141 over 94m)    horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
      Warning  FailedGetResourceMetric       54m (x2 over 54m)      horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
      Warning  FailedComputeMetricsReplicas  39m (x58 over 53m)     horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
      Warning  FailedGetResourceMetric       4m29s (x197 over 53m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
    

    I have also executed the following commands to install metrics-server:

    git clone https://github.com/kubernetes-incubator/metrics-server.git
    cd metrics-server/deploy
    sudo kubectl create -f 1.8+/
    

    Does anyone know how to resolve it?

    UPDATE:

    sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
    {"kind":"PodMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"},"items":[]}
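
An empty items list like this means the metrics API is registered and answering, but metrics-server has not collected any pod metrics yet. A small check can make that distinction explicit when scripting. This is a sketch only: the helper name pod_metrics_empty is hypothetical, and it matches the literal text of the response rather than parsing the JSON.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the PodMetricsList read from
# stdin contains an empty "items" array, fail otherwise.
pod_metrics_empty() {
  tr -d '[:space:]' | grep -qF '"items":[]'
}

# Usage (needs cluster access, so it is left commented out):
# if kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods" | pod_metrics_empty; then
#   echo "metrics API reachable, but no pod metrics collected; check the metrics-server logs"
# fi
```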
    

    And the metrics-server pod information:

    sudo kubectl describe pod metrics-server-795b774c76-fs8hw -n kube-system
    Name:         metrics-server-795b774c76-fs8hw
    Namespace:    kube-system
    Priority:     0
    Node:         nandoc-95/192.168.33.225
    Start Time:   Tue, 10 Dec 2019 15:04:14 +0800
    Labels:       k8s-app=metrics-server
                  pod-template-hash=795b774c76
    Annotations:  cni.projectcalico.org/podIP: 10.0.229.135/32
    Status:       Running
    IP:           10.0.229.135
    IPs:
      IP:           10.0.229.135
    Controlled By:  ReplicaSet/metrics-server-795b774c76
    Containers:
      metrics-server:
        Container ID:  docker://2c6dd8c50938bc9ab536c78b73773aa7a9eedd60a6974805beec58e8ee9fde3c
        Image:         k8s.gcr.io/metrics-server-amd64:v0.3.6
        Image ID:      docker-pullable://k8s.gcr.io/metrics-server-amd64@sha256:c9c4e95068b51d6b33a9dccc61875df07dc650abbf4ac1a19d58b4628f89288b
        Port:          4443/TCP
        Host Port:     0/TCP
        Args:
          --cert-dir=/tmp
          --secure-port=4443
        State:          Running
          Started:      Tue, 10 Dec 2019 15:05:13 +0800
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /tmp from tmp-dir (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-xjgpx (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             True 
      ContainersReady   True 
      PodScheduled      True 
    Volumes:
      tmp-dir:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:     
        SizeLimit:  <unset>
      metrics-server-token-xjgpx:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  metrics-server-token-xjgpx
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  beta.kubernetes.io/os=linux
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:          <none>
    

    sudo kubectl get pods --all-namespaces -o wide

    NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE    IP               NODE              NOMINATED NODE   READINESS GATES
    k8s-demo               k8s-pod-e7-build-32-7bb5bc7c6-s2zsr          1/1     Running   0          32m    10.0.100.198     nandoc-94         <none>           <none>
    k8s-demo               k8s-pod-e7-build-64-d5c659d6b-5hv6m          1/1     Running   0          31m    10.0.229.137     nandoc-95         <none>           <none>
    kube-system            calico-kube-controllers-55754f75c-82np8      1/1     Running   0          5d     10.0.126.1       nandoc-93         <none>           <none>
    kube-system            calico-node-2dxmp                            1/1     Running   0          2d5h   192.168.33.225   nandoc-95         <none>           <none>
    kube-system            calico-node-7ms8t                            1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            calico-node-hdw25                            1/1     Running   1          21d    192.168.33.224   nandoc-94         <none>           <none>
    kube-system            calico-node-j4jv4                            0/1     Running   0          27d    192.168.37.173   cyuan-k8s-node1   <none>           <none>
    kube-system            calicoctl                                    1/1     Running   0          6d     192.168.33.224   nandoc-94         <none>           <none>
    kube-system            coredns-5644d7b6d9-n9z5m                     1/1     Running   0          5d     10.0.126.2       nandoc-93         <none>           <none>
    kube-system            coredns-5644d7b6d9-txcm4                     1/1     Running   0          5d     10.0.100.194     nandoc-94         <none>           <none>
    kube-system            etcd-nandoc-93                               1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            kube-apiserver-nandoc-93                     1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            kube-controller-manager-nandoc-93            1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            kube-proxy-5jlfc                             1/1     Running   0          27d    192.168.37.173   cyuan-k8s-node1   <none>           <none>
    kube-system            kube-proxy-7t7b7                             1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            kube-proxy-j5b4c                             1/1     Running   0          2d5h   192.168.33.225   nandoc-95         <none>           <none>
    kube-system            kube-proxy-jj256                             1/1     Running   1          21d    192.168.33.224   nandoc-94         <none>           <none>
    kube-system            kube-scheduler-nandoc-93                     1/1     Running   0          28d    192.168.33.223   nandoc-93         <none>           <none>
    kube-system            metrics-server-795b774c76-fs8hw              1/1     Running   0          24h    10.0.229.135     nandoc-95         <none>           <none>
    kubernetes-dashboard   dashboard-metrics-scraper-76585494d8-wqgks   1/1     Running   0          5d     10.0.126.3       nandoc-93         <none>           <none>
    kubernetes-dashboard   kubernetes-dashboard-b65488c4-qh95m          1/1     Running   0          5d     10.0.126.4       nandoc-93         <none>           <none>
    

    sudo kubectl get hpa --all-namespaces -o wide

    NAMESPACE   NAME                  REFERENCE                        TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
    k8s-demo    hpa-e7-build-32       Deployment/k8s-pod-e7-build-32   <unknown>/10%   1         2         1          85s
    k8s-demo    hpa-e7-build-64       Deployment/k8s-pod-e7-build-64   <unknown>/10%   1         2         1          79s
    k8s-demo    k8s-pod-e7-build-64   Deployment/k8s-pod-e7-build-64   <unknown>/50%   1         2         1          16s
    

    Today I renamed the pods (adding the prefix k8s-pod-) and recreated the HPAs, so the output is different from before.