Kubernetes Service Monitoring

1. Prometheus

1.1 Introduction

  • Prometheus is an open-source monitoring/alerting system and time-series database originally developed at SoundCloud. It is written in Go and is an open-source counterpart of Google's BorgMon monitoring system. In 2016 it was accepted into the Cloud Native Computing Foundation (founded under the Linux Foundation with Google's involvement) as the foundation's second hosted project. Prometheus remains very active in the open-source community.
  • Compared with Heapster (a K8S sub-project for collecting cluster performance data), Prometheus is more complete and comprehensive, and its performance is sufficient for clusters of tens of thousands of nodes.
  • Components:
    • MetricsServer: an aggregator of K8S cluster resource usage; it collects data for consumers inside the cluster such as kubectl, HPA and the scheduler.
    • Prometheus Operator: a system monitoring and alerting toolkit that manages the Prometheus components which store the monitoring data.
    • NodeExporter: exposes the key metrics and status data of each node.
    • KubeStateMetrics: collects data about the resource objects in the K8S cluster, against which alerting rules are defined.
    • Prometheus: pulls metrics from the apiserver, scheduler, controller-manager, kubelet and other components over HTTP.
    • Grafana: a platform for visualizing statistics and monitoring data.
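  • Because Prometheus pulls metrics over HTTP, each component simply exposes a /metrics endpoint that can be inspected by hand. A minimal sketch, assuming node-exporter is already running on a node at 192.168.200.21 (placeholder address; 9100 is node-exporter's default port):

    # fetch the plain-text metrics that Prometheus would scrape from this node
    curl -s http://192.168.200.21:9100/metrics | head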

1.2 Installation

[root@master Prometheus]# git clone https://github.com/coreos/kube-prometheus.git
[root@master Prometheus]# cd kube-prometheus/manifests/
# Change the Service type in grafana-service.yaml, prometheus-service.yaml and alertmanager-service.yaml to NodePort and set nodePort to 30100, 30200 and 30300 respectively
[root@master manifests]# vim grafana-service.yaml
[root@master manifests]# cat grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30100 # added
  selector:
    app: grafana
[root@master manifests]# vim prometheus-service.yaml
[root@master manifests]# cat prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30200 # added
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
[root@master manifests]# vim alertmanager-service.yaml
[root@master manifests]# cat alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300 # added
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
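# Alternative (a sketch): if the stack is already deployed, the same NodePort change can be made
# without editing the manifests, using kubectl patch with the service names shown above:
kubectl -n monitoring patch svc grafana -p '{"spec":{"type":"NodePort","ports":[{"name":"http","port":3000,"targetPort":"http","nodePort":30100}]}}'
kubectl -n monitoring patch svc prometheus-k8s -p '{"spec":{"type":"NodePort","ports":[{"name":"web","port":9090,"targetPort":"web","nodePort":30200}]}}'
kubectl -n monitoring patch svc alertmanager-main -p '{"spec":{"type":"NodePort","ports":[{"name":"web","port":9093,"targetPort":"web","nodePort":30300}]}}'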

# First download all the docker images that are needed, then bulk-import them onto the worker nodes with a script (node1 is shown here; repeat the same steps on node2)
# Required images and versions:
# REPOSITORY TAG
# quay.io/coreos/kube-state-metrics v1.7.1
# quay.io/prometheus/prometheus v2.11.0
# quay.io/prometheus/alertmanager v0.18.0
# quay.io/coreos/prometheus-config-reloader v0.31.1
# quay.io/coreos/prometheus-operator v0.31.1
# grafana/grafana 6.2.2
# quay.io/prometheus/node-exporter v0.18.1
# quay.io/coreos/kube-rbac-proxy v0.4.1
# k8s.gcr.io/kubernetes-dashboard-amd64 v1.10.1
# quay.io/coreos/k8s-prometheus-adapter-amd64 v0.4.1
# k8s.gcr.io/addon-resizer 1.8.4
# quay.io/coreos/configmap-reload v0.0.1
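# Sketch: the image tarballs used below are typically produced on a machine with registry access
# using docker pull / docker save, then copied to each worker node; for example, for one image:
#   docker pull quay.io/prometheus/prometheus:v2.11.0
#   docker save -o prometheus.tar quay.io/prometheus/prometheus:v2.11.0
#   scp prometheus.tar root@node1:/root/Prometheus/dockertar/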
[root@node1 dockertar]# pwd
/root/Prometheus/dockertar
[root@node1 dockertar]# ll
total 683880
-rw-r--r-- 1 root root  38580224 Oct 24 16:44 addon-resizer.tar
-rw-r--r-- 1 root root  53299712 Oct 24 16:45 alertmanager.tar
-rw-r--r-- 1 root root   4800000 Oct 24 16:46 configmap-reload.tar
-rw-r--r-- 1 root root 253934080 Oct 24 16:48 grafana.tar
-rw-r--r-- 1 root root  60977152 Oct 24 16:59 k8s-prometheus-adapter-amd64.tar
-rw-r--r-- 1 root root  41879040 Oct 24 16:50 kube-rbac-proxy.tar
-rw-r--r-- 1 root root  34397184 Oct 24 16:51 kube-state-metrics.tar
-rw-r--r-- 1 root root  24351232 Oct 24 16:52 node-exporter.tar
-rw-r--r-- 1 root root  19025920 Oct 24 16:53 prometheus-config-reloader.tar
-rw-r--r-- 1 root root  41896448 Oct 24 16:54 prometheus-operator.tar
-rw-r--r-- 1 root root 127138816 Oct 24 16:55 prometheus.tar
[root@node1 Prometheus]# cat load.sh
#!/bin/bash
# Record the tarball names, then load each image into the local docker daemon
ls /root/Prometheus/dockertar > /root/Prometheus/images.txt
cd /root/Prometheus/dockertar
for i in $( cat /root/Prometheus/images.txt )
do
    docker load -i "$i"
done
[root@node1 Prometheus]# bash ./load.sh

# Back on the master node, deploy Prometheus. You may need to apply the manifests several times, because the pods being started depend on each other (the CRDs must be registered before the custom resources can be created)
[root@master manifests]# kubectl apply -f ../manifests/
[root@master manifests]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 102s
alertmanager-main-1 2/2 Running 0 102s
alertmanager-main-2 2/2 Running 0 102s
grafana-57bfdd47f8-6d5rt 1/1 Running 0 3m3s
kube-state-metrics-65d5b4b99d-q8cgj 4/4 Running 0 97s
node-exporter-d49lb 2/2 Running 0 3m3s
node-exporter-fdnkw 2/2 Running 0 3m3s
node-exporter-fr64w 2/2 Running 0 3m3s
prometheus-adapter-668748ddbd-q7lj7 1/1 Running 0 3m2s
prometheus-k8s-0 3/3 Running 1 101s
prometheus-k8s-1 3/3 Running 1 101s
prometheus-operator-8c7f75674-9ctqt 2/2 Running 0 3m6s
[root@master manifests]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
master 269m 13% 1150Mi 61%
node1 176m 8% 1125Mi 60%
node2 202m 10% 1299Mi 69%
[root@master manifests]# kubectl top pod
NAME CPU(cores) MEMORY(bytes)
kubernetes-dashboard-77f54dc48f-6fnxz 0m 18Mi
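# Optional check (a sketch): confirm the NodePort services are in place before opening the web UIs
kubectl get svc -n monitoring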
  • Visit [Prometheus Time Series Collection and Processing Server](http://192.168.200.20:30200/graph) to open the Prometheus web UI.

  • Besides the information reachable by clicking around, the Prometheus web UI supports ad-hoc queries; for example, the CPU usage of every Pod in the K8S cluster can be queried with:

    sum by (pod_name)( rate(container_cpu_usage_seconds_total{image!="", pod_name!=""}[1m] ) )
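    In the same style, per-Pod memory usage can be queried with an expression such as the one below (a sketch; it assumes the same cAdvisor pod_name label used in the CPU query above):

    sum by (pod_name)( container_memory_working_set_bytes{image!="", pod_name!=""} )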
  • Visit Grafana at [Grafana](http://192.168.200.20:30100/login) and log in with the initial username and password, both admin.

2. Horizontal Pod Autoscaling

  • Horizontal Pod Autoscaling automatically scales the number of Pods in a Replication Controller, Deployment or Replica Set based on CPU utilization.

    • ① Load the HPA example image on every node.

      [root@master ~]# docker load -i hpa-example.tar
      [root@node1 ~]# docker load -i hpa-example.tar
      [root@node2 ~]# docker load -i hpa-example.tar
      [root@node1 ~]# docker images
      REPOSITORY TAG IMAGE ID CREATED SIZE
      gcr.io/google_containers/hpa-example latest 4ca4c13a6d7c 6 years ago 481MB
    • ② Create a pod named php-apache.

      [root@master ~]# kubectl run php-apache --image=gcr.io/google_containers/hpa-example --requests=cpu=200m --expose --port=80
      kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
      service/php-apache created
      deployment.apps/php-apache created
      # Change the deployment's image pull policy to Never, i.e. imagePullPolicy: Never
      [root@master ~]# kubectl edit deployment php-apache
      deployment.extensions/php-apache edited
      [root@master ~]# kubectl get pod
      NAME READY STATUS RESTARTS AGE
      php-apache-f44dcdb46-lp2q2 1/1 Running 0 67s
      [root@master ~]# kubectl get deployment
      NAME READY UP-TO-DATE AVAILABLE AGE
      php-apache 1/1 1 1 3m49s
    • ③ Create the HPA controller (a declarative manifest equivalent is sketched after the output below).

      # Create new pods when CPU utilization exceeds 50%, up to a maximum of 10 pods; when utilization drops below 50%, scale back down again, to no fewer than 1 pod.
      [root@master ~]# kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
      horizontalpodautoscaler.autoscaling/php-apache autoscaled
      [root@master ~]# kubectl get hpa
      NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
      php-apache Deployment/php-apache 0%/50% 1 10 1 37s
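      The same autoscaler can also be written declaratively; a minimal sketch equivalent to the kubectl autoscale command above (autoscaling/v1 API, name matching the deployment):

      apiVersion: autoscaling/v1
      kind: HorizontalPodAutoscaler
      metadata:
        name: php-apache
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: php-apache
        minReplicas: 1
        maxReplicas: 10
        targetCPUUtilizationPercentage: 50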
    • ④ Increase the load and watch the number of replicas.

      [root@master ~]# kubectl run -i --tty load-generator --image=busybox /bin/sh
      kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
      If you don't see a command prompt, try pressing enter.
      / # while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
      [root@master ~]# kubectl get hpa
      NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
      php-apache Deployment/php-apache 172%/50% 1 10 4 3m10s
      [root@master ~]# kubectl get pod
      NAME READY STATUS RESTARTS AGE
      load-generator-7d549cd44-k7qt7 1/1 Running 0 2m1s
      php-apache-f44dcdb46-4bff5 1/1 Running 0 19s
      php-apache-f44dcdb46-58xxk 1/1 Running 0 49s
      php-apache-f44dcdb46-j7lb5 1/1 Running 0 49s
      php-apache-f44dcdb46-lp2q2 1/1 Running 0 7m18s
      php-apache-f44dcdb46-tkgls 1/1 Running 0 49s
      php-apache-f44dcdb46-wp6pz 1/1 Running 0 19s
      [root@master ~]# kubectl top pod
      NAME CPU(cores) MEMORY(bytes)
      php-apache-f44dcdb46-58xxk 198m 14Mi
      php-apache-f44dcdb46-lp2q2 204m 31Mi
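      After the load generator is stopped (exit its shell or interrupt the wget loop), CPU usage falls back below the target and the HPA scales the deployment down to its minimum of 1 replica once the downscale stabilization window has passed (several minutes by default). A quick way to watch this, as a sketch:

      kubectl get hpa php-apache -w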

3. Resource Limits

3.1 Pod

  • Kubernetes enforces resource limits through cgroups. A cgroup is a set of related attributes that control how the kernel runs a container's processes; there are corresponding cgroups for memory, CPU and various devices.

  • By default a Pod runs with no CPU or memory limits, which means any Pod in the system can consume as much CPU and memory as the node it runs on provides. Resource limits are usually applied to the pods of particular applications through the resources field: requests (the resources initially allocated to the pod) and limits (the maximum resources the pod may use).

    spec:
      containers:
      - image: xxxx
        imagePullPolicy: Always
        name: auth
        resources:
          limits:
            cpu: "4"
            memory: 2Gi
          requests:
            cpu: 250m
            memory: 250Mi
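  • For reference, a self-contained minimal Pod using the same pattern (the name resource-demo and the nginx image are placeholders; 250m means 0.25 of a CPU core):

    apiVersion: v1
    kind: Pod
    metadata:
      name: resource-demo        # hypothetical name
    spec:
      containers:
      - name: app
        image: nginx:1.17        # placeholder image
        resources:
          requests:              # guaranteed amount the scheduler uses for placement
            cpu: 250m
            memory: 250Mi
          limits:                # hard cap: exceeding memory gets the container OOM-killed; CPU is throttled
            cpu: "1"
            memory: 512Mi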

3.2 Namespaces

  • Compute resource quota (applying and inspecting it is sketched after the manifest):

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-resources
    spec:
      hard:
        pods: "20"
        requests.cpu: "20"
        requests.memory: 100Gi
        limits.cpu: "40"
        limits.memory: 200Gi
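    Quotas are namespaced objects; a sketch of applying and inspecting the one above (the namespace myspace and the file name compute-resources.yaml are placeholders):

    kubectl create namespace myspace
    kubectl apply -f compute-resources.yaml -n myspace
    kubectl describe resourcequota compute-resources -n myspace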
  • Object count quota:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: object-counts
    spec:
      hard:
        configmaps: "10"
        persistentvolumeclaims: "4"
        replicationcontrollers: "20"
        secrets: "10"
        services: "10"
        services.loadbalancers: "2"
  • CPU and memory LimitRange, to avoid OOM (a verification sketch follows the manifest):

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: mem-limit-range
    spec:
      limits:
      - default: # the default limit value
          memory: 50Gi
          cpu: 5
        defaultRequest: # the default request value
          memory: 1Gi
          cpu: 1
        type: Container
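    Containers created in a namespace with this LimitRange that do not declare their own resources automatically receive these defaults; a sketch for applying and checking it (the namespace myspace and file name mem-limit-range.yaml are placeholders):

    kubectl apply -f mem-limit-range.yaml -n myspace
    kubectl describe limitrange mem-limit-range -n myspace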