{"id":46,"date":"2024-05-14T10:25:47","date_gmt":"2024-05-14T02:25:47","guid":{"rendered":"https:\/\/www.sretalk.com\/?p=46"},"modified":"2024-08-11T14:55:28","modified_gmt":"2024-08-11T06:55:28","slug":"%e5%90%af%e7%94%a8-prometheus-%e7%9b%91%e6%8e%a7%e7%b1%bb","status":"publish","type":"post","link":"https:\/\/www.sretalk.com\/?p=46","title":{"rendered":"Enabling the Prometheus Monitoring Class"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">1 Overview<\/h2>\n\n\n\n<p>Since version <code>3.6.0<\/code>, ZooKeeper has shipped with a built-in Prometheus client. Once configured, it exposes metrics on the <code>\/metrics<\/code> endpoint for Prometheus to scrape.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><\/p>\n<cite>Notes: see the <a href=\"https:\/\/zookeeper.apache.org\/doc\/current\/zookeeperMonitor.html\" target=\"_blank\" rel=\"noreferrer noopener\">ZooKeeper Monitor Guide<\/a><\/cite><\/blockquote>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">2 Enabling<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">2.1 Back up the configuration file<\/h3>\n\n\n\n<pre class=\"wp-block-code language-bash\"><code># Note: in my environment the path is \/etc\/zookeeper\/zoo.cfg\ncp \/etc\/zookeeper\/zoo.cfg \/etc\/zookeeper\/zoo.cfg-$(date +'%s')<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2.2 Enable the PrometheusMetricsProvider class<\/h3>\n\n\n\n<pre class=\"wp-block-code language-bash\"><code># Edit zoo.cfg and add the following lines\nvim \/etc\/zookeeper\/zoo.cfg\n\n## Enable the Prometheus metrics provider class\nmetricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider\n## Port the metrics endpoint listens on\nmetricsProvider.httpPort=4888<\/code><\/pre>\n\n\n\n<blockquote 
class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">2.3 Restart the ZooKeeper service<\/h3>\n\n\n\n<pre class=\"wp-block-code language-bash\"><code># My ZooKeeper service is managed by systemd, so it can be restarted directly with systemctl\nsystemctl restart zookeeper.service\n\n# If the service was started with zkServer.sh, use:\n.\/zkServer.sh restart<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">3 Verification<\/h2>\n\n\n\n<p>Once the service has restarted without errors, request <code>http:\/\/&lt;IP address&gt;:4888\/metrics<\/code>; it should return the Prometheus metrics:<\/p>\n\n\n\n<pre class=\"wp-block-code language-bash\"><code>curl http:\/\/localhost:4888\/metrics\n\n\n# Sample response:\n# HELP learner_request_processor_queue_size learner_request_processor_queue_size\n# TYPE learner_request_processor_queue_size summary\nlearner_request_processor_queue_size{quantile=\"0.5\",} NaN\nlearner_request_processor_queue_size_count 0.0\nlearner_request_processor_queue_size_sum 0.0\n# HELP response_packet_cache_hits response_packet_cache_hits\n# TYPE response_packet_cache_hits counter\nresponse_packet_cache_hits 0.0\n# HELP read_commit_proc_req_queued read_commit_proc_req_queued\n# TYPE read_commit_proc_req_queued summary\nread_commit_proc_req_queued{quantile=\"0.5\",} NaN\nread_commit_proc_req_queued_count 0.0\nread_commit_proc_req_queued_sum 0.0\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">4 Integrating with Prometheus<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">4.1 Add a scrape target<\/h3>\n\n\n\n<p>If Prometheus is already deployed in your environment, 
ZooKeeper can be added as follows:<\/p>\n\n\n\n<pre class=\"wp-block-code language-bash\"><code># 1. Edit prometheus.yaml and add the following under scrape_configs:\nscrape_configs:\n  - job_name: \"zookeeper\"\n    static_configs:\n      - targets: &#91;\"10.0.0.1:4888\"]\n\n# 2. Hot-reload Prometheus\ncurl -XPOST http:\/\/localhost:9090\/-\/reload<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><\/p>\n<cite>Notes: <br>Hot reloading requires Prometheus to be started with the <code>--web.enable-lifecycle<\/code> flag<\/cite><\/blockquote>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4.2 Add alerting rules<\/h3>\n\n\n\n<p>The ZooKeeper project also provides example Prometheus alerting rules. To configure alerts, add a <code>zk.yaml<\/code> file to the Prometheus <code>rules<\/code> path:<\/p>\n\n\n\n<pre class=\"wp-block-code language-yaml\"><code>groups:\n- name: zk-alert-example\n  rules:\n  - alert: ZooKeeper server is down\n    expr:  up == 0\n    for: 1m\n    labels:\n      severity: critical\n    annotations:\n      summary: \"Instance {{ $labels.instance }} ZooKeeper server is down\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} ZooKeeper server is down: &#91;{{ $value }}].\"\n\n  - alert: create too many znodes\n    expr: znode_count &gt; 1000000\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} create too many znodes\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} create too many znodes: &#91;{{ $value }}].\"\n\n  - alert: create too many connections\n    expr: num_alive_connections &gt; 50 # suppose we use the 
default maxClientCnxns: 60\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} create too many connections\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} create too many connections: &#91;{{ $value }}].\"\n\n  - alert: znode total occupied memory is too big\n    expr: approximate_data_size \/1024 \/1024 &gt; 1 * 1024 # more than 1024 MB(1 GB)\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} znode total occupied memory is too big\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} znode total occupied memory is too big: &#91;{{ $value }}] MB.\"\n\n  - alert: set too many watch\n    expr: watch_count &gt; 10000\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} set too many watch\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} set too many watch: &#91;{{ $value }}].\"\n\n  - alert: a leader election happens\n    expr: increase(election_time_count&#91;5m]) &gt; 0\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} a leader election happens\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} a leader election happens: &#91;{{ $value }}].\"\n\n  - alert: open too many files\n    expr: open_file_descriptor_count &gt; 300\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} open too many files\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} open too many files: &#91;{{ $value }}].\"\n\n  - alert: fsync time is too long\n    expr: rate(fsynctime_sum&#91;1m]) &gt; 100\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} fsync time is too long\"\n      description: \"{{ 
$labels.instance }} of job {{$labels.job}} fsync time is too long: &#91;{{ $value }}].\"\n\n  - alert: take snapshot time is too long\n    expr: rate(snapshottime_sum&#91;5m]) &gt; 100\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} take snapshot time is too long\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} take snapshot time is too long: &#91;{{ $value }}].\"\n\n  - alert: avg latency is too high\n    expr: avg_latency &gt; 100\n    for: 1m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"Instance {{ $labels.instance }} avg latency is too high\"\n      description: \"{{ $labels.instance }} of job {{$labels.job}} avg latency is too high: &#91;{{ $value }}].\"\n\n  - alert: JvmMemoryFillingUp\n    expr: jvm_memory_bytes_used \/ jvm_memory_bytes_max{area=\"heap\"} &gt; 0.8\n    for: 5m\n    labels:\n      severity: warning\n    annotations:\n      summary: \"JVM memory filling up (instance {{ $labels.instance }})\"\n      description: \"JVM memory is filling up (&gt; 80%)\\n labels: {{ $labels }}  value = {{ $value }}\\n\"<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>Once the rules are loaded, they appear on the Prometheus Alerts page:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' href='https:\/\/sretalk-1258684427.cos.ap-shanghai.myqcloud.com\/2024\/05\/image-1.png'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  loading=\"lazy\" decoding=\"async\" width=\"3788\" height=\"900\" data-original=\"https:\/\/sretalk-1258684427.cos.ap-shanghai.myqcloud.com\/2024\/05\/image-1.png\" 
src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"\" class=\"wp-image-50\"  sizes=\"auto, (max-width: 3788px) 100vw, 3788px\" \/><\/div><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">5 Configuring the Grafana dashboard<\/h2>\n\n\n\n<p>Likewise, ZooKeeper provides a matching Grafana dashboard, <a href=\"https:\/\/grafana.com\/grafana\/dashboards\/10465-zookeeper-by-prometheus\/\" data-type=\"link\" data-id=\"https:\/\/grafana.com\/grafana\/dashboards\/10465-zookeeper-by-prometheus\/\" target=\"_blank\" rel=\"noreferrer noopener\">ZooKeeper by Prometheus<\/a>. After importing it, you should see the following dashboard:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' href='https:\/\/sretalk-1258684427.cos.ap-shanghai.myqcloud.com\/2024\/05\/image-2.png'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  loading=\"lazy\" decoding=\"async\" width=\"3840\" height=\"1907\" data-original=\"https:\/\/sretalk-1258684427.cos.ap-shanghai.myqcloud.com\/2024\/05\/image-2.png\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"\" class=\"wp-image-53\"  sizes=\"auto, (max-width: 3840px) 100vw, 3840px\" \/><\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>1 Overview Since version 3.6.0, ZooKeeper has shipped with a built-in Prometheus client. Once configured, it 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,5],"tags":[],"class_list":["post-46","post","type-post","status-publish","format-standard","hentry","category-zk","category-5"],"_links":{"self":[{"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/posts\/46","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=46"}],"version-history":[{"count":7,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/posts\/46\/revisions"}],"predecessor-version":[{"id":136,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=\/wp\/v2\/posts\/46\/revisions\/136"}],"wp:attachment":[{"href":"https:\/\/www.sretalk.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=46"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=46"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sretalk.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=46"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}