Mirror of https://github.com/docker/awesome-compose.git, synced 2025-04-19 15:28:06 +02:00
Merge de7e522a42 into 18f59bdb09
Commit 1cf746bec8
21 changed files with 9125 additions and 0 deletions
10
prometheus-grafana-provision/.env
Normal file
@ -0,0 +1,10 @@
DB_HOST=postgresql
DB_PORT=5432
DB_NAME=monitoring

DB_USERNAME=homestead
DB_PASSWORD=secret

PROMETHEUS_PORT=9090
GRAFANA_PORT=3000
GRAFANA_PASSWORD=secret
242
prometheus-grafana-provision/README.md
Normal file
@ -0,0 +1,242 @@
# Grafana-11.5-with-Prometheus-and-Docker-Compose
A short guide on how to use Grafana 11.5 with Docker Compose. You will learn how to provision Grafana so that it comes with your dashboards, alerts, datasources, login, and datasource configurations.

Your team members can then use Grafana with one command, without going through the painful provisioning process.

***preview***
![image](https://github.com/user-attachments/assets/4d3961fa-4d91-4156-bf6c-fa2c06a5df9f)

- We will deploy Grafana in combination with Prometheus.
- We will scrape the following containers:
  - redis
  - postgres

## Grafana Container Structure
![image](https://github.com/user-attachments/assets/5a1f4614-1eb5-4065-bbec-c20adf2becb4)

The default folders for provisioning are in `/etc/grafana/`:
- The config file is stored under `/etc/grafana/grafana.ini`.
- The yml files for provisioning the dashboards, alerts, etc. are located in `/etc/grafana/provisioning`.
- So far I had no luck copying the entire folder into the container, so I had to mount the files one by one.

***We will use a similar folder structure for provisioning:***
```
grafana/
|-- grafana.ini
|-- ldap.toml
|-- provisioning/
    |-- access-control/
    |   |-- default.yml
    |-- alerting/
    |   |-- default.yml
    |-- dashboards/
    |   |-- default.yml
    |-- datasources/
    |   |-- default.yml
    |-- notifiers/
        |-- default.yml
|-- my_dashboards/
    |-- postgres.json
    |-- redis.json
```
> Create the folders accordingly in your project.

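If you prefer not to create the folders by hand, a small script can do it. This is only a sketch: `PROJECT_ROOT` is a variable made up for this example (it defaults to the current directory), and the folder names are taken from the tree above. The `default.yml` and dashboard files still have to be added by you.

```bash
#!/usr/bin/env bash
set -euo pipefail

# PROJECT_ROOT is a hypothetical variable for this sketch; default: current directory.
root="${PROJECT_ROOT:-.}"

# Folder for the dashboard JSON files.
mkdir -p "$root/grafana/my_dashboards"

# One folder per provisioning type, as listed in the tree.
for dir in access-control alerting dashboards datasources notifiers; do
  mkdir -p "$root/grafana/provisioning/$dir"
done

# Show the resulting folders.
find "$root/grafana" -type d | sort
```

Running it twice is harmless, because `mkdir -p` does not complain about folders that already exist.
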
## files for provisioning

- Create your files for provisioning.
- Keep the folder structure, so you don't need to rename files.

### dashboards
```yaml
apiVersion: 1
providers:
  - name: 'default'
    orgId: 1
    folder: ''
    folderUid: ''
    type: file
    options:
      path: /home/grafana/dashboards/postgres.json
```
> ./grafana/provisioning/dashboards/default.yml

### datasources
> We hardcode the Prometheus IP in the compose file so that we can use it here; no manual steps are required after provisioning.
```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://200.0.0.10:9090
    isDefault: true
    access: proxy
    editable: true
```
> ./grafana/provisioning/datasources/default.yml

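If you would rather not depend on a fixed IP, the datasource can most likely point at the compose service name instead, since containers on the same user-defined network resolve each other by service name. A sketch, assuming the service is called `prometheus` as in our compose file; the output file name `datasource.service-name.yml` is made up for this example so that nothing gets overwritten.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output path for this sketch; the guide itself uses
# ./grafana/provisioning/datasources/default.yml.
out="${1:-./datasource.service-name.yml}"

# Same datasource as above, but the URL uses the compose service name.
cat > "$out" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
    access: proxy
    editable: true
EOF

# Print the URL line as a quick check.
grep 'url:' "$out"
```

We have not switched the guide to this variant; the fixed IP is what the screenshots below were taken with.
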
### file permissions
- After you have created your files, you need to set the permissions accordingly, or the Grafana container might fail.
- `cd` to your project root folder and run:
```bash
sudo chmod -R +x ./grafana
sudo chmod -R 775 ./grafana
sudo chown -R 1000:1000 ./grafana
```
- Make sure to do the same for the `prometheus.yml` in the `prometheus` folder.
> You can check the permissions with `ls -lR` (or `ll -R` where that alias exists).

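The commands above are broader than strictly necessary if the containers only read these files. A narrower sketch, assuming read-only access is enough: directories get `755`, regular files get `644`. `fix_perms` is a helper name made up for this example, and you may need `sudo` if the files belong to another user.

```bash
#!/usr/bin/env bash
set -euo pipefail

# fix_perms DIR (hypothetical helper): directories become 755 so they can be
# traversed, regular files become 644 so they are readable by any user.
fix_perms() {
  local target="$1"
  find "$target" -type d -exec chmod 755 {} +
  find "$target" -type f -exec chmod 644 {} +
}

# Only touch ./grafana when it exists in the current directory.
if [ -d ./grafana ]; then
  fix_perms ./grafana
fi
```

The same function can be pointed at the `prometheus` folder, for example `fix_perms ./prometheus`.
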
- ***We can also set the paths where we want to copy our provisioning files to by using environment variables:***

### Default paths && environment variables
From the Grafana docs - [configure-docker](https://grafana.com/docs/grafana/latest/setup-grafana/configure-docker/):
> Grafana comes with default configuration parameters that remain the same among versions regardless of the operating system or the environment.
>
> The following configurations are set by default when you start the Grafana Docker container. When running in Docker you cannot change the configurations by editing the conf/grafana.ini file. Instead, you can modify the configuration using environment variables.

| Setting | Default value |
| --- | --- |
| GF_PATHS_CONFIG | /etc/grafana/grafana.ini |
| GF_PATHS_DATA | /var/lib/grafana |
| GF_PATHS_HOME | /usr/share/grafana |
| GF_PATHS_LOGS | /var/log/grafana |
| GF_PATHS_PLUGINS | /var/lib/grafana/plugins |
| GF_PATHS_PROVISIONING | /etc/grafana/provisioning |

### environment variables
- Add a `.env` file to your Docker Compose project directory so that you can use values like `${GRAFANA_PASSWORD}` in your compose file.
```yml
DB_HOST=postgresql
DB_PORT=5432
DB_NAME=monitoring

DB_USERNAME=homestead
DB_PASSWORD=secret

PROMETHEUS_PORT=9090
GRAFANA_PORT=3000
GRAFANA_PASSWORD=secret
```

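To see how the `.env` values end up in the compose file, you can mimic the substitution in a shell. This is only an illustration: Compose reads `.env` itself, and its parsing rules are not identical to sourcing the file in bash. The temp file stands in for the real `.env`, and the template is the `DATA_SOURCE_NAME` value from our compose file.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway copy of the database values from the .env file (demo only).
env_file="$(mktemp)"
cat > "$env_file" <<'EOF'
DB_HOST=postgresql
DB_PORT=5432
DB_NAME=monitoring
DB_USERNAME=homestead
DB_PASSWORD=secret
EOF

# Export every variable that gets defined while sourcing the file.
set -a
. "$env_file"
set +a

# Same template as DATA_SOURCE_NAME of the postgres exporter.
dsn="postgres://${DB_USERNAME}:${DB_PASSWORD}@${DB_HOST}/${DB_NAME}?sslmode=disable"
echo "$dsn"

rm -f "$env_file"
```

Once Docker is available, `docker compose config` prints the compose file with all variables resolved, which is the more direct check.
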
## docker compose file
> ***The example file: [docker-compose.yml](https://github.com/user-attachments/assets/ac9cafe7-fcb0-47e6-8d40-6119b352888e)***
### prometheus ip
- Note that we assign a fixed IP to Prometheus so that we can use it in the Grafana datasources provisioning file.
```yml
...
networks:
  network1:
    ipv4_address: 200.0.0.10
```
### prometheus volumes
To define our exporters as scrape targets (we will explain this later), Prometheus needs our `./prometheus/prometheus.yml` inside the container. We mount the `./prometheus/` folder at `/etc/prometheus/`, so the file ends up at `/etc/prometheus/prometheus.yml`, which is the path passed to `--config.file` in the compose file.
```
volumes:
  - ./prometheus/:/etc/prometheus/
```

### grafana volumes
1. We create the required Grafana volume.
2. We mount our user dashboards at `/home/grafana/dashboards`.
3. We overwrite the config file in `/etc/grafana/grafana.ini` with our `./grafana/defaults.ini`.
4. We mount the `default.yml` for the dashboards at `/etc/grafana/provisioning/dashboards/default.yml`.
5. We mount the `default.yml` for the datasources, with the hardcoded Prometheus IP, at `/etc/grafana/provisioning/datasources/default.yml`.
6. We mount the `default.yml` for the alerting at `/etc/grafana/provisioning/alerting/default.yml`.
```yml
---
volumes:
  - grafana:/var/lib/grafana
  - ./grafana/my_dashboards:/home/grafana/dashboards
  - ./grafana/defaults.ini:/etc/grafana/grafana.ini
  - ./grafana/provisioning/dashboards/default.yml:/etc/grafana/provisioning/dashboards/default.yml
  - ./grafana/provisioning/datasources/default.yml:/etc/grafana/provisioning/datasources/default.yml
  - ./grafana/provisioning/alerting/default.yml:/etc/grafana/provisioning/alerting/default.yml
```

### grafana environment variables in docker compose
- We can set certain Grafana environment variables in the compose file.
- The values for these can be set in the `.env` file, as explained before.
```yml
...
environment:
  GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
  GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH: /home/grafana/dashboards/postgres.json
  GF_USERS_ALLOW_SIGN_UP: false
  GF_PATHS_CONFIG: /etc/grafana/grafana.ini
```
> Here we can set a few values, such as the default config path.

### exporters
[grafana docs](https://grafana.com/oss/prometheus/exporters/):
> Exporters transform metrics from specific sources into a format that can be ingested by Prometheus
- So, in order to scrape metrics from our services, we need an additional container per service that exposes the metrics for Prometheus to scrape.
- Here is the exporter for PostgreSQL:
```yml
postgresql-exporter:
  image: prometheuscommunity/postgres-exporter
  container_name: postgresql-exporter
  privileged: true
  ports:
    - "9187:9187"
  environment:
    DATA_SOURCE_NAME: "postgres://${DB_USERNAME}:${DB_PASSWORD}@${DB_HOST}/${DB_NAME}?sslmode=disable"
  depends_on:
    prometheus:
      condition: service_started
    postgresql:
      condition: service_healthy
  restart: unless-stopped
  networks:
    - network1
```
## prometheus.yml
```yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'envoy'
    metrics_path: /stats/prometheus
    static_configs:
      - targets: ['envoy:19000']
        labels:
          group: 'envoy'
  - job_name: postgresql
    static_configs:
      - targets: ['postgresql-exporter:9187']

  - job_name: redis_exporter
    static_configs:
      - targets: ['redis-exporter:9121']
```

- In `prometheus.yml` we need to define our exporters as scrape targets.
- We use the ports that we defined for the exporters in the compose file, so for postgres it is 9187.

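A quick way to cross-check the targets against the exporter ports is to list them from the config. This is a rough sketch that uses `grep` rather than a YAML parser; `list_targets` is a name made up for this example, and the inline config is a shortened copy of the file above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# list_targets FILE (hypothetical helper): print every host:port that appears
# on a "targets:" line of a Prometheus config.
list_targets() {
  grep 'targets:' "$1" | grep -oE "[A-Za-z0-9_.-]+:[0-9]+"
}

# Shortened demo copy of prometheus.yml.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
scrape_configs:
  - job_name: postgresql
    static_configs:
      - targets: ['postgresql-exporter:9187']
  - job_name: redis_exporter
    static_configs:
      - targets: ['redis-exporter:9121']
EOF

targets="$(list_targets "$cfg")"
echo "$targets"

rm -f "$cfg"
```

Each printed `host:port` should match a service name and container port of an exporter in the compose file.
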
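# Results
### we configured prometheus
![image](https://github.com/user-attachments/assets/ae4076bd-e0c4-40f3-960f-11586a5ae130)

### we automatically set up prometheus as the datasource
![image](https://github.com/user-attachments/assets/07cbcd55-e2c3-4b4d-8a2b-a6e99c7a4dd7)

### our admin account is configured accordingly
![image](https://github.com/user-attachments/assets/85047ad3-5d8d-4f57-abbd-8b75b6f8e4a6)
> we can even set acl and administration but that was a bit much for the example
> however the steps are the same
### we automatically provide our dashboards
![image](https://github.com/user-attachments/assets/a05cae6f-e132-42e1-90e4-88ab62de39fa)

### Postgres Dashboard
![image](https://github.com/user-attachments/assets/b4b74fae-ac58-41c5-bf83-63d8c1cff76c)

### envoy dashboards are up and running
![image](https://github.com/user-attachments/assets/3c4fbd4e-cd95-4e0f-a13a-ddd8f9c0b0ef)

### redis dashboard is up and running
![image](https://github.com/user-attachments/assets/3e8e8fe4-3f2a-40fe-a0b8-7fc1b1239ca2)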
7
prometheus-grafana-provision/docker-compose.local.yml
Normal file
@ -0,0 +1,7 @@
version: '3'

services:
  redis:
    ports:
      - "6380:6379"
126
prometheus-grafana-provision/docker-compose.yml
Normal file
@ -0,0 +1,126 @@
networks:
  network1:
    driver: bridge
    ipam:
      config:
        - subnet: 200.0.0.0/16
          gateway: 200.0.0.1
volumes:
  postgresql:
  prometheus:
  grafana:
services:
  #---------- >> POSTGRES << ----------
  postgresql:
    image: postgres:15.4
    hostname: postgresql
    container_name: postgresql
    privileged: true
    environment:
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
      PGDATA: /data/postgres
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d ${DB_NAME} -U ${DB_USERNAME}"]
      interval: 5s
      timeout: 5s
      retries: 5
    volumes:
      - postgresql:/data/postgres
    ports:
      - "5432:5432"
    restart: unless-stopped
    networks:
      network1:
        ipv4_address: 200.0.0.4
  # ---------- >> REDIS << ----------
  redis:
    image: redis:latest
    privileged: true
    healthcheck:
      test: ["CMD-SHELL", "redis-cli ping | grep PONG"]
      interval: 1s
      timeout: 3s
      retries: 5
    command: ["redis-server"]
    ports:
      - "6379:6379"
    volumes:
      - ./redis/redis.conf:/usr/local/etc/redis.conf
    networks:
      network1:
        ipv4_address: 200.0.0.5
  # ---------- >> PROMETHEUS << ----------
  prometheus:
    image: prom/prometheus
    hostname: prom
    container_name: prometheus
    privileged: true
    volumes:
      - ./prometheus/:/etc/prometheus/
      - prometheus:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - "${PROMETHEUS_PORT:-9090}:9090"
    restart: unless-stopped
    networks:
      network1:
        ipv4_address: 200.0.0.10
  # ---------- >> GRAFANA << ----------
  grafana:
    image: grafana/grafana
    container_name: grafana
    volumes:
      - grafana:/var/lib/grafana
      - ./grafana/my_dashboards:/home/grafana/dashboards
      - ./grafana/defaults.ini:/etc/grafana/grafana.ini
      - ./grafana/provisioning/dashboards/default.yml:/etc/grafana/provisioning/dashboards/default.yml
      - ./grafana/provisioning/datasources/default.yml:/etc/grafana/provisioning/datasources/default.yml
      - ./grafana/provisioning/alerting/default.yml:/etc/grafana/provisioning/alerting/default.yml
    privileged: true
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH: /home/grafana/dashboards/postgres.json
      GF_USERS_ALLOW_SIGN_UP: false
      GF_PATHS_CONFIG: /etc/grafana/grafana.ini
    ports:
      - "${GRAFANA_PORT:-3000}:3000"
    restart: unless-stopped
    networks:
      network1:
        ipv4_address: 200.0.0.11
  # ---------- >> EXPORTERS << ----------
  postgresql-exporter:
    image: prometheuscommunity/postgres-exporter
    container_name: postgresql-exporter
    privileged: true
    ports:
      - "9187:9187"
    environment:
      DATA_SOURCE_NAME: "postgres://${DB_USERNAME}:${DB_PASSWORD}@${DB_HOST}/${DB_NAME}?sslmode=disable"
    depends_on:
      prometheus:
        condition: service_started
      postgresql:
        condition: service_healthy
    restart: unless-stopped
    networks:
      - network1

  redis-exporter:
    image: oliver006/redis_exporter
    container_name: redis-exporter
    ports:
      - "9121:9121"
    environment:
      REDIS_ADDR: "redis:6379"
    networks:
      - network1
    depends_on:
      - redis
5
prometheus-grafana-provision/grafana/Dockerfile
Executable file
@ -0,0 +1,5 @@

FROM grafana/grafana-oss:latest

COPY grafana.ini /etc/grafana/grafana.ini
COPY my_dashboards /etc/grafana/provisioning/dashboards
2108
prometheus-grafana-provision/grafana/defaults.ini
Executable file
File diff suppressed because it is too large
75
prometheus-grafana-provision/grafana/ldap.toml
Executable file
@ -0,0 +1,75 @@
# To troubleshoot and get more log info enable ldap debug logging in grafana.ini
# [log]
# filters = ldap:debug

[[servers]]
# Ldap server host (specify multiple hosts space separated)
host = "127.0.0.1"
# Default port is 389 or 636 if use_ssl = true
port = 389
# Set to true if LDAP server should use an encrypted TLS connection (either with STARTTLS or LDAPS)
use_ssl = false
# If set to true, use LDAP with STARTTLS instead of LDAPS
start_tls = false
# The value of an accepted TLS cipher. By default, this value is empty. Example value: ["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"])
# For a complete list of supported ciphers and TLS versions, refer to: https://go.dev/src/crypto/tls/cipher_suites.go
# Starting with Grafana v11.0 only ciphers with ECDHE support are accepted for TLS 1.2 connections.
tls_ciphers = []
# This is the minimum TLS version allowed. By default, this value is empty. Accepted values are: TLS1.1 (only for Grafana v10.4 or older), TLS1.2, TLS1.3.
min_tls_version = ""
# set to true if you want to skip ssl cert validation
ssl_skip_verify = false
# set to the path to your root CA certificate or leave unset to use system defaults
# root_ca_cert = "/path/to/certificate.crt"
# Authentication against LDAP servers requiring client certificates
# client_cert = "/path/to/client.crt"
# client_key = "/path/to/client.key"

# Search user bind dn
bind_dn = "cn=admin,dc=grafana,dc=org"
# Search user bind password
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
bind_password = 'grafana'
# We recommend using variable expansion for the bind_password, for more info https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#variable-expansion
# bind_password = '$__env{LDAP_BIND_PASSWORD}'

# Timeout in seconds (applies to each host specified in the 'host' entry (space separated))
timeout = 10

# User search filter, for example "(cn=%s)" or "(sAMAccountName=%s)" or "(uid=%s)"
search_filter = "(cn=%s)"

# An array of base dns to search through
search_base_dns = ["dc=grafana,dc=org"]

## For Posix or LDAP setups that does not support member_of attribute you can define the below settings
## Please check grafana LDAP docs for examples
# group_search_filter = "(&(objectClass=posixGroup)(memberUid=%s))"
# group_search_base_dns = ["ou=groups,dc=grafana,dc=org"]
# group_search_filter_user_attribute = "uid"

# Specify names of the ldap attributes your ldap uses
[servers.attributes]
name = "givenName"
surname = "sn"
username = "cn"
member_of = "memberOf"
email = "email"

# Map ldap groups to grafana org roles
[[servers.group_mappings]]
group_dn = "cn=admins,ou=groups,dc=grafana,dc=org"
org_role = "Admin"
# To make user an instance admin (Grafana Admin) uncomment line below
# grafana_admin = true
# The Grafana organization database id, optional, if left out the default org (id 1) will be used
# org_id = 1

[[servers.group_mappings]]
group_dn = "cn=editors,ou=groups,dc=grafana,dc=org"
org_role = "Editor"

[[servers.group_mappings]]
# If you want to match all (or no ldap groups) then you can use wildcard
group_dn = "*"
org_role = "Viewer"
907
prometheus-grafana-provision/grafana/my_dashboards/envoy.json
Executable file
|
@ -0,0 +1,907 @@
|
|||
{
|
||||
"annotations": {
|
||||
"list": [
|
||||
{
|
||||
"builtIn": 1,
|
||||
"datasource": "-- Grafana --",
|
||||
"enable": true,
|
||||
"hide": true,
|
||||
"iconColor": "rgba(0, 211, 255, 1)",
|
||||
"name": "Annotations & Alerts",
|
||||
"target": {
|
||||
"limit": 100,
|
||||
"matchAny": false,
|
||||
"tags": [],
|
||||
"type": "dashboard"
|
||||
},
|
||||
"type": "dashboard"
|
||||
}
|
||||
]
|
||||
},
|
||||
"editable": true,
|
||||
"fiscalYearStartMonth": 0,
|
||||
"graphTooltip": 0,
|
||||
"id": 1,
|
||||
"iteration": 1647986935789,
|
||||
"links": [],
|
||||
"liveNow": false,
|
||||
"panels": [
|
||||
{
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"color": {
|
||||
"mode": "palette-classic"
|
||||
},
|
||||
"custom": {
|
||||
"axisLabel": "",
|
||||
"axisPlacement": "auto",
|
||||
"barAlignment": 0,
|
||||
"drawStyle": "line",
|
||||
"fillOpacity": 0,
|
||||
"gradientMode": "none",
|
||||
"hideFrom": {
|
||||
"legend": false,
|
||||
"tooltip": false,
|
||||
"viz": false
|
||||
},
|
||||
"lineInterpolation": "linear",
|
||||
"lineWidth": 1,
|
||||
"pointSize": 5,
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "auto",
|
||||
"spanNulls": false,
|
||||
"stacking": {
|
||||
"group": "A",
|
||||
"mode": "none"
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "off"
|
||||
}
|
||||
},
|
||||
"mappings": [],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "green",
|
||||
"value": null
|
||||
},
|
||||
{
|
||||
"color": "red",
|
||||
"value": 80
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 0
|
||||
},
|
||||
"id": 123125,
|
||||
"options": {
|
||||
"legend": {
|
||||
"calcs": [],
|
||||
"displayMode": "list",
|
||||
"placement": "bottom"
|
||||
},
|
||||
"tooltip": {
|
||||
"mode": "single",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"exemplar": true,
|
||||
"expr": "envoy_cluster_upstream_cx_connect_timeout{envoy_cluster_name=\"upstream\"}",
|
||||
"interval": "",
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"title": "Connection Timeout",
|
||||
"type": "timeseries"
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 6,
|
||||
"x": 12,
|
||||
"y": 0
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 3,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(delta(envoy_cluster_membership_change{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "membership changes [1m]",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_membership_total{envoy_cluster_name=~\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "total membership",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_outlier_detection_ejections_active{envoy_cluster_name=~\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "outlier ejections active",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_membership_healthy{envoy_cluster_name=~\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "healthy members (active HC and outlier)",
|
||||
"refId": "D"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_health_check_healthy{envoy_cluster_name=~\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "healthy members (active HC only)",
|
||||
"refId": "E"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Cluster Membership",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 6,
|
||||
"x": 18,
|
||||
"y": 0
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 4,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null as zero",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_xx{envoy_response_code_class!=\"5\",envoy_cluster_name=~\"$envoy_cluster_name\"}[1m])) / sum(rate(envoy_cluster_upstream_rq_xx{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "Success Rate %",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Success Rate (non-5xx responses)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "percentunit",
|
||||
"label": "",
|
||||
"logBase": 1,
|
||||
"max": "1",
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 8
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 16,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"exemplar": true,
|
||||
"expr": "envoy_http_ext_authz_denied {envoy_http_conn_manager_prefix=\"ingress_http\"}",
|
||||
"format": "time_series",
|
||||
"interval": "",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Auth Denied",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 15
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 5,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null as zero",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_xx{envoy_response_code_class=\"4\",envoy_cluster_name=~\"$envoy_cluster_name\"}[1m])) / sum(rate(envoy_cluster_upstream_rq_xx{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "%",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "4xx %",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "percentunit",
|
||||
"logBase": 1,
|
||||
"max": "1",
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 24,
|
||||
"x": 0,
|
||||
"y": 22
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 7,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_retry{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"interval": "",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "request retry",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_retry_success{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "request retry success",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_retry_overflow{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "request retry overflow",
|
||||
"refId": "C"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Upstream Request Retry Rate",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 6,
|
||||
"x": 0,
|
||||
"y": 29
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 1,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_cx_total{envoy_cluster_name=\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "egress CPS",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_total{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "egress RPS",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_upstream_rq_pending_total{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "pending req to",
|
||||
"refId": "C"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(rate(envoy_cluster_lb_healthy_panic{envoy_cluster_name=~\"$envoy_cluster_name\"}[1m]))",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "lb healthy panic RPS",
|
||||
"refId": "D"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Egress CPS / RPS",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 7,
|
||||
"w": 6,
|
||||
"x": 6,
|
||||
"y": 29
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 2,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"options": {
|
||||
"alertThreshold": true
|
||||
},
|
||||
"percentage": false,
|
||||
"pluginVersion": "8.4.4",
|
||||
"pointradius": 5,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_upstream_cx_active{envoy_cluster_name=\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "connections",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_upstream_rq_active{envoy_cluster_name=\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "requests",
|
||||
"refId": "B"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"expr": "sum(envoy_cluster_upstream_rq_pending_active{envoy_cluster_name=\"$envoy_cluster_name\"})",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 2,
|
||||
"legendFormat": "pending",
|
||||
"refId": "C"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeRegions": [],
|
||||
"title": "Total Connections / Requests",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"mode": "time",
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"logBase": 1,
|
||||
"min": 0,
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false
|
||||
}
|
||||
}
|
||||
],
|
||||
"refresh": false,
|
||||
"schemaVersion": 35,
|
||||
"style": "dark",
|
||||
"tags": [],
|
||||
"templating": {
|
||||
"list": [
|
||||
{
|
||||
"allValue": ".+",
|
||||
"current": {
|
||||
"selected": false,
|
||||
"text": "upstream",
|
||||
"value": "upstream"
|
||||
},
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"name": "prometheus"
|
||||
},
|
||||
"definition": "label_values(envoy_cluster_version, envoy_cluster_name)",
|
||||
"hide": 0,
|
||||
"includeAll": false,
|
||||
"label": "Destination Service",
|
||||
"multi": false,
|
||||
"name": "envoy_cluster_name",
|
||||
"options": [],
|
||||
"query": "label_values(envoy_cluster_version, envoy_cluster_name)",
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"skipUrlSync": false,
|
||||
"sort": 1,
|
||||
"tagValuesQuery": "",
|
||||
"tagsQuery": "",
|
||||
"type": "query",
|
||||
"useTags": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"time": {
|
||||
"from": "2022-03-22T21:54:09.098Z",
|
||||
"to": "2022-03-22T21:56:54.439Z"
|
||||
},
|
||||
"timepicker": {
|
||||
"refresh_intervals": [
|
||||
"5s",
|
||||
"10s",
|
||||
"30s",
|
||||
"1m",
|
||||
"5m",
|
||||
"15m",
|
||||
"30m",
|
||||
"1h",
|
||||
"2h",
|
||||
"1d"
|
||||
],
|
||||
"time_options": [
|
||||
"5m",
|
||||
"15m",
|
||||
"1h",
|
||||
"6h",
|
||||
"12h",
|
||||
"24h",
|
||||
"2d",
|
||||
"7d",
|
||||
"30d"
|
||||
]
|
||||
},
|
||||
"timezone": "utc",
|
||||
"title": "Envoy Metrics",
|
||||
"uid": "E3LaT9Enz",
|
||||
"version": 1,
|
||||
"weekStart": ""
|
||||
}
|
3167
prometheus-grafana-provision/grafana/my_dashboards/postgres.json
Executable file
File diff suppressed because it is too large
Load diff
49
prometheus-grafana-provision/grafana/my_dashboards/redis.json
Executable file
|
@ -0,0 +1,49 @@
|
|||
|
||||
{
  "annotations": {
    "list": []
  },
  "editable": true,
  "graphTooltip": 0,
  "id": 2,
  "links": [],
  "panels": [
    {
      "datasource": "Prometheus",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line"
          },
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          }
        }
      },
      "title": "Redis Memory Usage",
      "targets": [
        {
          "expr": "redis_memory_used_bytes",
          "legendFormat": "Memory Used"
        }
      ],
      "type": "timeseries"
    }
  ],
  "refresh": "5s",
  "schemaVersion": 38,
  "title": "Redis Dashboard",
  "version": 1
}
|
68
prometheus-grafana-provision/grafana/provisioning/access_control/sample.yml
Executable file
|
@ -0,0 +1,68 @@
|
|||
# ---
|
||||
# # config file version
|
||||
# apiVersion: 2
|
||||
|
||||
# # <list> list of roles to insert/update/delete
|
||||
# roles:
|
||||
# # <string, required> name of the role you want to create or update. Required.
|
||||
# - name: 'custom:users:writer'
|
||||
# # <string> uid of the role. Has to be unique for all orgs.
|
||||
# uid: customuserswriter1
|
||||
# # <string> description of the role, informative purpose only.
|
||||
# description: 'Create, read, write users'
|
||||
# # <int> version of the role, Grafana will update the role when increased.
|
||||
# version: 2
|
||||
# # <int> org id. Defaults to Grafana's default if not specified.
|
||||
# orgId: 1
|
||||
# # <list> list of the permissions granted by this role.
|
||||
# permissions:
|
||||
# # <string, required> action allowed.
|
||||
# - action: 'users:read'
|
||||
# # <string> scope it applies to.
|
||||
# scope: 'global.users:*'
|
||||
# - action: 'users:write'
|
||||
# scope: 'global.users:*'
|
||||
# - action: 'users:create'
|
||||
# - name: 'custom:global:users:reader'
|
||||
# # <bool> overwrite org id and creates a global role.
|
||||
# global: true
|
||||
# # <string> state of the role. Defaults to 'present'. If 'absent', role will be deleted.
|
||||
# state: 'absent'
|
||||
# # <bool> force deletion revoking all grants of the role.
|
||||
# force: true
|
||||
# - uid: 'basic_editor'
|
||||
# version: 2
|
||||
# global: true
|
||||
# # <list> list of roles to copy permissions from.
|
||||
# from:
|
||||
# - uid: 'basic_editor'
|
||||
# global: true
|
||||
# - name: 'fixed:users:writer'
|
||||
# global: true
|
||||
# # <list> list of the permissions to add/remove on top of the copied ones.
|
||||
# permissions:
|
||||
# - action: 'users:read'
|
||||
# scope: 'global.users:*'
|
||||
# - action: 'users:write'
|
||||
# scope: 'global.users:*'
|
||||
# # <string> state of the permission. Defaults to 'present'. If 'absent', the permission will be removed.
|
||||
# state: absent
|
||||
|
||||
# # <list> list role assignments to teams to create or remove.
|
||||
# teams:
|
||||
# # <string, required> name of the team you want to assign roles to. Required.
|
||||
# - name: 'Users writers'
|
||||
# # <int> org id. Will default to Grafana's default if not specified.
|
||||
# orgId: 1
|
||||
# # <list> list of roles to assign to the team
|
||||
# roles:
|
||||
# # <string> uid of the role you want to assign to the team.
|
||||
# - uid: 'customuserswriter1'
|
||||
# # <int> org id. Will default to Grafana's default if not specified.
|
||||
# orgId: 1
|
||||
# # <string> name of the role you want to assign to the team.
|
||||
# - name: 'fixed:users:writer'
|
||||
# # <bool> overwrite org id to specify the role is global.
|
||||
# global: true
|
||||
# # <string> state of the assignment. Defaults to 'present'. If 'absent', the assignment will be revoked.
|
||||
# state: absent
|
217
prometheus-grafana-provision/grafana/provisioning/alerting/default.yml
Executable file
|
@ -0,0 +1,217 @@
|
|||
# # config file version
|
||||
apiVersion: 1
|
||||
|
||||
# # List of rule groups to import or update
|
||||
# groups:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the rule group
|
||||
# name: my_rule_group
|
||||
# # <string, required> name of the folder the rule group will be stored in
|
||||
# folder: my_first_folder
|
||||
# # <duration, required> interval of the rule group evaluation
|
||||
# interval: 60s
|
||||
# # <list, required> list of rules that are part of the rule group
|
||||
# rules:
|
||||
# # <string, required> unique identifier for the rule. Should not exceed 40 symbols. Only letters, numbers, - (hyphen), and _ (underscore) allowed.
|
||||
# - uid: my_id_1
|
||||
# # <string, required> title of the rule, will be displayed in the UI
|
||||
# title: my_first_rule
|
||||
# # <string, required> query used for the condition
|
||||
# condition: A
|
||||
# # <list, required> list of query objects that should be executed on each
|
||||
# # evaluation - should be obtained via the API
|
||||
# data:
|
||||
# - refId: A
|
||||
# datasourceUid: "__expr__"
|
||||
# model:
|
||||
# conditions:
|
||||
# - evaluator:
|
||||
# params:
|
||||
# - 3
|
||||
# type: gt
|
||||
# operator:
|
||||
# type: and
|
||||
# query:
|
||||
# params:
|
||||
# - A
|
||||
# reducer:
|
||||
# type: last
|
||||
# type: query
|
||||
# datasource:
|
||||
# type: __expr__
|
||||
# uid: "__expr__"
|
||||
# expression: 1==0
|
||||
# intervalMs: 1000
|
||||
# maxDataPoints: 43200
|
||||
# refId: A
|
||||
# type: math
|
||||
# # <string> UID of a dashboard that the alert rule should be linked to
|
||||
# dashboardUid: my_dashboard
|
||||
# # <int> ID of the panel that the alert rule should be linked to
|
||||
# panelId: 123
|
||||
# # <string> state of the alert rule when no data is returned
|
||||
# # possible values: "NoData", "Alerting", "OK", default = NoData
|
||||
# noDataState: Alerting
|
||||
# # <string> state of the alert rule when the query execution
|
||||
# # fails - possible values: "Error", "Alerting", "OK"
|
||||
# # default = Alerting
|
||||
# executionErrorState: Alerting
|
||||
# # <duration, required> how long the alert condition should be breached before Firing. Before this time has elapsed, the alert is considered to be Pending
|
||||
# for: 60s
|
||||
# # <map<string, string>> map of strings to attach arbitrary custom data
|
||||
# annotations:
|
||||
# some_key: some_value
|
||||
# # <map<string, string>> map of strings to filter and
|
||||
# # route alerts
|
||||
# labels:
|
||||
# team: sre_team_1
|
||||
# isPaused: false
|
||||
# # optional settings that let you configure the notification settings applied to alerts created by this rule
|
||||
# notification_settings:
|
||||
# # <string> name of the receiver (contact-point) that should be used for this route
|
||||
# receiver: grafana-default-email
|
||||
# # <list<string>> The labels by which incoming alerts are grouped together. For example,
|
||||
# # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
|
||||
# # be batched into a single group.
|
||||
# #
|
||||
# # To aggregate by all possible labels, use the special value '...' as
|
||||
# # the sole label name, for example:
|
||||
# # group_by: ['...']
|
||||
# # This effectively disables aggregation entirely, passing through all
|
||||
# # alerts as-is. This is unlikely to be what you want, unless you have
|
||||
# # a very low alert volume or your upstream notification system performs
|
||||
# # its own grouping.
|
||||
# # If defined, must contain the labels 'alertname' and 'grafana_folder', except when contains '...'
|
||||
# group_by: ["alertname", "grafana_folder", "region"]
|
||||
# # <list> Times when the route should be muted. These must match the name of a
|
||||
# # mute time interval.
|
||||
# # Additionally, the root node cannot have any mute times.
|
||||
# # When a route is muted it will not send any notifications, but
|
||||
# # otherwise acts normally (including ending the route-matching process
|
||||
# # if the `continue` option is not set)
|
||||
# mute_time_intervals:
|
||||
# - abc
|
||||
# # <duration> How long to initially wait to send a notification for a group
|
||||
# # of alerts. Allows to collect more initial alerts for the same group.
|
||||
# # (Usually ~0s to few minutes).
|
||||
# # If not specified, the corresponding setting of the default policy is used.
|
||||
# group_wait: 30s
|
||||
# # <duration> How long to wait before sending a notification about new alerts that
|
||||
# # are added to a group of alerts for which an initial notification has
|
||||
# # already been sent. (Usually ~5m or more).
|
||||
# # If not specified, the corresponding setting of the default policy is used.
|
||||
# group_interval: 5m
|
||||
# # <duration> How long to wait before sending a notification again if it has already
|
||||
# # been sent successfully for an alert. (Usually ~3h or more)
|
||||
# # If not specified, the corresponding setting of the default policy is used.
|
||||
# repeat_interval: 4h
|
||||
# # List of alert rule UIDs that should be deleted
|
||||
# deleteRules:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> unique identifier for the rule
|
||||
# uid: my_id_1
|
||||
# # List of contact points to import or update
|
||||
# contactPoints:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the contact point
|
||||
# name: cp_1
|
||||
# receivers:
|
||||
# # <string, required> unique identifier for the receiver. Should not exceed 40 symbols. Only letters, numbers, - (hyphen), and _ (underscore) allowed.
|
||||
# - uid: first_uid
|
||||
# # <string, required> type of the receiver
|
||||
# type: prometheus-alertmanager
|
||||
# # <object, required> settings for the specific receiver type
|
||||
# settings:
|
||||
# url: http://test:9000
|
||||
# # List of receivers that should be deleted
|
||||
# deleteContactPoints:
|
||||
# - orgId: 1
|
||||
# uid: first_uid
|
||||
# # List of notification policies to import or update
|
||||
# policies:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string> name of the receiver that should be used for this route
|
||||
# receiver: grafana-default-email
|
||||
# # <list<string>> The labels by which incoming alerts are grouped together. For example,
|
||||
# # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
|
||||
# # be batched into a single group.
|
||||
# #
|
||||
# # To aggregate by all possible labels, use the special value '...' as
|
||||
# # the sole label name, for example:
|
||||
# # group_by: ['...']
|
||||
# # This effectively disables aggregation entirely, passing through all
|
||||
# # alerts as-is. This is unlikely to be what you want, unless you have
|
||||
# # a very low alert volume or your upstream notification system performs
|
||||
# # its own grouping.
|
||||
# group_by:
|
||||
# - grafana_folder
|
||||
# - alertname
|
||||
# # <list> a list of matchers that an alert has to fulfill to match the node
|
||||
# matchers:
|
||||
# - alertname = Watchdog
|
||||
# - severity =~ "warning|critical"
|
||||
# # <list> Times when the route should be muted. These must match the name of a
|
||||
# # mute time interval.
|
||||
# # Additionally, the root node cannot have any mute times.
|
||||
# # When a route is muted it will not send any notifications, but
|
||||
# # otherwise acts normally (including ending the route-matching process
|
||||
# # if the `continue` option is not set)
|
||||
# mute_time_intervals:
|
||||
# - abc
|
||||
# # <duration> How long to initially wait to send a notification for a group
|
||||
# # of alerts. Allows to collect more initial alerts for the same group.
|
||||
# # (Usually ~0s to few minutes), default = 30s
|
||||
# group_wait: 30s
|
||||
# # <duration> How long to wait before sending a notification about new alerts that
|
||||
# # are added to a group of alerts for which an initial notification has
|
||||
# # already been sent. (Usually ~5m or more), default = 5m
|
||||
# group_interval: 5m
|
||||
# # <duration> How long to wait before sending a notification again if it has already
|
||||
# # been sent successfully for an alert. (Usually ~3h or more), default = 4h
|
||||
# repeat_interval: 4h
|
||||
# # <list> Zero or more child routes
|
||||
# routes:
|
||||
# ...
|
||||
# # List of orgIds that should be reset to the default policy
|
||||
# resetPolicies:
|
||||
# - 1
|
||||
# # List of templates to import or update
|
||||
# templates:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the template, must be unique
|
||||
# name: my_first_template
|
||||
# # <string, required> content of the template
|
||||
# template: Alerting with a custom text template
|
||||
# # List of templates that should be deleted
|
||||
# deleteTemplates:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the template, must be unique
|
||||
# name: my_first_template
|
||||
# # List of mute time intervals to import or update
|
||||
# muteTimes:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the mute time interval, must be unique
|
||||
# name: mti_1
|
||||
# # <list> time intervals that should trigger the muting
|
||||
# refer to https://prometheus.io/docs/alerting/latest/configuration/#time_interval-0
|
||||
# time_intervals:
|
||||
# - times:
|
||||
# - start_time: '06:00'
|
||||
# end_time: '23:59'
|
||||
# weekdays: ['monday:wednesday','saturday', 'sunday']
|
||||
# months: ['1:3', 'may:august', 'december']
|
||||
# years: ['2020:2022', '2030']
|
||||
# days_of_month: ['1:5', '-3:-1']
|
||||
# # List of mute time intervals that should be deleted
|
||||
# deleteMuteTimes:
|
||||
# # <int> organization ID, default = 1
|
||||
# - orgId: 1
|
||||
# # <string, required> name of the mute time interval, must be unique
|
||||
# name: mti_1
|
25
prometheus-grafana-provision/grafana/provisioning/dashboards/default.yml
Executable file
|
@ -0,0 +1,25 @@
|
|||
# config file version
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    folderUid: ''
    type: file
    options:
      path: /home/grafana/dashboards/postgres.json
  - name: 'redis'
    orgId: 1
    folder: ''
    folderUid: ''
    type: file
    options:
      path: /home/grafana/dashboards/redis.json
  - name: 'envoy'
    orgId: 1
    folder: ''
    folderUid: ''
    type: file
    options:
      path: /home/grafana/dashboards/envoy.json
|
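The three providers above each point at a single dashboard JSON file. A possible alternative, sketched here under the assumption that all dashboards are mounted into `/home/grafana/dashboards` inside the container, is one provider that loads the whole directory. The `updateIntervalSeconds` value is an illustrative choice, not something taken from this commit.

```yaml
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    # Assumption: how often Grafana rescans the directory; adjust or omit.
    updateIntervalSeconds: 30
    options:
      # Assumption: directory holding postgres.json, redis.json and envoy.json.
      path: /home/grafana/dashboards
```

With a directory path, adding a new dashboard should only require dropping another JSON file into the mounted folder rather than declaring a new provider.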
|
@ -0,0 +1,8 @@
|
|||
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://200.0.0.10:9090
    isDefault: true
    access: proxy
    editable: true
|
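The Envoy dashboard refers to its data source as an object with `type` and `name`, whereas recent Grafana versions normally resolve such object references by `uid`. A hedged variant of the data source definition that pins a `uid` might look like the sketch below. The `uid` value and the `prometheus` host name are assumptions: the host name only resolves if the Compose service is actually called `prometheus`, so keep the static IP if the network relies on it.

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    # Assumption: a fixed uid so dashboards can reference "uid": "prometheus".
    uid: prometheus
    # Assumption: the Compose service name is "prometheus".
    url: http://prometheus:9090
    isDefault: true
    access: proxy
    editable: true
```

Dashboards would then reference the data source as `{"type": "prometheus", "uid": "prometheus"}`.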
71
prometheus-grafana-provision/grafana/provisioning/datasources/sample.yml
Executable file
|
@ -0,0 +1,71 @@
|
|||
# Configuration file version
|
||||
apiVersion: 1
|
||||
|
||||
# # List of data sources to delete from the database.
|
||||
# deleteDatasources:
|
||||
# - name: Graphite
|
||||
# orgId: 1
|
||||
|
||||
# # List of data sources to insert/update depending on what's
|
||||
# # available in the database.
|
||||
# datasources:
|
||||
# # <string, required> Sets the name you use to refer to
|
||||
# # the data source in panels and queries.
|
||||
# - name: Graphite
|
||||
# # <string, required> Sets the data source type.
|
||||
# type: graphite
|
||||
# # <string, required> Sets the access mode, either
|
||||
# # proxy or direct (Server or Browser in the UI).
|
||||
# # Some data sources are incompatible with any setting
|
||||
# # but proxy (Server).
|
||||
# access: proxy
|
||||
# # <int> Sets the organization id. Defaults to orgId 1.
|
||||
# orgId: 1
|
||||
# # <string> Sets a custom UID to reference this
|
||||
# # data source in other parts of the configuration.
|
||||
# # If not specified, Grafana generates one.
|
||||
# uid: my_unique_uid
|
||||
# # <string> Sets the data source's URL, including the
|
||||
# # port.
|
||||
# url: http://localhost:8080
|
||||
# # <string> Sets the database user, if necessary.
|
||||
# user:
|
||||
# # <string> Sets the database name, if necessary.
|
||||
# database:
|
||||
# # <bool> Enables basic authorization.
|
||||
# basicAuth:
|
||||
# # <string> Sets the basic authorization username.
|
||||
# basicAuthUser:
|
||||
# # <bool> Enables credential headers.
|
||||
# withCredentials:
|
||||
# # <bool> Toggles whether the data source is pre-selected
|
||||
# # for new panels. You can set only one default
|
||||
# # data source per organization.
|
||||
# isDefault:
|
||||
# # <map> Fields to convert to JSON and store in jsonData.
|
||||
# jsonData:
|
||||
# # <string> Defines the Graphite service's version.
|
||||
# graphiteVersion: '1.1'
|
||||
# # <bool> Enables TLS authentication using a client
|
||||
# # certificate configured in secureJsonData.
|
||||
# tlsAuth: true
|
||||
# # <bool> Enables TLS authentication using a CA
|
||||
# # certificate.
|
||||
# tlsAuthWithCACert: true
|
||||
# # <map> Fields to encrypt before storing in jsonData.
|
||||
# secureJsonData:
|
||||
# # <string> Defines the CA cert, client cert, and
|
||||
# # client key for encrypted authentication.
|
||||
# tlsCACert: '...'
|
||||
# tlsClientCert: '...'
|
||||
# tlsClientKey: '...'
|
||||
# # <string> Sets the database password, if necessary.
|
||||
# password:
|
||||
# # <string> Sets the basic authorization password.
|
||||
# basicAuthPassword:
|
||||
# # <int> Sets the version. Used to compare versions when
|
||||
# # updating. Ignored when creating a new data source.
|
||||
# version: 1
|
||||
# # <bool> Allows users to edit data sources from the
|
||||
# # Grafana UI.
|
||||
# editable: false
|
2000
prometheus-grafana-provision/grafana/sample.ini
Executable file
File diff suppressed because it is too large
Load diff
8
prometheus-grafana-provision/postgres/postgres.env
Normal file
|
@ -0,0 +1,8 @@
|
|||
|
||||
POSTGRES_USER=pjhub_user
POSTGRES_PASSWORD=secure_password_here
POSTGRES_DB=pjhub_db
POSTGRES_PORT=5432
POSTGRES_HOST=postgres
POSTGRES_MAX_CONNECTIONS=100
POSTGRES_POOL_SIZE=20
|
17
prometheus-grafana-provision/prometheus/prometheus.yaml
Executable file
|
@ -0,0 +1,17 @@
|
|||
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'envoy'
    metrics_path: /stats/prometheus
    static_configs:
      - targets: ['envoy:19000']
        labels:
          group: 'envoy'
  - job_name: postgresql
    static_configs:
      - targets: ['postgresql-exporter:9187']

  - job_name: redis_exporter
    static_configs:
      - targets: ['redis-exporter:9121']
|
5
prometheus-grafana-provision/redis/Dockerfile.redis
Normal file
|
@ -0,0 +1,5 @@
|
|||
FROM redis:alpine

ENV REDIS_PASSWORD=your_password

COPY redis.conf /etc/redis.conf
|
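The official `redis` image starts `redis-server` without a config file by default, so the copied `/etc/redis.conf` only takes effect if the start command names it. A minimal Compose sketch follows, assuming a service called `redis` built from this Dockerfile. The service name and build context are assumptions, and the project's compose file may already set an equivalent command.

```yaml
services:
  redis:
    build:
      # Assumption: the Dockerfile lives in ./redis next to redis.conf.
      context: ./redis
      dockerfile: Dockerfile.redis
    # Pass the config file explicitly so port, bind and requirepass are applied.
    command: ["redis-server", "/etc/redis.conf"]
```

Without such a command, `requirepass` from `redis.conf` would not be enforced and the `REDIS_PASSWORD` variable alone does not protect the instance.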
3
prometheus-grafana-provision/redis/redis.conf
Normal file
|
@ -0,0 +1,3 @@
|
|||
port 6379
bind 0.0.0.0
requirepass your_password
|
7
prometheus-grafana-provision/redis/redis.env
Normal file
|
@ -0,0 +1,7 @@
|
|||
|
||||
REDIS_PASSWORD=secure_redis_password
REDIS_PORT=6379
REDIS_HOST=redis
REDIS_DB=0
REDIS_MAX_MEMORY=2gb
REDIS_MAX_MEMORY_POLICY=allkeys-lru
|