Grafana Dashboards
Introduction
Grafana visualizes Prometheus metrics through interactive dashboards. Preconfigured dashboards for OpenStack provide monitoring out of the box.
Prerequisites
- Prometheus stack installed
- Exporters configured
Key topics
Accessing Grafana
# URL via HAProxy
http://10.0.0.10:3000
# Default Kolla credentials
# User: admin
# Password: in /etc/kolla/passwords.yml → grafana_admin_password
# Retrieve the password
grep grafana_admin_password /etc/kolla/passwords.yml
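For scripted access, the same lookup can be done without shelling out to grep. The sketch below is a minimal illustration, assuming the entry sits on a single top-level `key: value` line in the passwords file; the function names `extract_password` and `read_kolla_password` are hypothetical and not part of Kolla.

```python
from pathlib import Path
from typing import Optional


def extract_password(text: str, key: str) -> Optional[str]:
    """Return the value of a top-level `key: value` line, or None if absent."""
    prefix = key + ":"
    for line in text.splitlines():
        if line.startswith(prefix):
            # Drop the key, surrounding whitespace, and optional quotes.
            return line[len(prefix):].strip().strip("'\"")
    return None


def read_kolla_password(key: str, path: str = "/etc/kolla/passwords.yml") -> Optional[str]:
    """Read the passwords file from disk and look up one entry."""
    return extract_password(Path(path).read_text(encoding="utf-8"), key)


# Usage with an in-memory sample instead of the real file:
sample = "database_password: abc\ngrafana_admin_password: s3cret\n"
print(extract_password(sample, "grafana_admin_password"))
```

On a deployment host, `read_kolla_password("grafana_admin_password")` would read the real file, provided the calling user has permission to open it.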
Dashboard architecture
graph TB
prometheus[Prometheus<br/>Data source]
subgraph Grafana
home[Home<br/>Overview]
subgraph Infrastructure
nodes[Nodes Overview<br/>CPU, RAM, Disk<br/>per host]
docker[Docker Containers<br/>Container metrics]
network[Network<br/>Traffic, errors]
end
subgraph OpenStack
overview[OpenStack Overview<br/>VMs, volumes, networks]
nova[Nova Compute<br/>Instances, hypervisors]
neutron[Neutron<br/>Networks, routers, ports]
cinder[Cinder<br/>Volumes, snapshots]
end
subgraph Data_Layer["Data Layer"]
galera[Galera Cluster<br/>Cluster state, queries]
rabbit[RabbitMQ<br/>Queues, connections]
haproxy[HAProxy<br/>Frontend/backend stats]
end
subgraph Storage
ceph[Ceph Cluster<br/>OSD, pools, PGs]
end
end
home -.->|Query PromQL| prometheus
nodes -.->|Query PromQL| prometheus
overview -.->|Query PromQL| prometheus
galera -.->|Query PromQL| prometheus
ceph -.->|Query PromQL| prometheus
Data source configuration
# /etc/kolla/config/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false
  - name: Alertmanager
    type: alertmanager
    access: proxy
    url: http://alertmanager:9093
    jsonData:
      implementation: prometheus
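A provisioning file like this one is easy to break when it is edited by hand. The following sketch mirrors the document as a Python dict and runs two sanity checks before deployment: every entry carries the keys used on this page, and at most one entry is marked as default, since Grafana allows only one default data source per organization. `EXPECTED_KEYS` and `check_datasources` are hypothetical names, and the key list reflects this page's example rather than Grafana's full schema.

```python
from typing import Dict, List

# Keys set by every entry in the example above (an assumption about what
# this deployment wants, not Grafana's own schema).
EXPECTED_KEYS = ("name", "type", "access", "url")

# The provisioning document, mirrored as a Python dict.
CONFIG = {
    "apiVersion": 1,
    "datasources": [
        {"name": "Prometheus", "type": "prometheus", "access": "proxy",
         "url": "http://prometheus:9090", "isDefault": True, "editable": False},
        {"name": "Alertmanager", "type": "alertmanager", "access": "proxy",
         "url": "http://alertmanager:9093",
         "jsonData": {"implementation": "prometheus"}},
    ],
}


def check_datasources(config: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    sources = config.get("datasources", [])
    for index, source in enumerate(sources):
        missing = [key for key in EXPECTED_KEYS if key not in source]
        if missing:
            problems.append("datasource #%d is missing: %s" % (index, ", ".join(missing)))
    defaults = [s.get("name", "?") for s in sources if s.get("isDefault")]
    if len(defaults) > 1:
        problems.append("more than one default data source: " + ", ".join(defaults))
    return problems


print(check_datasources(CONFIG))
```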
Node Overview dashboard
{
"title": "Node Overview",
"panels": [
{
"title": "CPU Usage",
"type": "timeseries",
"targets": [
{
"expr": "100 - (avg by(instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
"legendFormat": "{{instance}}"
}
]
},
{
"title": "Memory Usage",
"type": "gauge",
"targets": [
{
"expr": "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100",
"legendFormat": "{{instance}}"
}
],
"thresholds": {
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 70},
{"color": "red", "value": 90}
]
}
},
{
"title": "Disk Usage",
"type": "bargauge",
"targets": [
{
"expr": "(node_filesystem_size_bytes{mountpoint=\"/\"} - node_filesystem_avail_bytes{mountpoint=\"/\"}) / node_filesystem_size_bytes{mountpoint=\"/\"} * 100",
"legendFormat": "{{instance}}"
}
]
}
]
}
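Writing panel JSON by hand gets repetitive once a dashboard grows. As a rough sketch, panels in the simplified shape used above can be generated from small helpers; `cpu_usage_expr` and `timeseries_panel` are hypothetical names, and the output contains only the fields shown on this page, not a complete Grafana dashboard model.

```python
import json


def cpu_usage_expr(window: str = "5m") -> str:
    """PromQL for per-instance CPU usage in percent over the given window."""
    return ('100 - (avg by(instance) (rate(node_cpu_seconds_total'
            '{mode="idle"}[' + window + '])) * 100)')


def timeseries_panel(title: str, expr: str, legend: str = "{{instance}}") -> dict:
    """Build a time series panel with a single Prometheus query."""
    return {
        "title": title,
        "type": "timeseries",
        "targets": [{"expr": expr, "legendFormat": legend}],
    }


dashboard = {
    "title": "Node Overview",
    "panels": [timeseries_panel("CPU Usage", cpu_usage_expr())],
}

# json.dumps takes care of escaping the double quotes inside the expression.
print(json.dumps(dashboard, indent=2))
```

Generating the JSON this way avoids hand-escaping quotes inside PromQL label matchers, which is a common source of invalid dashboard files.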
OpenStack Overview dashboard
{
"title": "OpenStack Overview",
"panels": [
{
"title": "Total Instances",
"type": "stat",
"targets": [
{"expr": "openstack_nova_total_vms"}
]
},
{
"title": "Running Instances",
"type": "stat",
"targets": [
{"expr": "openstack_nova_running_vms"}
]
},
{
"title": "vCPU Usage",
"type": "gauge",
"targets": [
{"expr": "openstack_nova_vcpus_used / openstack_nova_vcpus_total * 100"}
]
},
{
"title": "Memory Usage",
"type": "gauge",
"targets": [
{"expr": "openstack_nova_memory_used_bytes / openstack_nova_memory_total_bytes * 100"}
]
},
{
"title": "Volumes",
"type": "stat",
"targets": [
{"expr": "openstack_cinder_volumes"}
]
},
{
"title": "Networks",
"type": "stat",
"targets": [
{"expr": "openstack_neutron_networks"}
]
}
]
}
Galera Cluster dashboard
{
"title": "Galera Cluster",
"panels": [
{
"title": "Cluster Size",
"type": "stat",
"targets": [
{"expr": "mysql_global_status_wsrep_cluster_size"}
],
"thresholds": {
"steps": [
{"color": "red", "value": 0},
{"color": "yellow", "value": 2},
{"color": "green", "value": 3}
]
}
},
{
"title": "Node Status",
"type": "table",
"targets": [
{"expr": "mysql_global_status_wsrep_local_state_comment", "format": "table"}
]
},
{
"title": "Queries per Second",
"type": "timeseries",
"targets": [
{"expr": "rate(mysql_global_status_queries[5m])", "legendFormat": "{{instance}}"}
]
},
{
"title": "Connections",
"type": "timeseries",
"targets": [
{"expr": "mysql_global_status_threads_connected", "legendFormat": "{{instance}}"}
]
}
]
}
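The threshold steps on the Cluster Size panel are easier to reason about when evaluated by hand. The sketch below approximates how a step list resolves to a color: the first step acts as the base, and each later step takes over once the value reaches its lower bound. It assumes the steps are sorted in ascending order; `threshold_color` is a hypothetical helper, not Grafana code.

```python
from typing import List, Optional

GALERA_STEPS = [
    {"color": "red", "value": 0},
    {"color": "yellow", "value": 2},
    {"color": "green", "value": 3},
]

MEMORY_STEPS = [
    {"color": "green", "value": None},
    {"color": "yellow", "value": 70},
    {"color": "red", "value": 90},
]


def threshold_color(steps: List[dict], value: float) -> Optional[str]:
    """Return the color of the last step whose lower bound is <= value."""
    color = None
    for index, step in enumerate(steps):
        bound = step["value"]
        # The first step is the base and always applies; a null bound
        # means "no lower limit".
        if index == 0 or bound is None or value >= bound:
            color = step["color"]
    return color


for size in (1, 2, 3):
    print(size, threshold_color(GALERA_STEPS, size))
```

With these steps, a three-node cluster that loses one member turns yellow, and a single remaining node turns red.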
Ceph dashboard
{
"title": "Ceph Cluster",
"panels": [
{
"title": "Health Status",
"type": "stat",
"targets": [
{"expr": "ceph_health_status"}
],
"mappings": [
{"type": "value", "options": {"0": {"text": "HEALTH_OK", "color": "green"}}},
{"type": "value", "options": {"1": {"text": "HEALTH_WARN", "color": "yellow"}}},
{"type": "value", "options": {"2": {"text": "HEALTH_ERR", "color": "red"}}}
]
},
{
"title": "OSDs Up/In",
"type": "stat",
"targets": [
{"expr": "count(ceph_osd_up == 1)"},
{"expr": "count(ceph_osd_in == 1)"}
]
},
{
"title": "Cluster Capacity",
"type": "gauge",
"targets": [
{"expr": "ceph_cluster_total_used_bytes / ceph_cluster_total_bytes * 100"}
]
},
{
"title": "IOPS",
"type": "timeseries",
"targets": [
{"expr": "sum(rate(ceph_osd_op_r[5m]))", "legendFormat": "Read"},
{"expr": "sum(rate(ceph_osd_op_w[5m]))", "legendFormat": "Write"}
]
}
]
}
Importing dashboards
# Popular dashboards for OpenStack on Grafana.com
# Node Exporter Full
# ID: 1860
# Docker and system monitoring
# ID: 893
# HAProxy
# ID: 367
# RabbitMQ
# ID: 10991
# MySQL/MariaDB
# ID: 7362
# Ceph Cluster
# ID: 2842
# Import via API
curl -X POST \
-H "Content-Type: application/json" \
-d '{
"dashboard": {"id": null, "title": "My Dashboard"},
"folderId": 0,
"overwrite": true
}' \
http://admin:password@10.0.0.10:3000/api/dashboards/import
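The same request can be prepared from Python with only the standard library. This is a sketch under the assumptions of the curl call above (same endpoint, same payload shape, placeholder credentials); it sends the credentials in a Basic `Authorization` header instead of embedding them in the URL, and it only builds the request without sending it. `build_import_request` is a hypothetical helper name.

```python
import base64
import json
import urllib.request


def build_import_request(base_url: str, user: str, password: str,
                         dashboard: dict, folder_id: int = 0,
                         overwrite: bool = True) -> urllib.request.Request:
    """Prepare (but do not send) the dashboard import POST request."""
    payload = {"dashboard": dashboard, "folderId": folder_id, "overwrite": overwrite}
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/dashboards/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


request = build_import_request(
    "http://10.0.0.10:3000", "admin", "password",
    {"id": None, "title": "My Dashboard"},
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen(request)` would perform the call against a reachable Grafana instance.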
Automatic provisioning
# /etc/kolla/config/grafana/provisioning/dashboards/default.yml
apiVersion: 1
providers:
  - name: 'default'
    orgId: 1
    folder: ''
    folderUid: ''
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /var/lib/grafana/dashboards
# Place the dashboard JSON files in:
# /etc/kolla/config/grafana/dashboards/
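Dashboards kept as Python dicts or generated by a script still need to land in that directory as individual JSON files. A minimal sketch of that step follows, assuming one file per dashboard named after its title; `slugify`, `write_dashboard`, and `demo` are hypothetical helpers, and the demo writes to a temporary directory rather than to `/etc/kolla/config/grafana/dashboards/`.

```python
import json
import tempfile
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a dashboard title into a lowercase, dash-separated file stem."""
    cleaned = "".join(c if c.isalnum() else "-" for c in title.lower())
    return "-".join(part for part in cleaned.split("-") if part)


def write_dashboard(dashboard: dict, directory: str) -> Path:
    """Serialize one dashboard to <directory>/<slug>.json and return the path."""
    if not dashboard.get("title"):
        raise ValueError("dashboard needs a non-empty title")
    target = Path(directory) / (slugify(dashboard["title"]) + ".json")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(dashboard, indent=2) + "\n", encoding="utf-8")
    return target


def demo() -> str:
    """Write a sample dashboard into a throwaway directory; return the file name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = write_dashboard({"title": "Node Overview", "panels": []}, tmp)
        # Reading it back confirms the file holds valid JSON.
        json.loads(path.read_text(encoding="utf-8"))
        return path.name


print(demo())
```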
Dashboard variables
{
"templating": {
"list": [
{
"name": "host",
"type": "query",
"query": "label_values(node_uname_info, instance)",
"refresh": 1
},
{
"name": "job",
"type": "query",
"query": "label_values(up, job)",
"refresh": 1
},
{
"name": "interval",
"type": "interval",
"options": [
{"text": "1m", "value": "1m"},
{"text": "5m", "value": "5m"},
{"text": "15m", "value": "15m"},
{"text": "1h", "value": "1h"}
]
}
]
}
}
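Once defined, these variables are referenced inside panel queries as `$host`, `$job`, or `$interval`, and Grafana substitutes the selected values before sending the query to Prometheus. The sketch below imitates that substitution for the simple single-value case, to show what the final PromQL looks like; it handles only the `$name` and `${name}` forms, ignores multi-value formatting, and `interpolate` is a hypothetical helper rather than Grafana's own implementation.

```python
import re
from typing import Dict

# Matches ${name} or $name; unknown names are left untouched.
_VARIABLE = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def interpolate(expr: str, values: Dict[str, str]) -> str:
    """Replace dashboard variable references with their selected values."""
    def replace(match):
        name = match.group(1) or match.group(2)
        return values.get(name, match.group(0))
    return _VARIABLE.sub(replace, expr)


template = 'rate(node_cpu_seconds_total{instance="$host",mode="idle"}[$interval])'
selected = {"host": "10.0.0.11:9100", "interval": "5m"}
print(interpolate(template, selected))
```

The instance address `10.0.0.11:9100` is an illustrative value; in a real dashboard it comes from the `label_values(node_uname_info, instance)` query.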
Practical examples
Summary dashboard
{
"title": "OpenStack Executive Summary",
"rows": [
{
"title": "Infrastructure Health",
"panels": [
{"title": "API Availability", "expr": "avg(probe_success{job=\"blackbox-http\"}) * 100"},
{"title": "Services Up", "expr": "count(up == 1)"},
{"title": "Active Alerts", "expr": "count(ALERTS{alertstate=\"firing\"}) or vector(0)"}
]
},
{
"title": "Resource Utilization",
"panels": [
{"title": "Compute Used", "expr": "sum(openstack_nova_vcpus_used) / sum(openstack_nova_vcpus_total) * 100"},
{"title": "Memory Used", "expr": "sum(openstack_nova_memory_used_bytes) / sum(openstack_nova_memory_total_bytes) * 100"},
{"title": "Storage Used", "expr": "ceph_cluster_total_used_bytes / ceph_cluster_total_bytes * 100"}
]
}
]
}
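The row structure above is a condensed outline rather than something Grafana loads as-is. In the current dashboard JSON model, rows are themselves panels of type `row`, and every panel is positioned on a 24-column grid through a `gridPos` object. The following sketch expands the outline into that flat layout; the choice of `stat` panels, the panel height, and the `layout_rows` name are assumptions made for illustration.

```python
from typing import List


def layout_rows(rows: List[dict], panel_height: int = 6,
                total_width: int = 24) -> List[dict]:
    """Expand {title, panels:[{title, expr}]} rows into positioned panels."""
    panels = []
    y = 0
    next_id = 1
    for row in rows:
        # A row header spans the full width and is one grid unit tall.
        panels.append({
            "id": next_id, "type": "row", "title": row["title"],
            "collapsed": False,
            "gridPos": {"h": 1, "w": total_width, "x": 0, "y": y},
        })
        next_id += 1
        y += 1
        # Share the row width equally between its panels.
        width = total_width // max(len(row["panels"]), 1)
        for column, item in enumerate(row["panels"]):
            panels.append({
                "id": next_id, "type": "stat", "title": item["title"],
                "targets": [{"expr": item["expr"]}],
                "gridPos": {"h": panel_height, "w": width,
                            "x": column * width, "y": y},
            })
            next_id += 1
        y += panel_height
    return panels


outline = [
    {"title": "Infrastructure Health", "panels": [
        {"title": "Services Up", "expr": "count(up == 1)"},
    ]},
]
print(layout_rows(outline))
```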
Annotations for incidents
{
"annotations": {
"list": [
{
"name": "Deployments",
"datasource": "Prometheus",
"expr": "changes(kolla_version_info[5m]) > 0",
"tagKeys": "version",
"titleFormat": "Deployment"
},
{
"name": "Alerts",
"datasource": "Prometheus",
"expr": "ALERTS{alertstate=\"firing\"}",
"tagKeys": "alertname",
"titleFormat": "Alert: {{alertname}}"
}
]
}
}
Resources
Checkpoint
- Grafana reachable with the admin credentials
- Prometheus data source configured
- Node Overview dashboard working
- OpenStack Overview dashboard
- Galera Cluster dashboard
- Ceph dashboard
- Template variables in use
- Dashboards provisioned automatically