Files
EZ-Homelab/docs/service-docs/alertmanager.md
kelinfoxy cba45556c7 Complete documentation refactoring with 57 service docs
- Refactored README.md, getting-started.md, quick-reference.md
- Enhanced setup-homelab.sh with 9-step automated process
- Created services-overview.md with all stacks
- Added comprehensive documentation for 57 services in docs/service-docs/
- All services include: overview, configuration, resources, educational content
- Coverage: Core, Infrastructure, Dashboards, Media, Media-Extended, Home Assistant, Productivity, Utilities, Monitoring, Development stacks
- Educational focus with links to tutorials, videos, and guides
2026-01-12 18:03:12 -05:00

210 lines
5.6 KiB
Markdown

# Alertmanager - Alert Routing
## Table of Contents
- [Overview](#overview)
- [What is Alertmanager?](#what-is-alertmanager)
- [Why Use Alertmanager?](#why-use-alertmanager)
- [Configuration in AI-Homelab](#configuration-in-ai-homelab)
- [Official Resources](#official-resources)
- [Docker Configuration](#docker-configuration)
## Overview
**Category:** Alert Management
**Docker Image:** [prom/alertmanager](https://hub.docker.com/r/prom/alertmanager)
**Default Stack:** `monitoring.yml`
**Web UI:** `http://SERVER_IP:9093`
**Purpose:** Handle Prometheus alerts
**Ports:** 9093
## What is Alertmanager?
Alertmanager handles alerts from Prometheus. It deduplicates, groups, and routes alerts to notification channels (email, Slack, PagerDuty, etc.). It also manages silencing and inhibition of alerts. The alerting component of the Prometheus ecosystem.
### Key Features
- **Alert Routing:** Send to right channels
- **Grouping:** Combine similar alerts
- **Deduplication:** No duplicate alerts
- **Silencing:** Mute alerts temporarily
- **Inhibition:** Suppress dependent alerts
- **Notifications:** Email, Slack, webhooks, etc.
- **Web UI:** Manage alerts visually
- **Free & Open Source:** Prometheus project
## Why Use Alertmanager?
1. **Prometheus Native:** Designed for Prometheus
2. **Smart Routing:** Alerts go where needed
3. **Deduplication:** No spam
4. **Grouping:** Related alerts together
5. **Silencing:** Maintenance mode
6. **Multi-Channel:** Email, Slack, etc.
## Configuration in AI-Homelab
```
/opt/stacks/monitoring/alertmanager/
alertmanager.yml # Configuration
data/ # Alert state
```
### alertmanager.yml
```yaml
global:
resolve_timeout: 5m
route:
group_by: ['alertname']
group_wait: 10s
group_interval: 10s
repeat_interval: 1h
receiver: 'discord'
receivers:
- name: 'discord'
webhook_configs:
- url: 'YOUR_DISCORD_WEBHOOK_URL'
send_resolved: true
- name: 'email'
email_configs:
- to: 'alerts@yourdomain.com'
from: 'alertmanager@yourdomain.com'
smarthost: 'smtp.gmail.com:587'
auth_username: 'your@gmail.com'
auth_password: 'app_password'
inhibit_rules:
- source_match:
severity: 'critical'
target_match:
severity: 'warning'
equal: ['alertname', 'instance']
```
## Official Resources
- **Website:** https://prometheus.io/docs/alerting/latest/alertmanager
- **Configuration:** https://prometheus.io/docs/alerting/latest/configuration
## Docker Configuration
```yaml
alertmanager:
image: prom/alertmanager:latest
container_name: alertmanager
restart: unless-stopped
networks:
- traefik-network
ports:
- "9093:9093"
command:
- '--config.file=/etc/alertmanager/alertmanager.yml'
- '--storage.path=/alertmanager'
volumes:
- /opt/stacks/monitoring/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml
- /opt/stacks/monitoring/alertmanager/data:/alertmanager
```
## Setup
1. **Configure Prometheus:**
Add to prometheus.yml:
```yaml
alerting:
alertmanagers:
- static_configs:
- targets: ['alertmanager:9093']
rule_files:
- '/etc/prometheus/rules/*.yml'
```
2. **Create Alert Rules:**
`/opt/stacks/monitoring/prometheus/rules/alerts.yml`:
```yaml
groups:
- name: example
rules:
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Instance {{ $labels.instance }} down"
description: "{{ $labels.instance }} has been down for more than 5 minutes."
- alert: HighCPU
expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
for: 5m
labels:
severity: warning
annotations:
summary: "High CPU usage on {{ $labels.instance }}"
description: "CPU usage is above 80% for more than 5 minutes."
- alert: HighMemory
expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage on {{ $labels.instance }}"
description: "Memory usage is above 90%."
- alert: DiskSpaceLow
expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 < 10
for: 5m
labels:
severity: critical
annotations:
summary: "Low disk space on {{ $labels.instance }}"
description: "Disk space is below 10%."
```
3. **Restart Prometheus:**
```bash
docker restart prometheus
```
4. **Access Alertmanager UI:** `http://SERVER_IP:9093`
## Summary
Alertmanager routes alerts from Prometheus offering:
- Alert deduplication
- Grouping and routing
- Multiple notification channels
- Silencing and inhibition
- Web UI management
- Free and open-source
**Perfect for:**
- Prometheus alert handling
- Multi-channel notifications
- Alert management
- Maintenance silencing
- Alert grouping
**Key Points:**
- Receives alerts from Prometheus
- Routes to notification channels
- Deduplicates and groups
- Supports silencing
- Web UI for management
- Configure in alertmanager.yml
- Define rules in Prometheus
**Remember:**
- Configure receivers (Discord, Email, etc.)
- Create alert rules in Prometheus
- Test alerts work
- Use silencing for maintenance
- Group related alerts
- Set appropriate thresholds
- Monitor alertmanager itself
Alertmanager manages your alerts intelligently!