Self-Hosting Grafana and Prometheus: Build Your Own Monitoring Stack

Monitoring 2026-02-08 grafana prometheus monitoring observability alerting

Datadog costs $15/host/month. New Relic has a free tier until you actually need it. Cloud monitoring gets expensive fast — especially when all you want is to know when something breaks and see some nice graphs.

Grafana and Prometheus together form one of the most widely used open source monitoring stacks. Prometheus collects and stores metrics. Grafana visualizes them. Both are free, self-hosted, and used in production by organizations from startups to the largest tech companies.

The Stack at a Glance

The monitoring stack has a few components, each with a clear role:

Component	Role	Analogy
Prometheus	Metrics collection and storage	The database
Grafana	Visualization and dashboards	The UI
Alertmanager	Alert routing and notifications	The pager
Node Exporter	System metrics (CPU, RAM, disk)	The sensor
Exporters	App-specific metrics	More sensors

How it works: Prometheus scrapes metrics from your servers and applications on a schedule (typically every 15-30 seconds) and evaluates alerting rules against them. Grafana queries Prometheus and renders dashboards. When a rule's condition holds for long enough, Prometheus hands the alert to Alertmanager, which groups it and sends the notification.

Self-Hosted vs. Paid Monitoring

Feature	Datadog / New Relic	Grafana + Prometheus
Cost per host	$15-23/month	Free
Data retention	Varies by plan	You control it
Setup time	Minutes	1-2 hours
Maintenance	Zero	Some (updates, storage)
Custom dashboards	Yes	Yes (more flexible)
APM / Tracing	Built-in	Separate tools needed
Log management	Built-in	Add Loki
Hosted option	Yes (it's the product)	Grafana Cloud free tier

When paid monitoring is the better choice

You have no one to maintain infrastructure — SaaS monitoring is fire-and-forget.
You need APM (Application Performance Monitoring) — Distributed tracing across microservices is complex to self-host.
Compliance requirements — Some regulations require specific audit trails that SaaS providers handle.
You're a large team — At scale, the time cost of maintaining monitoring infrastructure may exceed SaaS fees.

When self-hosting wins

You're cost-sensitive — Monitoring 5-10 servers with Datadog costs $75-150/month. Self-hosted costs $0.
You want full control — Your data stays on your infrastructure.
You need long retention — Store years of metrics without per-GB fees.
You're already running servers — Adding Prometheus to existing infrastructure is low marginal effort.

Setting It Up

Docker Compose deployment

services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=90d'
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: changeme
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    pid: host
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:

Prometheus configuration

Create prometheus.yml:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

Start the stack:

docker compose up -d

Prometheus is now available at http://your-server:9090, Grafana at http://your-server:3000.

Connecting Grafana to Prometheus

Open Grafana (http://your-server:3000, log in with admin/changeme)
Go to Connections → Data sources → Add new data source (Configuration → Data Sources in Grafana 9 and older)
Select Prometheus
Set URL to http://prometheus:9090
Click Save & Test
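
The same data source can also be defined in a file, so it survives a rebuilt container. A minimal sketch, assuming you save it as ./grafana/datasources/prometheus.yml (a hypothetical path) and mount that directory at /etc/grafana/provisioning/datasources in the grafana service:

```yaml
# Grafana data source provisioning file, read when Grafana starts
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend makes the requests
    url: http://prometheus:9090   # Compose service name from the deployment above
    isDefault: true
```

The matching volume entry would be ./grafana/datasources:/etc/grafana/provisioning/datasources. After a restart the data source should already be listed, with no clicking required.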

Your first dashboard

Don't build dashboards from scratch. Import a community dashboard:

In Grafana, go to Dashboards → Import
Enter dashboard ID 1860 (Node Exporter Full)
Select your Prometheus data source
Click Import

You'll immediately see CPU usage, memory, disk I/O, network traffic, and dozens of other system metrics with professional-looking graphs.

Setting Up Alerts

Monitoring without alerting is just logging with extra steps. Here's how to get notified when things go wrong.

Prometheus alerting rules

Create alert-rules.yml:

groups:
  - name: system
    rules:
      - alert: HighCPU
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay|squashfs"} / node_filesystem_size_bytes) * 100 < 15
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space below 15% on {{ $labels.instance }}"

      - alert: HighMemory
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 90% on {{ $labels.instance }}"
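
Prometheus only evaluates rules from files it has been told to load. A minimal sketch of that wiring, assuming alert-rules.yml sits next to prometheus.yml on the host and is mounted into the container at /etc/prometheus/alert-rules.yml (both the location and the mount are assumptions, not part of the Compose file above):

```yaml
# prometheus.yml: add at the top level, alongside global and scrape_configs
rule_files:
  - /etc/prometheus/alert-rules.yml   # path inside the container (assumed mount)

# docker-compose.yml: extra entry under the prometheus service's volumes
#   - ./alert-rules.yml:/etc/prometheus/alert-rules.yml
```

After restarting Prometheus, the three rules should be listed at http://your-server:9090/alerts. The promtool binary that ships with Prometheus can validate the file beforehand with promtool check rules alert-rules.yml.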

Alertmanager for notifications

Alertmanager routes alerts to your preferred notification channel: email, Slack, PagerDuty, Telegram, or webhooks.

# alertmanager.yml
route:
  receiver: 'email'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: 'email'
    email_configs:
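      - to: 'you@yourdomain.com'                      # placeholder: where alerts are delivered
        from: 'alertmanager@yourdomain.com'           # placeholder: sender address
        smarthost: 'smtp.yourdomain.com:587'
        auth_username: 'alertmanager@yourdomain.com'  # placeholder: most relays on port 587 require auth
        auth_password: 'changeme'                     # placeholder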
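
The Compose file from the deployment step does not include Alertmanager, and Prometheus has to be told where to send firing alerts. One way to wire both, assuming a service named alertmanager on its default port 9093 (the service name is an assumption):

```yaml
# docker-compose.yml: extra service, added under the existing services: key
services:
  alertmanager:
    image: prom/alertmanager:latest
    ports:
      - "9093:9093"
    volumes:
      # The image's default command reads its config from this path
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    restart: unless-stopped
```

```yaml
# prometheus.yml: top-level block telling Prometheus where Alertmanager lives
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```

Once both containers are restarted, alerts that fire in Prometheus should show up in the Alertmanager UI at http://your-server:9093 before being emailed.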

Monitoring Beyond System Metrics

Prometheus can monitor almost anything through exporters:

Databases: postgres_exporter, mysqld_exporter, redis_exporter
Web servers: nginx-prometheus-exporter, apache_exporter
Containers: cAdvisor (runs as its own container; it is not part of Docker itself)
Applications: Many modern apps expose Prometheus metrics natively
Network: SNMP exporter, blackbox_exporter (HTTP/TCP/ICMP probes)
Hardware: IPMI exporter for server hardware health

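Each exporter serves its metrics over HTTP, usually at /metrics. Once you add it as a target in scrape_configs, Prometheus scrapes it on the same schedule as everything else.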
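
Adding an exporter usually means running it and appending one job to prometheus.yml. A sketch for PostgreSQL, assuming postgres_exporter runs as a Compose service named postgres-exporter (a hypothetical name) on its default port 9187:

```yaml
# prometheus.yml: append under the existing scrape_configs: list
scrape_configs:
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']   # hypothetical service name, default port
```

The new target should appear as UP on the Status → Targets page after Prometheus reloads its configuration.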

Storage and Retention

Prometheus stores data efficiently using its custom time-series database (TSDB):

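1,000 active time series at 15s intervals works out to roughly 0.2-0.4 GB per month at the 1-2 bytes per sample cited in the Prometheus docs (1,000 ÷ 15 ≈ 67 samples per second, over about 2.6 million seconds); budgeting 1-2 GB leaves headroom for the write-ahead log and for series growth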
Default retention: 15 days
Recommended: 90 days for most setups (--storage.tsdb.retention.time=90d)

For longer retention, consider Thanos or VictoriaMetrics as a long-term storage backend.
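
The two options plug in differently. Thanos more commonly runs as a sidecar that uploads Prometheus blocks to object storage, while VictoriaMetrics is usually fed through remote_write, where Prometheus forwards every sample it scrapes. A sketch of the latter, assuming a single-node VictoriaMetrics instance reachable as victoriametrics on its default port 8428 (the hostname is an assumption):

```yaml
# prometheus.yml: top-level block
remote_write:
  - url: http://victoriametrics:8428/api/v1/write
```

With this in place, local retention can stay short, and Grafana can query VictoriaMetrics through a Prometheus-type data source.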

The Honest Trade-offs

Grafana + Prometheus is great if:

You're monitoring servers, containers, or applications you control
You want beautiful dashboards without per-host fees
You need flexible alerting with custom thresholds
You want to learn the industry-standard monitoring stack

It's not ideal if:

You need zero-maintenance monitoring (SaaS is truly hands-off)
You need distributed tracing across microservices (add Jaeger/Tempo separately)
You only have one server and don't want to run additional services on it

Bottom line: Grafana + Prometheus is one of the most widely deployed monitoring stacks in production. The setup takes an hour or two, and you get capable monitoring with no license fees. If you're paying more than $20/month for cloud monitoring and have time for occasional maintenance, self-hosting can pay for itself quickly.

Resources

Prometheus documentation
Grafana documentation
Awesome Prometheus alerts — pre-built alert rules
Grafana dashboard library — thousands of community dashboards