Observability Stack

A complete, production-ready observability stack built on Prometheus, Grafana, Node Exporter, and Alertmanager. Deploy it in about 5 minutes with a single command.

Features

Real-time metrics – CPU, memory, disk, and network monitoring
Beautiful dashboards – Pre-configured Grafana dashboard (ID: 1860)
Smart alerts – Get notified when CPU usage exceeds 80%, memory usage exceeds 90%, or available disk space falls below 10%
Easy deployment – Single command with Docker Compose
Makefile automation – Simplified management commands

Requirements

Docker Engine 20.10+
Docker Compose 2.0+
Linux server (or WSL2 on Windows)
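
The version requirements above can be checked before deploying. The following is a minimal sketch, assuming GNU `sort -V` is available; the helper names `version_ge` and `report` are hypothetical and not part of this repository.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils version sort): the smaller of the
# two versions sorts first, so $1 >= $2 exactly when $2 comes out on top.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints one status line for a tool: name, installed version, minimum version.
# A leading "v" (as in "v2.24.5") is stripped before comparing.
report() {
  local have="${2#v}"
  if version_ge "$have" "$3"; then
    echo "$1 $have: OK (>= $3)"
  else
    echo "$1 $have: TOO OLD (need >= $3)"
  fi
}

# Usage (requires Docker to be installed and the daemon to be running):
#   report "Docker Engine"  "$(docker version --format '{{.Server.Version}}')" 20.10
#   report "Docker Compose" "$(docker compose version --short)" 2.0
```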

Quick Start

git clone https://github.com/irvaniamirali/observability-stack.git
cd observability-stack
make up

Screenshots

Node Exporter dashboard showing CPU, memory, and disk metrics

Access Services

Service	URL	Credentials
Prometheus	http://localhost:9090	-
Grafana	http://localhost:3000	admin / admin
Alertmanager	http://localhost:9093	-
Node Exporter	http://localhost:9100/metrics	-
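
To confirm that all four services came up, each URL in the table can be probed from the host. This is a sketch rather than part of the repository: the function name `check_services` is hypothetical, and it assumes `curl` is installed. It uses the readiness endpoints that Prometheus and Alertmanager expose (`/-/ready`) and Grafana's `/api/health`.

```shell
# Probes each service once. Prints one line per service and returns
# non-zero if any probe fails. Hosts and ports come from the table above.
check_services() {
  local failed=0 url name
  while read -r url name; do
    # --fail turns HTTP errors (4xx/5xx) into a non-zero exit status.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "$name: up"
    else
      echo "$name: DOWN"
      failed=1
    fi
  done <<'EOF'
http://localhost:9090/-/ready Prometheus
http://localhost:3000/api/health Grafana
http://localhost:9093/-/ready Alertmanager
http://localhost:9100/metrics Node Exporter
EOF
  return "$failed"
}
```

Calling `check_services` a few seconds after `make up` should report every service as up; a DOWN line points at the container to inspect with `make logs`.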

Available Commands

Command	Description
make up	Start all services
make down	Stop all services
make restart	Restart all services
make logs	View all logs
make status	Check service status
make clean	Stop all services and remove all data
make backup	Back up Grafana dashboards

Alerts Configured

Alert	Condition	Severity
High CPU Usage	>80% for 2 minutes	Warning
High Memory Usage	>90% for 2 minutes	Critical
Low Disk Space	<10% available	Warning
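
The conditions in this table can be evaluated ad hoc against a running Prometheus. The sketch below is illustrative only: the PromQL expressions are assumptions based on standard Node Exporter metrics, not copies of the rule files under prometheus/, and `alert_expr`, `eval_alert`, and the `PROMETHEUS_URL` variable are hypothetical names.

```shell
# Prints a PromQL expression for one alert condition.
# ASSUMPTION: these are typical Node Exporter expressions, not the
# repository's actual rules.
alert_expr() {
  case "$1" in
    cpu)    printf '%s\n' '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80' ;;
    memory) printf '%s\n' '(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90' ;;
    disk)   printf '%s\n' 'node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100 < 10' ;;
    *)      echo "unknown alert: $1" >&2; return 1 ;;
  esac
}

# Runs the expression as an instant query via the Prometheus HTTP API
# (/api/v1/query). An empty "result" array means the condition is not met.
eval_alert() {
  local query
  query="$(alert_expr "$1")" || return 1
  curl --silent --get --data-urlencode "query=$query" \
    "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query"
}
```

For example, `eval_alert disk` returns the raw JSON response. Note that an instant query ignores the 2-minute hold period, so it shows whether the threshold is crossed right now, not whether the alert would fire.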

Setup Grafana Data Source

After running make up:

Open http://localhost:3000 and log in with admin / admin
Go to Connections -> Data sources -> Add data source
Select Prometheus
Set URL to: http://prometheus:9090
Click Save & test
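
The same data source can also be registered without the UI, through Grafana's HTTP API (`POST /api/datasources`). This is a minimal sketch assuming the default admin / admin credentials are still in place and `curl` is installed; `datasource_payload`, `add_datasource`, and the `GRAFANA_URL` variable are hypothetical names.

```shell
# Builds the JSON payload for the data source.
# "access": "proxy" makes Grafana's backend fetch the data, so the URL must
# resolve from inside the Grafana container (hence the prometheus hostname).
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' \
    "${1:-Prometheus}" "${2:-http://prometheus:9090}"
}

# Sends the payload to Grafana using HTTP basic auth.
# Optional arguments: data source name, Prometheus URL.
add_datasource() {
  datasource_payload "$@" | curl --silent --fail \
    --user admin:admin \
    --header 'Content-Type: application/json' \
    --data @- \
    "${GRAFANA_URL:-http://localhost:3000}/api/datasources"
}
```

Running `add_datasource` once after `make up` should have the same effect as the steps above. Grafana rejects a second data source with the same name, so a repeated call is expected to fail.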

Import Dashboard

In Grafana, go to Dashboards -> Create dashboard -> Import dashboard
Enter dashboard ID: 1860
Click Load
Select your Prometheus data source
Click Import

Test Alerts

Simulate high CPU usage to test alerts:

Install stress tool (if not installed)

# Debian/Ubuntu
sudo apt update && sudo apt install stress -y

# CentOS/RHEL
sudo yum install epel-release -y && sudo yum install stress -y

Run CPU stress test

Run stress test on all CPU cores for 3 minutes

stress --cpu $(nproc) --timeout 180

Check alerts

Open Prometheus Alerts page: http://localhost:9090/alerts
After approximately 2 minutes, the HighCPUUsage alert changes from PENDING to FIRING
Check Alertmanager: http://localhost:9093
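
Instead of refreshing the Alerts page, the transition to FIRING can be polled from a terminal. The sketch below assumes `curl` is installed and relies on the `ALERTS` series that Prometheus generates for pending and firing alerts; `alert_firing`, `wait_for_alert`, and the `PROMETHEUS_URL` variable are hypothetical names.

```shell
# Succeeds if the named alert is currently firing. The query only matches
# series labelled alertstate="firing", so any match in the response is a hit.
alert_firing() {
  curl --silent --get \
    --data-urlencode "query=ALERTS{alertname=\"$1\",alertstate=\"firing\"}" \
    "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query" |
    grep -q '"alertstate":"firing"'
}

# Polls until the alert fires or the timeout expires.
# Arguments: alert name, timeout in seconds (default 300),
# polling interval in seconds (default 15).
wait_for_alert() {
  local name="$1" timeout="${2:-300}" interval="${3:-15}" waited=0
  while [ "$waited" -le "$timeout" ]; do
    if alert_firing "$name"; then
      echo "$name is FIRING after ~${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$name did not fire within ${timeout}s" >&2
  return 1
}
```

With the stress test running in another terminal, `wait_for_alert HighCPUUsage` should return once the 2-minute hold period has elapsed.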

Stop stress test (if needed before timeout)

sudo pkill stress

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

MIT License - see LICENSE file for details.
