Alert Rules
Metric-based alert definitions. Each rule has a metric, condition, threshold, and evaluation window.
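As a rough illustration of the rule shape described above, the following Python sketch models a rule and parses a condition string in the `> 90 (10m)` form used in the table. The `AlertRule` class and the `parse_condition` and `breaches` helpers are hypothetical names, not part of the product. The firing semantics (every sample in the window must violate the threshold) are an assumption based on descriptions such as "exceeds 90% for 10 minutes".

```python
import operator
import re
from dataclasses import dataclass

# Hypothetical model of one rule: metric, comparison, threshold, window.
@dataclass(frozen=True)
class AlertRule:
    ref: str
    metric: str
    comparison: str       # one of ">", ">=", "<", "<="
    threshold: float
    window_minutes: int

# Matches condition strings such as "> 90 (10m)" or "> 2.5 (10m)".
_CONDITION = re.compile(r"^\s*(>=|<=|>|<)\s*(\d+(?:\.\d+)?)\s*\((\d+)m\)\s*$")

_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

def parse_condition(ref: str, metric: str, condition: str) -> AlertRule:
    """Build an AlertRule from a table-style condition string."""
    match = _CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unrecognized condition: {condition!r}")
    comparison, threshold, minutes = match.groups()
    return AlertRule(ref, metric, comparison, float(threshold), int(minutes))

def breaches(rule: AlertRule, samples: list[float]) -> bool:
    """Assumed semantics: fire only if every sample in the window violates."""
    compare = _COMPARATORS[rule.comparison]
    return bool(samples) and all(compare(s, rule.threshold) for s in samples)

rule = parse_condition("ALERT-009", "cpu_usage", "> 90 (10m)")
print(rule)
print(breaches(rule, [91.0, 95.5, 99.0]))  # all samples above threshold
print(breaches(rule, [91.0, 89.0]))        # one sample below threshold
```

Keeping the window on the rule itself means the evaluator only needs the samples that fall inside that window; how samples are collected is outside the scope of this sketch.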
Total Rules (configured)
Enabled (1 disabled)
Avg Noise Score (across all rules)
Triggered (7d, total firings)
Alert Rules
Toggle to enable/disable. Click edit to modify threshold or condition.
| Ref | Name | Description | Service | Metric | Condition | Severity | Last Triggered | 7d Count | Noise Score |
|---|---|---|---|---|---|---|---|---|---|
| ALERT-009 | Host CPU > 90% | Fires when any host CPU exceeds 90% for 10 minutes. | api-gateway | cpu_usage | > 90 (10m) | Low | Never | 12 | 87 |
| ALERT-010 | Disk usage > 85% | Fires when any disk usage exceeds 85%. | api-gateway | disk_usage | > 85 (5m) | Low | Never | 4 | 21 |
| ALERT-001 | Checkout 5xx > 1% | Fires when checkout-api 5xx rate exceeds 1% for 5 minutes. | checkout-api | http_5xx_rate | > 1 (5m) | High | Jul 2, 05:12 AM | 3 | 12 |
| ALERT-007 | API Gateway p99 > 200ms | Fires when API gateway p99 latency exceeds 200ms. | api-gateway | p99_latency | > 200 (5m) | Medium | Never | 2 | 8 |
| ALERT-002 | Invoice queue depth > 10k | Fires when invoice worker queue depth exceeds 10k. | billing-service | queue_depth | > 10000 (5m) | Medium | Jul 1, 10:42 PM | 1 | 4 |
| ALERT-003 | Auth 401 rate > 3% | Fires when auth-service 401 rate exceeds 3%. | auth-service | http_401_rate | > 3 (5m) | Low | Jul 1, 02:02 PM | 1 | 2 |
| ALERT-004 | Web LCP > 2.5s | Fires when web-app LCP p95 exceeds 2.5s. | web-app | lcp_p95 | > 2.5 (10m) | Medium | Jul 1, 09:30 AM | 1 | 3 |
| ALERT-005 | PostgreSQL replication lag > 5s | Fires when PG replication lag exceeds 5 seconds. | postgres-primary | repl_lag | > 5 (2m) | High | Never | 0 | 0 |
| ALERT-006 | Kafka consumer lag > 50k | Fires when any consumer group lag exceeds 50k. | kafka-bus | consumer_lag | > 50000 (5m) | Medium | Never | 0 | 0 |
| ALERT-008 | Redis memory > 85% | Fires when Redis memory usage exceeds 85%. | redis-cluster | memory_usage | > 85 (5m) | Medium | Never | 0 | 0 |
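
The summary figures listed at the top of the page (total rules, average noise score, total firings over 7 days) can be recomputed from the table rows. The sketch below does this in Python, with the 7d Count and Noise Score values copied from the table. The `summarize` function is a hypothetical helper, and treating the average noise score as a plain arithmetic mean is an assumption; the dashboard may weight or round it differently.

```python
from statistics import mean

# (ref, 7d count, noise score), copied from the table above.
RULES = [
    ("ALERT-009", 12, 87),
    ("ALERT-010", 4, 21),
    ("ALERT-001", 3, 12),
    ("ALERT-007", 2, 8),
    ("ALERT-002", 1, 4),
    ("ALERT-003", 1, 2),
    ("ALERT-004", 1, 3),
    ("ALERT-005", 0, 0),
    ("ALERT-006", 0, 0),
    ("ALERT-008", 0, 0),
]

def summarize(rules):
    """Aggregate rule rows into dashboard-style summary figures."""
    return {
        "total_rules": len(rules),
        # Assumption: unweighted mean, rounded to one decimal place.
        "avg_noise_score": round(mean(noise for _, _, noise in rules), 1),
        "triggered_7d": sum(count for _, count, _ in rules),
    }

summary = summarize(RULES)
print(summary)
# {'total_rules': 10, 'avg_noise_score': 13.7, 'triggered_7d': 24}
```

ALERT-009 accounts for half of the 24 firings and has by far the highest noise score, which makes it the most obvious candidate for retuning its threshold or window.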