Alerts
Configure alert rules, notification channels, and escalation policies.
Overview
Alerts notify your team when a configured condition is met, such as a pipeline failure, a data quality issue, an agent disconnect, or a custom metric crossing its threshold.
Alert Rules
Navigate to Operate > Alerts to manage rules.
Creating a Rule
- Click Create Alert Rule
- Define the condition (e.g., “pipeline X has been stopped for > 5 minutes”)
- Set the severity: info, warning, or critical
- Choose notification channels
- Configure a cooldown period
- Save
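The fields collected by these steps can be pictured as a small record. The sketch below is illustrative only: `AlertRule`, its field names, and the default cooldown are assumptions made for this example, not the product's actual schema or API.

```python
from dataclasses import dataclass, field
from datetime import timedelta

# Severity levels as listed in the steps above.
SEVERITIES = ("info", "warning", "critical")


@dataclass
class AlertRule:
    """Hypothetical in-memory model of an alert rule (not a real product type)."""
    name: str
    condition: str                      # free-text condition, e.g. "pipeline X stopped > 5 minutes"
    severity: str
    channels: list = field(default_factory=list)
    cooldown: timedelta = timedelta(minutes=10)  # assumed default, for illustration only

    def __post_init__(self):
        # Reject values the UI would not offer.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not self.channels:
            raise ValueError("a rule needs at least one notification channel")


rule = AlertRule(
    name="pipeline-x-stopped",
    condition="pipeline X has been stopped for > 5 minutes",
    severity="critical",
    channels=["slack-oncall"],
)
```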
Condition Types
| Type | Example |
|---|---|
| Metric threshold | CPU usage > 90% for 5 minutes |
| Pipeline status | Pipeline stopped or errored |
| Agent health | Agent offline for > 2 minutes |
| Data quality | Quality rule failure |
| Custom SQL | Result of a query exceeds threshold |
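A condition such as "CPU usage > 90% for 5 minutes" combines a threshold with a duration. The following is a minimal sketch of one way such a check could be evaluated, assuming samples arrive as timestamped values; the function name `threshold_breached` and the sample format are hypothetical, and the product may evaluate conditions differently.

```python
from datetime import datetime, timedelta


def threshold_breached(samples, threshold, duration):
    """Return True if every sample in the trailing `duration` window exceeds `threshold`.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    window_start = latest - duration
    # Without data covering the whole window we cannot claim the
    # condition held for the full duration.
    if samples[0][0] > window_start:
        return False
    window = [value for ts, value in samples if ts >= window_start]
    return all(value > threshold for value in window)


t0 = datetime(2024, 1, 1, 12, 0)
# One sample per minute for seven minutes, all at 95% CPU.
steady = [(t0 + timedelta(minutes=i), 95.0) for i in range(7)]
# Same series, but with a dip below the threshold inside the window.
dipped = [(ts, 80.0 if i == 4 else v) for i, (ts, v) in enumerate(steady)]

print(threshold_breached(steady, 90.0, timedelta(minutes=5)))
print(threshold_breached(dipped, 90.0, timedelta(minutes=5)))
```

Requiring every sample in the window to exceed the threshold means a single dip resets the condition, which is the stricter of the common interpretations.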
Notification Channels
| Channel | Configuration |
|---|---|
| Slack | Webhook URL, channel name |
| Email | Recipient addresses |
| PagerDuty | Integration key, severity mapping |
| Webhook | URL, headers, payload template |
| Microsoft Teams | Incoming webhook URL |
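For the generic Webhook channel, the URL, headers, and payload template together determine the outgoing request. The sketch below shows one way a template could be rendered and turned into a request, using only the Python standard library. The placeholder syntax (`$rule`, `$severity`, `$fired_at`), the alert field names, and the example URL are assumptions for illustration; consult the product for its actual template variables.

```python
import json
import urllib.request
from string import Template


def render_payload(template, alert):
    """Fill each string value of `template` with fields from `alert`, then serialize to JSON."""
    rendered = {key: Template(value).substitute(alert) for key, value in template.items()}
    # json.dumps handles quoting, so alert values cannot break the JSON structure.
    return json.dumps(rendered)


def build_request(url, headers, body):
    """Build (but do not send) a POST request carrying the rendered payload."""
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(
        url, data=body.encode("utf-8"), headers=all_headers, method="POST"
    )


# Hypothetical alert fields and template.
alert = {"rule": "pipeline-stopped", "severity": "critical", "fired_at": "2024-01-01T12:00:00Z"}
template = {"text": "[$severity] $rule fired at $fired_at"}

body = render_payload(template, alert)
request = build_request("https://example.invalid/hook", {"X-Token": "placeholder"}, body)
# urllib.request.urlopen(request) would deliver it; omitted here on purpose.
```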
Adding a Channel
- Go to Alerts > Channels
- Click Add Channel
- Select the type and fill in configuration
- Send a test notification to verify
- Save
Escalation Policies
For critical alerts, define escalation:
- Create an escalation policy with ordered steps
- Each step specifies a channel and a delay (e.g., “if not acknowledged in 15 minutes, notify PagerDuty”)
- Attach the policy to alert rules
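The ordered steps of an escalation policy can be reasoned about as a list of (channel, delay) pairs checked against the time since the alert fired. The sketch below is a simplified model under that assumption; `EscalationStep` and `channels_due` are hypothetical names, not product types, and the channel names are placeholders.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class EscalationStep:
    channel: str
    delay: timedelta  # time since firing without acknowledgement


def channels_due(steps, elapsed, acknowledged):
    """Return the channels whose delay has passed for an unacknowledged alert."""
    if acknowledged:
        # Acknowledging an alert stops escalation.
        return []
    return [step.channel for step in steps if elapsed >= step.delay]


policy = [
    EscalationStep("slack-oncall", timedelta(0)),
    EscalationStep("pagerduty", timedelta(minutes=15)),
]

print(channels_due(policy, timedelta(minutes=10), acknowledged=False))
print(channels_due(policy, timedelta(minutes=20), acknowledged=False))
```

A real scheduler would also need to remember which steps have already been notified so that each channel is contacted once per step; that bookkeeping is left out here.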
Alert History
The alert history lists every fired alert with the following columns:
| Column | Description |
|---|---|
| Rule | Which rule triggered |
| Severity | Info, warning, critical |
| Fired At | When the alert triggered |
| Status | Active, acknowledged, resolved, silenced |
| Duration | Time from firing to resolution |
Actions
| Action | Description |
|---|---|
| Acknowledge | Mark that someone is working on the alert; this stops escalation |
| Resolve | Mark the issue as fixed |
| Silence | Suppress notifications for a duration (e.g., during maintenance) |
Cooldown
Set a cooldown period on alert rules to prevent notification flooding. During cooldown, the rule will not fire again even if the condition remains true.
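The cooldown behavior can be sketched as a gate that remembers when the rule last fired. This is an illustrative model only: `CooldownGate` is a hypothetical name, and the product's internal handling may differ.

```python
from datetime import datetime, timedelta


class CooldownGate:
    """Suppress repeat firings of a rule until the cooldown has elapsed."""

    def __init__(self, cooldown):
        self.cooldown = cooldown
        self.last_fired = None

    def should_fire(self, condition_true, now):
        if not condition_true:
            return False
        # Still inside the cooldown window: stay quiet even though the condition holds.
        if self.last_fired is not None and now - self.last_fired < self.cooldown:
            return False
        self.last_fired = now
        return True


gate = CooldownGate(timedelta(minutes=30))
start = datetime(2024, 1, 1, 12, 0)

first = gate.should_fire(True, start)                           # fires
second = gate.should_fire(True, start + timedelta(minutes=10))  # suppressed by cooldown
third = gate.should_fire(True, start + timedelta(minutes=31))   # cooldown elapsed, fires again
```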