Data Quality
Define quality rules, evaluate data, and track quality scores.
Overview
Data Quality lets you define rules that continuously evaluate your data and surface issues before they reach downstream consumers.
Rule Types
| Dimension | What It Measures | Example Rule |
|---|---|---|
| Completeness | Missing or null values | “temperature must not be null” |
| Accuracy | Values within expected range | “temperature between -50 and 200” |
| Consistency | Cross-field agreement | “start_time < end_time” |
| Validity | Format and type correctness | “serial_number matches regex ^[A-Z]{2}\d{6}$” |
| Timeliness | Data freshness | “last record received within the past 5 minutes” |
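To make the five dimensions concrete, the example rules in the table can be sketched as plain Python predicates. This is only an illustration of what each rule checks, not the product's rule syntax. The function names and the dictionary-based record shape are assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern taken from the Validity example in the table above.
SERIAL_RE = re.compile(r"^[A-Z]{2}\d{6}$")


def is_complete(record):
    """Completeness: temperature must not be null."""
    return record.get("temperature") is not None


def is_accurate(record):
    """Accuracy: temperature between -50 and 200 (inclusive bounds assumed)."""
    t = record.get("temperature")
    return t is not None and -50 <= t <= 200


def is_consistent(record):
    """Consistency: start_time must be earlier than end_time."""
    return record["start_time"] < record["end_time"]


def is_valid(record):
    """Validity: serial_number is two uppercase letters followed by six digits."""
    return SERIAL_RE.fullmatch(record["serial_number"]) is not None


def is_timely(last_record_time, now, max_age=timedelta(minutes=5)):
    """Timeliness: the most recent record is at most max_age old."""
    return now - last_record_time <= max_age
```

Each predicate returns `True` when the record satisfies the rule, so a failing record is simply one for which the predicate returns `False`.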
Creating Rules
- Navigate to Analyze > Data Quality
- Click Create Rule
- Select the target data source or pipeline
- Choose the dimension and define the condition
- Set the severity (info, warning, or critical)
- Click Save
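The information collected by the steps above can be pictured as a small record. The sketch below is a hypothetical model, not the product's actual rule schema: the class names, field names, and the example target `plant_sensors` are all assumptions chosen to mirror the form fields.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    COMPLETENESS = "completeness"
    ACCURACY = "accuracy"
    CONSISTENCY = "consistency"
    VALIDITY = "validity"
    TIMELINESS = "timeliness"


class Severity(Enum):
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass(frozen=True)
class QualityRule:
    """Hypothetical shape of a saved rule; field names are illustrative."""
    name: str
    target: str           # data source or pipeline the rule applies to
    dimension: Dimension
    condition: str        # human-readable condition, as in the table above
    severity: Severity


rule = QualityRule(
    name="temperature_not_null",
    target="plant_sensors",  # hypothetical data source name
    dimension=Dimension.COMPLETENESS,
    condition="temperature must not be null",
    severity=Severity.CRITICAL,
)
```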
Evaluation
Rules are evaluated on a schedule or triggered by pipeline events:
- Scheduled: Runs at a configured interval (e.g., every 5 minutes)
- On-write: Evaluates when new data arrives via a pipeline
Each evaluation produces a pass/fail result per rule with details on failing records.
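As a rough sketch of what one evaluation pass produces, the following applies each rule to a batch of records and reports a pass/fail result together with the failing records. The `RuleResult` type and the `evaluate` function are hypothetical names, and the sketch assumes a rule passes only when no record violates it.

```python
from typing import Callable, Dict, List, NamedTuple


class RuleResult(NamedTuple):
    rule_name: str
    passed: bool
    failing_records: List[dict]


def evaluate(rules: Dict[str, Callable[[dict], bool]],
             records: List[dict]) -> List[RuleResult]:
    """Run every rule against every record and collect the failures per rule."""
    results = []
    for name, predicate in rules.items():
        failing = [r for r in records if not predicate(r)]
        # Assumption: a rule passes only if it has no failing records.
        results.append(RuleResult(name, not failing, failing))
    return results


rules = {"temperature_not_null": lambda r: r.get("temperature") is not None}
records = [{"id": 1, "temperature": 20.0}, {"id": 2, "temperature": None}]
results = evaluate(rules, records)
```

The same function could be called from a timer for scheduled evaluation or from a pipeline hook for on-write evaluation; only the trigger differs.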
Quality Scores
Each data source receives an aggregate quality score (0-100%) based on:
- Percentage of passing rules
- Severity weighting (critical rules count more)
- Recent trend (improving or degrading)
View scores and their historical trend charts on the Data Quality dashboard.
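The exact scoring formula is not documented here. One plausible reading of the factors above is a severity-weighted pass percentage, with the trend derived by comparing consecutive scores. The weights (1, 2, 3), the function names, and the choice to report the trend separately rather than fold it into the score are all assumptions made for this sketch.

```python
# Hypothetical weights: critical rules count three times as much as info rules.
SEVERITY_WEIGHTS = {"info": 1, "warning": 2, "critical": 3}


def quality_score(results):
    """Return a 0-100 score from (severity, passed) pairs, weighted by severity."""
    total = sum(SEVERITY_WEIGHTS[severity] for severity, _ in results)
    if total == 0:
        # Assumption: a source with no rules is treated as fully passing.
        return 100.0
    passed = sum(SEVERITY_WEIGHTS[severity] for severity, ok in results if ok)
    return 100.0 * passed / total


def trend(previous_score, current_score):
    """Classify the change between two consecutive scores."""
    if current_score > previous_score:
        return "improving"
    if current_score < previous_score:
        return "degrading"
    return "stable"


# One failing critical rule outweighs one passing info rule: 1 of 4 weight points.
score = quality_score([("critical", False), ("info", True)])
```

With equal weights the same input would score 50%, so the weighting is what pulls the score down when a critical rule fails.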
Quality Alerts
Attach alert actions to quality rules:
- Open a rule
- Under Alerts, configure a notification channel
- When the rule fails, an alert fires to the configured channel
See Alerts for channel configuration details.
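The alert flow can be sketched as a simple dispatch: for every failed rule that has a channel attached, send a notification. Channels are modeled here as plain callables, which is an assumption for illustration. Real channel configuration is covered under Alerts, and `dispatch_alerts` is a hypothetical name.

```python
def dispatch_alerts(results, channels):
    """Notify the configured channel for each failed rule.

    results:  iterable of (rule_name, passed) pairs
    channels: dict mapping rule_name to a callable that accepts a message
    Returns the names of the rules for which an alert fired.
    """
    fired = []
    for rule_name, passed in results:
        notify = channels.get(rule_name)
        # Alerts fire only for failing rules that have a channel attached.
        if not passed and notify is not None:
            notify(f"Data quality rule failed: {rule_name}")
            fired.append(rule_name)
    return fired


# Usage: a list stands in for a real notification channel.
inbox = []
fired = dispatch_alerts(
    [("temperature_not_null", False), ("serial_format", True)],
    {"temperature_not_null": inbox.append, "serial_format": inbox.append},
)
```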
Monitoring
The Data Quality overview page shows:
- Overall quality score per data source
- Failing rules with severity indicators
- Trend over time (last 24 hours, 7 days, or 30 days)
- Drill-down to individual rule results