About baseline calculations
You can view the calculated baseline values and the percentage change in the Adaptive rule actions drawer on the Monitors tab of the Job Details page. The Adaptive rule actions drawer shows baseline values for the following adaptive rules:
- Data type
- Empty fields
- Max values
- Mean values
- Min values
- Null values
- Row count
- Uniqueness
For detailed steps on accessing this information, go to Review monitor results.
How the baseline is determined
The baseline represents the "normal" state of your data. Collibra calculates this by looking at a rolling window of historical runs.
The system analyzes the last N successful runs (where N is the data lookback period you configured) to compute statistical measures, such as the median and standard deviation.
- Lookback period: This is the specific number of past runs the system analyzes. You can configure this in the Monitors step when creating or editing a job (the default lookback is 21 runs).
- Rolling window: As new runs complete, the window shifts. Old runs drop off the baseline calculation to ensure the profile reflects the most current data behavior.
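As a rough illustration of the lookback period and rolling window described above, the following Python sketch keeps only the most recent N values of a single metric and derives a median and standard deviation from them. It is not Collibra's implementation: the `RollingBaseline` class, its method names, the sample row counts, and the use of the sample standard deviation are all assumptions made for this example.

```python
from collections import deque
from statistics import median, stdev


class RollingBaseline:
    """Hypothetical helper: tracks one metric over the last N successful runs."""

    def __init__(self, lookback: int = 21):
        # deque(maxlen=...) drops the oldest value automatically once the
        # window is full, which mirrors old runs falling off the baseline.
        self.values = deque(maxlen=lookback)

    def add_run(self, value: float) -> None:
        """Record the metric value of a newly completed successful run."""
        self.values.append(value)

    def stats(self) -> dict:
        """Return the statistics of the runs currently inside the window."""
        vals = list(self.values)
        return {
            "runs": len(vals),
            "median": median(vals) if vals else None,
            # The sample standard deviation needs at least two values.
            "stdev": stdev(vals) if len(vals) >= 2 else None,
        }


# Made-up row counts for six consecutive runs, with a lookback of 5.
baseline = RollingBaseline(lookback=5)
for row_count in [100, 110, 90, 105, 95, 120]:
    baseline.add_run(row_count)

# The first value (100) has dropped out of the window.
window_stats = baseline.stats()
```

With a lookback of 5, the sixth run pushes the first one out of the window, so the statistics always describe the five most recent runs.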
The comparison process
When a new job run completes (the "current run"), the system does not include it in the baseline immediately. Instead, it compares the current run against the baseline established by the previous runs. The examples in the process below assume a lookback period of 5, meaning 5 runs are required to train the behavioral model and establish a baseline. The current run in this scenario is Run 10.
- Historical analysis: The system analyzes the runs in the lookback window that precede the current run, for example, Runs 5 through 9. The run currently being profiled (Run 10) is never included in its own baseline.
- Comparison: The system compares the current run (for example, Run 10) against the baseline statistics computed from the preceding runs (for example, Runs 5 through 9).
- Percentage change: The system calculates the percentage difference between the value in the current run and the baseline value. The result is a positive or negative percentage (for example, +30% or -30%) or 0% if there is no change.
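The comparison steps above can be sketched in Python as follows. This is a minimal example under stated assumptions, not the product's actual formula: it assumes the baseline value is the median of the lookback runs, that percentage change is computed as (current - baseline) / |baseline| × 100, and that a zero baseline with a non-zero current value has no defined percentage. The `percent_change` function and the row counts are hypothetical.

```python
from statistics import median
from typing import Optional


def percent_change(current: float, baseline: float) -> Optional[float]:
    """Signed percentage difference of current relative to baseline.

    Assumption: returns None when the baseline is 0 and the current value
    is not, because the percentage is undefined in that case.
    """
    if baseline == 0:
        return 0.0 if current == 0 else None
    return (current - baseline) * 100.0 / abs(baseline)


# Made-up row counts keyed by run number.
row_counts = {5: 100, 6: 100, 7: 110, 8: 90, 9: 100, 10: 130}

lookback = 5
current_run = 10

# Runs 5 through 9 form the baseline; Run 10 is left out of its own baseline.
baseline_runs = range(current_run - lookback, current_run)
baseline_value = median(row_counts[run] for run in baseline_runs)

change = percent_change(row_counts[current_run], baseline_value)
label = f"{change:+.0f}%"  # explicit sign, as in +30% or -30%
```

In this scenario the baseline median is 100 and Run 10 has 130 rows, so the change is reported as +30%.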
Exclusions and resets
To ensure accuracy, the system excludes certain data from the baseline calculation:
- Current run: The run currently being profiled is never included in its own baseline.
- Failed or breaking runs: Only runs with a passing status are used. Runs marked as "breaking" via adaptive rules or annotations are excluded to prevent outliers from skewing the "normal" baseline.
- Suppressed adaptive rules: These are not excluded. Suppressing an adaptive rule does not affect the learning period of the behavioral model, so the system still includes suppressed adaptive rules in the baseline calculation.
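One way to picture these rules is as a filter applied before the lookback window is taken. The sketch below is illustrative only: the `Run` record, its field names, and the `select_baseline_runs` function are hypothetical. It also assumes that when a run is excluded, the window reaches further back to collect N eligible runs, which is one reading of "the last N successful runs".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """Hypothetical record of one job run."""
    run_id: int
    passed: bool = True      # False for failed runs
    breaking: bool = False   # marked as breaking by an adaptive rule or annotation


def select_baseline_runs(runs: List[Run], current_run_id: int, lookback: int) -> List[Run]:
    """Pick the runs that would feed the baseline for the current run."""
    if lookback <= 0:
        return []
    eligible = [
        run
        for run in sorted(runs, key=lambda r: r.run_id)
        # Exclude the current run (and anything later), failed runs,
        # and breaking runs. Rule suppression is deliberately not checked,
        # because suppressed adaptive rules stay in the baseline.
        if run.run_id < current_run_id and run.passed and not run.breaking
    ]
    # Keep only the most recent `lookback` eligible runs.
    return eligible[-lookback:]


# Runs 1 through 10; Run 7 failed and Run 8 was marked as breaking.
runs = [Run(run_id=i) for i in range(1, 11)]
runs[6].passed = False
runs[7].breaking = True

baseline_runs = select_baseline_runs(runs, current_run_id=10, lookback=5)
baseline_ids = [run.run_id for run in baseline_runs]
```

Under these assumptions, the baseline for Run 10 is built from Runs 3, 4, 5, 6, and 9, because Runs 7 and 8 are skipped and Run 10 is the run being profiled.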