About Data Quality Jobs

A Data Quality Job is a group of columns from one or more tables that are evaluated by specific monitors and profiling actions. These monitors help you ensure that the data within the job's scope is accurate and reliable for reporting and analysis. A job is not an asset, although Collibra presents it in a way that closely resembles one.

Jobs allow you to:

  • Conduct immediate and scheduled data quality checks on your data.
  • Create data profiles of your data.
  • Apply automatic monitoring to track the evolution of your data.
  • Configure automated notifications to send to users when certain conditions are met.

A job includes a scope query that selects data from one or more tables, along with settings that define how and when the job runs and which monitors it applies. The scope query is the SQL query within a job that determines exactly which columns and rows from the target table or tables are evaluated during a job run. A job consists of the following components:

  • Scope query
  • Schedule
  • Filters:
    • Time slice
    • Row
    • Limit (sample size)
  • Logs
  • Profile
  • Monitors
  • Permissions
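
To make the idea of a scope query and its filters more concrete, the following is a minimal sketch in Python that uses an in-memory SQLite database. It is not Collibra's implementation or query syntax. The `orders` table, its columns, the `build_scope_query` helper, and the null-count check are all hypothetical names chosen for illustration, and folding the time slice, row filter, and limit directly into the SQL is an assumption made to keep the example short.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "EU", 120.0, "2024-01-01"),
        (2, "US", 80.0, "2024-01-01"),
        (3, "EU", None, "2024-01-02"),
        (4, "EU", 45.5, "2024-01-02"),
        (5, "EU", 60.0, "2024-01-02"),
    ],
)


def build_scope_query(run_date, row_limit):
    """Return a parameterized query that narrows the table to the
    columns and rows a job run would evaluate (hypothetical helper)."""
    sql = (
        "SELECT order_id, amount "   # columns in scope
        "FROM orders "
        "WHERE order_date = ? "      # time slice: one day of data
        "AND region = 'EU' "         # row filter
        "ORDER BY order_id "
        "LIMIT ?"                    # limit (sample size)
    )
    return sql, (run_date, row_limit)


sql, params = build_scope_query("2024-01-02", 2)
scoped_rows = conn.execute(sql, params).fetchall()

# A toy stand-in for a monitor: count null values in the scoped rows.
null_amounts = sum(1 for _, amount in scoped_rows if amount is None)

print(scoped_rows)
print(null_amounts)
```

In this sketch, only the rows that pass the time slice, row filter, and limit reach the check, which mirrors how a job's scope determines what its monitors evaluate.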

Depending on the data quality capability that your Edge or Collibra Cloud site uses, every job relies on either the Pushdown or the Pullup processing method. Pushdown jobs use the Data Quality Pushdown Processing capability, while Pullup jobs use the Data Quality Pullup Processing capability. Because Pullup jobs rely on Spark to handle job processing workloads, the steps to create a job differ between the two methods.

For detailed steps on how to create a job using either method, go to Create a Pushdown job or Create a Pullup job.

What's next