QA ATLAS Releasestrategy QA LVL4 Dataqualitytests

Andi Lamprecht · 2 min read · Accepted
ADR-0028 · Author: Sybil Melton · Date: 2025-02-07 · Products: platform
Originally ADR-0051 QA_ATLAS-ReleaseStrategy-QA-LVL4-DataQualityTests (v7) · Source on Confluence ↗

Release Strategy - QA - Level 4 - Data Quality Tests

Context

Data quality tests safeguard data integrity and reliability. They cover a range of assessments, each aimed at ensuring that data is accurate, consistent, complete, and trustworthy. They are applied to a dataset as a whole and test every row, so that quality is verified across the entire dataset rather than on a sample.

These tests report on dataset properties such as:

Data Completeness and Consistency

These checks assess whether all required data elements are present in a dataset. This involves verifying that there are no missing values or empty fields, and ensuring that data is uniform and follows an expected format or structure throughout the dataset.
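A check of this kind could be sketched as follows. This is a minimal illustration, not the Atlas implementation: the field names, the regular expressions, and the `check_row` helper are all hypothetical.

```python
import re

# Hypothetical schema: each required field and the format it is expected to follow.
REQUIRED_FORMATS = {
    "id": re.compile(r"\d+"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}

def check_row(row):
    """Return a list of issues found in one row (an empty list means the row passes)."""
    issues = []
    for field, pattern in REQUIRED_FORMATS.items():
        value = row.get(field)
        if value is None or str(value).strip() == "":
            # Missing value or empty field.
            issues.append(f"{field}: missing")
        elif not pattern.fullmatch(str(value)):
            # Value is present but does not follow the expected format.
            issues.append(f"{field}: unexpected format")
    return issues

rows = [
    {"id": "1", "date": "2025-02-07"},
    {"id": "2", "date": ""},
    {"id": "x3", "date": "07/02/2025"},
]

# Every row is tested, and the report is keyed by row index.
report = {index: check_row(row) for index, row in enumerate(rows)}
```

Because every row is checked, the report can be aggregated into a dataset-level result or used to locate individual offending rows.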

Validity Tests

Validity tests assess whether data conforms to predefined constraints.
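As a rough sketch, predefined constraints can be expressed as one predicate per field. The fields `age` and `status` and their allowed values are assumptions made up for this example.

```python
# Hypothetical constraints: each field name maps to a predicate that must hold.
CONSTRAINTS = {
    "age": lambda v: isinstance(v, int) and 0 <= v <= 130,
    "status": lambda v: v in {"active", "inactive"},
}

def violations(row):
    """Return the names of the fields in `row` that break their constraint."""
    return [name for name, is_valid in CONSTRAINTS.items() if not is_valid(row.get(name))]

valid_row = {"age": 42, "status": "active"}
invalid_row = {"age": -5, "status": "unknown"}

valid_result = violations(valid_row)
invalid_result = violations(invalid_row)
```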

Data Profiling

Data profiling involves generating statistical summaries, histograms, and distribution analyses to understand the characteristics of the data. It can reveal anomalies and outliers.
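A minimal profiling sketch using only the Python standard library might look like this. The `profile` and `histogram` helpers, the sample values, and the bin width are illustrative assumptions.

```python
from collections import Counter
import statistics

def profile(values):
    """Summarise a numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }

def histogram(values, bin_width):
    """Count values per bin, keyed by the lower edge of each bin."""
    return Counter((v // bin_width) * bin_width for v in values if v is not None)

sample = [10, 12, 11, None, 13, 95]
summary = profile(sample)
bins = histogram(sample, 10)
```

In this sample the mean sits well above the median and one histogram bin is isolated from the rest, which is the kind of signal that points to an outlier.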

Data Anomaly Detection

Advanced data quality tests may involve machine learning algorithms to detect anomalies and outliers that human-driven tests may miss.
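The sketch below is a simple statistical stand-in, not a machine learning algorithm: it flags values by their modified z-score, computed from the median absolute deviation (MAD). It only illustrates the general shape of an automated outlier check; the `mad_outliers` helper and the threshold of 3.5 are assumptions for this example.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # At least half of the values equal the median; the score is undefined.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

readings = [10, 12, 11, 13, 95]
outliers = mad_outliers(readings)
```

A trained model would replace `mad_outliers` in a production setting, but the interface (values in, flagged values out) could stay the same.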

Application

Atlas performs data quality tests at two levels:

  • Incoming data
  • Output data

Incoming data

Atlas lacks control over the consistency and validity of data arriving from external sources. At this stage, data quality tests can be used to continuously monitor whether the incoming data matches the assumptions made about it during the system’s design.
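One way to picture such monitoring is a per-batch failure rate compared against a tolerance. This is only a sketch: the `amount` assumption, the tolerance value, and both helper functions are hypothetical and not taken from Atlas.

```python
def batch_failure_rate(rows, check):
    """Return the fraction of rows in a batch for which `check` returns False."""
    if not rows:
        return 0.0
    failed = sum(1 for row in rows if not check(row))
    return failed / len(rows)

def has_valid_amount(row):
    # Hypothetical design assumption: every record carries a non-negative amount.
    amount = row.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0

MAX_FAILURE_RATE = 0.25  # hypothetical tolerance before the batch is flagged

batch = [{"amount": 10}, {"amount": -1}, {"amount": 3.5}, {}]
rate = batch_failure_rate(batch, has_valid_amount)
alert = rate > MAX_FAILURE_RATE
```

Tracking the rate per batch over time shows when an external source starts drifting away from the design assumptions.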

Output data

Checks at this stage serve as proof to Atlas consumers that the data is of high quality and that its format matches the one specified in the data contract.
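A contract check on output records could be sketched as below. The contract here is a hypothetical mapping of field names to Python types; a real data contract would likely be richer, with nullability, units, and value ranges.

```python
# Hypothetical data contract: field name -> expected Python type.
CONTRACT = {"id": int, "name": str, "score": float}

def contract_errors(record):
    """Return a list of ways in which `record` deviates from CONTRACT."""
    errors = []
    for field, expected in CONTRACT.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields that the contract does not mention are reported as well.
    for field in sorted(record.keys() - CONTRACT.keys()):
        errors.append(f"unexpected field: {field}")
    return errors

good = contract_errors({"id": 1, "name": "a", "score": 0.5})
bad = contract_errors({"id": "1", "name": "a", "extra": 1})
```

An empty error list for every output record is the evidence that the pipeline output complies with the contract.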

Main goals of this test level

Data quality tests aim to ensure the accuracy and reliability of data by identifying and rectifying errors, inconsistencies, and anomalies within the dataset. These tests serve a dual purpose: they validate the incoming data from external sources, and they verify that the output of our data pipeline adheres to the agreed data contracts. Ultimately, the objective is to provide stakeholders with dependable, high-quality data they can trust for their needs.

This test level answers the following questions:

Is the data incoming from external sources aligned with our expectations for its format and quality?
Is our system designed to support the data incoming from external sources?
Is the system generating data in compliance with the agreed-upon data contract?