ATLAS Release Strategy 6 Data Rollback
Originally ADR-0046 ATLAS-ReleaseStrategy-6-DataRollback (v7)
Release Strategy - Data Rollback
Context
Data rollback is a critical process in data management that involves reverting to a previous state or version of a dataset, database, or system. It may be necessary when data becomes corrupted, inaccurate, or when undesirable changes are made, either intentionally or unintentionally.
Rollback is essential to restore data to a reliable and consistent state, ensuring its integrity and accuracy. There are several scenarios where data rollback might be required, such as when a software update introduces bugs or errors, when data is mistakenly deleted or overwritten, or when a security breach compromises data quality.
By allowing organizations to undo changes and return to a known, reliable state, data rollback plays a pivotal role in maintaining data quality, ensuring compliance, and safeguarding critical information. It serves as a crucial mechanism for data recovery and restoration, minimizing the potential risks and consequences of data-related issues.
Atlas application
Atlas provides two rollback mechanisms for different scenarios.
Built-in Delta Lake table rollback
Roll back the data change at the storage level using the Delta Lake time-travel mechanism.
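Time travel works against the table's commit history, so the target version or timestamp usually has to be looked up before restoring. The following is a minimal sketch, not Atlas code: `describe_history_sql` and `restore_sql` are hypothetical helpers that only build the Spark SQL text, and the table name is whatever the caller supplies.

```python
from typing import Optional


def describe_history_sql(table_name: str) -> str:
    """Build the statement that lists the table's commits (hypothetical helper)."""
    return f"DESCRIBE HISTORY {table_name}"


def restore_sql(table_name: str,
                version: Optional[int] = None,
                timestamp: Optional[str] = None) -> str:
    """Build a Delta Lake RESTORE statement for exactly one target (hypothetical helper)."""
    # Exactly one of version / timestamp must be given; version 0 is a valid target.
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"RESTORE TABLE {table_name} TO VERSION AS OF {version}"
    # The timestamp is passed as a quoted string literal.
    return f"RESTORE TABLE {table_name} TO TIMESTAMP AS OF '{timestamp}'"
```

On a Spark session with Delta Lake enabled, the resulting strings could be passed to `spark.sql(...)`: first the history statement to find the last good commit, then the restore statement.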
Examples:
Spark SQL for restore by commit number
RESTORE TABLE table_name TO VERSION AS OF <version_number_of_correct_data>
or
Spark SQL for restore by timestamp
RESTORE TABLE table_name TO TIMESTAMP AS OF <timestamp_when_the_data_was_in_correct_state>
Flush Run
Roll back the change by regenerating the whole dataset and overwriting the corrupted one.
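A flush run could look roughly like the sketch below, assuming the dataset is stored as a Delta table and written through PySpark. `flush_run` and `regenerate` are hypothetical names, not part of Atlas; `regenerate` stands in for whatever pipeline rebuilds the full dataset from its sources.

```python
from typing import Any, Callable


def flush_run(regenerate: Callable[[], Any], table_name: str) -> None:
    """Rebuild the whole dataset and overwrite the corrupted table (hypothetical helper)."""
    # Assumption: regenerate() re-runs the full pipeline and returns a Spark DataFrame.
    fresh_df = regenerate()
    # Replace the table contents in full with the regenerated data.
    fresh_df.write.format("delta").mode("overwrite").saveAsTable(table_name)
```

Unlike a restore, this does not depend on a good version still being present in the table history, but it costs a full recomputation of the dataset.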