Tuesday
Let’s say we have a big data application where data loss is not an option.
With GZRS (geo-zone-redundant storage) redundancy we would achieve zero data loss as long as the primary region is alive – the writer waits for acks from two or more Azure availability zones in the primary region (if only one zone is available, the write will not succeed). Once data has been written to the primary region it is copied asynchronously to the secondary region – hence, when the primary is down (flood/asteroid/you-name-it outage), there is a possibility of losing data.
Microsoft states that data is copied asynchronously; however, it seems the order of files is not guaranteed to be preserved. That means if your primary region is down and you fail over to the secondary, there is a high probability that some or most of your Delta tables will be inconsistent. Imagine the delta-log files were successfully copied to the secondary but some parquet files are missing – the table is inconsistent or out of sync, and reads will fail with an error. Or your parquet files are copied but the delta log is not there yet – congratulations, you have lost some data and will not even notice it, because reads still succeed (i.e. silent data loss).
It also seems like RA-GZRS (read-access geo-zone-redundant storage) does not play well with Delta Lake due to the same eventual-consistency issue…
And this is not the whole picture yet. Microsoft states that there is a Last Sync Time property that indicates the most recent time at which data from the primary region is guaranteed to have been written to the secondary region. And rumor has it that we should not trust that property either, as it is unreliable…
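For reference, this is roughly how that property can be read with the azure-storage-blob Python SDK – a minimal sketch; the account name and credential are placeholders, and the stats call is served from the secondary endpoint of a read-access geo-redundant account:

```python
# Minimal sketch: read the geo-replication Last Sync Time with azure-storage-blob.
# <account> and the credential are placeholders; service stats are exposed on the
# *secondary* endpoint of an RA-GZRS/RA-GRS account.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

client = BlobServiceClient(
    account_url="https://<account>-secondary.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)

stats = client.get_service_stats()
geo = stats["geo_replication"]
print(geo["status"])          # e.g. "live"
print(geo["last_sync_time"])  # writes before this time *should* be on the secondary
```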
Databricks mentions several times in its documentation that the Deep Copy functionality should be used to copy Delta tables from the primary to the secondary region in a consistent way. Sounds good in theory; it does not, however, cover many of the streaming cases.
Take, for instance, stateless delta-to-delta streaming. Deep Copy does not preserve table history when data is copied (for any table, not only streaming sources) – hence some additional work is required on the primary to map processed offsets to the source table history, so that the stream can be restarted on the secondary from a new (and correct) offset. Add to that stateful streams, delta-to-delta with multiple sources, and so on.
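To illustrate the kind of extra bookkeeping I mean, a rough sketch: it assumes a Databricks notebook (dbutils available), a single Delta source per stream, and Delta's internal source-offset JSON layout (the "reservoirVersion" field) – all of which are assumptions that may not hold for your runtime version:

```python
# Rough sketch: find which Delta source version a stream's checkpoint has reached,
# so a restart on the secondary can use an explicit startingVersion.
# Assumes a Databricks notebook (dbutils), a single Delta source per stream, and
# Delta's source-offset JSON layout ("reservoirVersion") -- that layout is internal
# and may change between versions.
import json

def latest_processed_delta_version(checkpoint_dir: str) -> int:
    # Offset files live under <checkpoint>/offsets/<batchId>
    offset_files = [f for f in dbutils.fs.ls(f"{checkpoint_dir}/offsets")
                    if f.name.isdigit()]
    latest = max(offset_files, key=lambda f: int(f.name))
    # The last line of an offset file is the serialized source offset JSON.
    lines = dbutils.fs.head(latest.path, 1024 * 1024).splitlines()
    return json.loads(lines[-1])["reservoirVersion"]

# This value, together with DESCRIBE HISTORY on the copied table, is what would
# have to be recorded somewhere replicated so the stream can be restarted on the
# secondary from the correct version.
```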
Is there any straightforward way to minimize data loss when failing over from the primary to the secondary region? Has anyone managed to implement a successful primary-to-secondary failover?
Wednesday
In Azure and Databricks environments, ensuring zero data loss during a primary-to-secondary failover—especially for Delta Lake/streaming workloads—is extremely challenging due to asynchronous replication, potential ordering issues, and inconsistent states between regions. No fully “push-button” solution currently exists for seamless, streaming-consistent, and history-preserving primary-to-secondary failover, but there are some best practices and caveats critical for architects and teams to understand.
Geo-zone-redundant storage (GZRS) and read-access GZRS (RA-GZRS) replicate data across regions using async replication.
Writes in the primary region get strong zone-level consistency only if multiple zones are alive, but data is not immediately, transactionally present in the secondary region—it may lag, and file ordering is not guaranteed.
Delta Lake tables are sensitive to consistency between metadata (delta logs) and data (parquet files). If a failover occurs mid-replication, tables may be unusable or silently inconsistent (silent data loss).
The “Last-Sync-Time” property for geo-replicated accounts is not always reliable and should not be solely trusted for data cutover, as it may not reflect the true point of consistency.
Deep Copy in Databricks is the most robust way to move Delta tables across regions with consistency, but it is a batch operation, not suitable for low-latency streaming or near-real-time failover scenarios (see the deep clone sketch after this list).
Deep Copy does not preserve streaming offsets, history, or application state, so “stateless delta-to-delta streaming” and especially stateful processes need extra work to reconstruct correct offsets and state after cutover.
Streaming jobs are difficult: you must map source table history/processed offsets, and resuming jobs on secondary requires nontrivial tracking and reconciliation.
No Azure/Databricks native primitives seamlessly preserve all history, offsets, and state during geo-failover.
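To make the Deep Copy point above concrete, a minimal sketch, assuming that "Deep Copy" maps to Delta's DEEP CLONE and that the target path on secondary-region storage already exists; the table name and storage account are placeholders:

```python
# Minimal sketch: periodically deep clone a table into secondary-region storage.
# Assumes "Deep Copy" refers to Delta's DEEP CLONE; `prod.sales` and the
# abfss:// target path are placeholders.
spark.sql("""
    CREATE OR REPLACE TABLE delta.`abfss://dr@secondaryaccount.dfs.core.windows.net/sales`
    DEEP CLONE prod.sales
""")

# Re-running the statement is incremental (only new or changed files are copied),
# but the clone keeps its own transaction history, which is why streaming offsets
# taken against the source table do not carry over.
```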
While there is no single “straightforward” solution for Delta Lake streaming failover, here are methods to minimize risk:
Buffer critical writes and orchestrate explicit checkpoints. Before a planned failover, pause streaming and ensure all delta-log and committed parquet files have replicated successfully. Validate at the file level, not just via “last-sync” markers (see the validation sketch after this list).
Use Deep Copy periodically for batch tables, but supplement with logic to store stream offsets and handle recovery mapping between source/target tables for streaming use cases.
Implement application-level replication/checkpoint tracking: Record every write operation’s offset and status, and maintain a custom log in a replicated store for reconciliation post-failover.
Manual failback procedures: Post-failover, validate Delta tables for consistency (for example, by confirming that every file referenced in the transaction log exists; on Databricks, FSCK REPAIR TABLE can drop references to files that are no longer present), and re-validate streaming inputs before resuming.
Consider alternative architectures for sub-second RPO: For mission-critical data, synchronous cross-region replication (not currently supported for Azure Blob/Delta Lake) or third-party solutions (such as custom mirroring, or hybrid storage) may be required. These increase costs and complexity.
Regularly test failover and document recovery/runbooks. Don’t rely solely on cloud provider guarantees; end-to-end validation is a must.
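As an illustration of the file-level validation mentioned above, a rough sketch that checks on the secondary whether every data file referenced by a table's current Delta snapshot is actually present. The path is a placeholder; it assumes a Databricks notebook (spark/dbutils available) and deliberately covers active files only, not historical versions:

```python
# Rough sketch: verify that every data file referenced by the current Delta
# snapshot exists on the secondary. `table_path` is a placeholder; assumes a
# Databricks notebook. Active files only -- historical versions and checkpoint
# files are not covered.
table_path = "abfss://data@secondaryaccount.dfs.core.windows.net/delta/events"

referenced = spark.read.format("delta").load(table_path).inputFiles()

missing = []
for path in referenced:
    try:
        dbutils.fs.ls(path)        # raises if the file is absent
    except Exception:
        missing.append(path)

if missing:
    raise RuntimeError(f"{len(missing)} file(s) referenced by the Delta log are missing")
print(f"All {len(referenced)} referenced data files are present")
```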
Many organizations accept a low RPO (recovery point objective) and design for eventual consistency rather than true zero data loss.
No documented, fully successful, zero-loss, streaming-integrity-preserving failover for Delta Lake on Azure has been published to date.
Some users have implemented “acceptable” manual or semi-automated failover using Deep Copy plus custom offset/data mapping for critical workloads, but these solutions are complex and error-prone.
In summary:
There is currently no straightforward, out-of-the-box way to guarantee zero data loss or perfect table consistency when failing over from primary to secondary in Azure for Delta Lake, especially for streaming scenarios. Risk can be mitigated with careful orchestration, explicit checkpointing, custom failover logic, and regular validation and testing, but true zero-loss, instant failover is not supported or achievable with existing Azure or Databricks tooling alone.