R1 is a leading provider of revenue management solutions for healthcare organizations, supporting hospitals, health systems, and physician groups across front-, middle-, and back-office revenue operations. Through its Phare Revenue Operating System, R1 combines AI, automation, analytics, and operational expertise to help providers improve financial performance, reduce administrative complexity, and accelerate revenue flow. Built around a platform approach, R1’s technology is designed to deliver real-time operational intelligence and workflow orchestration across the revenue cycle, from access and coding through claims management, denials, and reimbursement optimization.
This post covers R1's migration of its healthcare revenue cycle Data Vault from Snowflake to Databricks, led by R1's Zheng Zhu and delivered with Lovelytics, and the results it produced.
Healthcare revenue cycle analytics is a workload that must run multiple times a day across hundreds of billions of records, with the business watching the output. R1 runs a full Data Vault pipeline of 847 dbt models across five layers, built on dbt Core, Delta Lake, and Unity Catalog on Databricks.
The final Information Mart layer, the output consumed by R1's analytics and reporting, comprises 35 tables representing the distilled, business-ready view of the entire revenue cycle.
R1's mandate to Lovelytics was clear: deliver the migration on time, validate the code's behavior, and keep the migrated code synchronized with the current codebase to ensure continuity of operations. The migration was delivered in 12 weeks on a 2XL SQL Warehouse, achieving a 77% reduction in per-run DBU costs.
Migrations slip by quarters when teams treat the work as a translation project rather than an engineering one. The SQL itself is rarely the hard part.
Structural differences between the platforms do most of the damage, and they surface in any sufficiently large warehouse. Below are challenges that R1 and Lovelytics had to address:
Key architectural and technical decisions that mattered
The team treated data engineering like software engineering: it tested the transformation logic itself and built every validation layer as a first-class deliverable.
Most migration validation stops at row counts and aggregate comparisons, which confirm that the data moved but not that the logic behind it is correct. For this migration, each dbt model was unit tested in isolation with fixed inputs and asserted outputs, so the team could validate the logic independently of the current production data.
That distinction matters. A model can match Snowflake on today’s data and still contain logic that breaks when the data changes. Unit testing the transformation gives confidence that the model is correct, not just that it happens to agree with Snowflake on a specific run. For a healthcare revenue cycle pipeline running multiple times a day, that confidence is what keeps a latent logic error from reaching the business.
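To make the idea of fixed inputs and asserted outputs concrete, here is a minimal sketch in Python. It is not R1's test harness: the model SQL, the `stg_transactions` table, and its columns are hypothetical, and SQLite stands in for the warehouse so the example runs anywhere.

```python
import sqlite3

# Hypothetical transformation under test: net balance per account.
MODEL_SQL = """
SELECT account_id,
       SUM(charge_amount) - SUM(payment_amount) AS net_balance
FROM stg_transactions
GROUP BY account_id
ORDER BY account_id
"""


def run_model(fixed_rows):
    """Load fixed input rows into an in-memory table and run the model SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_transactions "
        "(account_id TEXT, charge_amount REAL, payment_amount REAL)"
    )
    conn.executemany("INSERT INTO stg_transactions VALUES (?, ?, ?)", fixed_rows)
    result = conn.execute(MODEL_SQL).fetchall()
    conn.close()
    return result


# Fixed inputs chosen to exercise the logic, independent of production data.
fixed_input = [
    ("A1", 100.0, 0.0),   # charge
    ("A1", 0.0, 40.0),    # partial payment
    ("A2", 250.0, 250.0), # fully paid
]
expected = [("A1", 60.0), ("A2", 0.0)]

assert run_model(fixed_input) == expected
```

Because the inputs never change, a failure here points at the transformation logic rather than at drift in the underlying data.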
The design pattern used was a routing macro that resolved every cross-layer reference at runtime to one of three targets: Delta outside Unity Catalog, Delta inside Unity Catalog, or federated Snowflake. A dbt variable controlled the mode for each run, allowing the same codebase to execute against all three sources without maintaining separate branches.
A code generation script refreshes the Jinja allowlist whenever the Unity Catalog source list changes. The pattern is simple: each model reference checks Unity Catalog and Databricks first, then falls back to federated Snowflake if the data hasn’t moved. As each source migrates, the same model automatically starts reading from Databricks without requiring code changes or branch-specific rewrites.
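The routing logic can be sketched outside of Jinja as a small Python function. This is an illustration of the pattern only, not the actual macro: the mode names, the catalog names (`main`, `hive_metastore`, `snowflake_federated`), and the allowlist entries are all assumptions made for the example.

```python
from enum import Enum


class Mode(Enum):
    """Run modes, standing in for the value of a dbt variable (names assumed)."""
    DELTA_LEGACY = "delta_legacy"                # Delta outside Unity Catalog
    UNITY_CATALOG = "unity_catalog"              # Delta inside Unity Catalog
    FEDERATED_SNOWFLAKE = "federated_snowflake"  # federated Snowflake


# Hypothetical allowlist of sources already migrated into Unity Catalog.
# In the pattern described above, a script would regenerate this list.
UC_ALLOWLIST = {"raw_vault.hub_patient", "raw_vault.sat_patient_details"}


def resolve_relation(schema, table, mode, allowlist=UC_ALLOWLIST):
    """Return the fully qualified relation a model should read from."""
    name = f"{schema}.{table}"
    if mode is Mode.FEDERATED_SNOWFLAKE:
        return f"snowflake_federated.{name}"
    if mode is Mode.DELTA_LEGACY:
        return f"hive_metastore.{name}"
    # Unity Catalog mode: prefer the migrated copy, otherwise fall back
    # to federated Snowflake until the source has moved.
    if name in allowlist:
        return f"main.{name}"
    return f"snowflake_federated.{name}"


print(resolve_relation("raw_vault", "hub_patient", Mode.UNITY_CATALOG))
print(resolve_relation("raw_vault", "hub_claim", Mode.UNITY_CATALOG))
```

Adding `raw_vault.hub_claim` to the allowlist is the only change needed for the second call to start resolving to the Unity Catalog copy, which is the property that removes the need for branch-specific rewrites.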
“For revenue cycle analytics, correctness is not theoretical—the business depends on these pipelines for mission-critical use cases. The challenge was preserving the expected behavior and our ability to iterate, while the business logic and implementation continued to evolve, and that the target platform introduced different execution semantics around hash generation, merge behavior, dependency ordering, and incremental processing. The migration succeeded because we treated rigorous model-level testing, runtime routing, wave-based dependency management, and reconciliation on the data concepts as core engineering work from day one—not as cleanup after the SQL was translated.” — Zheng Zhu, R1
Three hash key types, three strategies:
Surrogate keys were expected to differ across platforms; reconciliation could not rely on hash equality. The team validated against natural business keys instead.
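A minimal sketch of reconciling on natural business keys rather than hash equality might look like the following. The column names (`claim_id`, `claim_hk`, `billed_amount`) and the sample rows are hypothetical, and real reconciliation at this scale would run in the warehouse rather than in Python dictionaries.

```python
def reconcile(source_rows, target_rows, business_key, exclude):
    """Compare two row sets by business key, ignoring excluded columns."""

    def index(rows):
        indexed = {}
        for row in rows:
            key = tuple(row[col] for col in business_key)
            indexed[key] = {c: v for c, v in row.items() if c not in exclude}
        return indexed

    src, tgt = index(source_rows), index(target_rows)
    shared = src.keys() & tgt.keys()
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in shared if src[k] != tgt[k]),
    }


# The surrogate hash keys differ by platform, but the business content agrees.
snowflake_rows = [
    {"claim_hk": "a1f3", "claim_id": "C-100", "billed_amount": 1200},
    {"claim_hk": "9bc2", "claim_id": "C-101", "billed_amount": 80},
]
databricks_rows = [
    {"claim_hk": "77e0", "claim_id": "C-100", "billed_amount": 1200},
    {"claim_hk": "c4d9", "claim_id": "C-101", "billed_amount": 80},
]

report = reconcile(
    snowflake_rows, databricks_rows,
    business_key=["claim_id"], exclude={"claim_hk"},
)
assert report == {
    "missing_in_target": [],
    "unexpected_in_target": [],
    "mismatched": [],
}
```

Comparing the same rows on `claim_hk` would report every row as different even though nothing of business meaning has changed.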
Each layer had its own comparison notebook, with schema-driven column exclusions generated from INFORMATION_SCHEMA for fields like surrogate keys and load timestamps. The test harness shipped with the migration code.
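One way to derive such exclusions from column metadata is sketched below. The input mimics rows selected from `INFORMATION_SCHEMA.COLUMNS` (`table_name`, `column_name`, `ordinal_position`); the naming patterns for surrogate keys and load timestamps are assumptions for illustration, not R1's actual conventions.

```python
import re

# Assumed naming conventions for columns that should not be compared.
EXCLUDE_PATTERNS = [
    r"_hk$",                        # surrogate hash keys
    r"^hash_diff$",                 # hash diffs
    r"^load_(ts|dts|timestamp)$",   # load timestamps
]


def columns_to_compare(column_metadata, table_name, patterns=EXCLUDE_PATTERNS):
    """Return the comparable columns of a table, in ordinal order."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    keep = []
    for row in sorted(column_metadata, key=lambda r: r["ordinal_position"]):
        if row["table_name"].lower() != table_name.lower():
            continue
        if any(p.search(row["column_name"]) for p in compiled):
            continue
        keep.append(row["column_name"])
    return keep


# Rows as they might come back from a query on INFORMATION_SCHEMA.COLUMNS.
metadata = [
    {"table_name": "SAT_CLAIM", "column_name": "CLAIM_HK", "ordinal_position": 1},
    {"table_name": "SAT_CLAIM", "column_name": "CLAIM_ID", "ordinal_position": 2},
    {"table_name": "SAT_CLAIM", "column_name": "BILLED_AMOUNT", "ordinal_position": 3},
    {"table_name": "SAT_CLAIM", "column_name": "LOAD_TS", "ordinal_position": 4},
    {"table_name": "HUB_PATIENT", "column_name": "PATIENT_ID", "ordinal_position": 1},
]

assert columns_to_compare(metadata, "sat_claim") == ["CLAIM_ID", "BILLED_AMOUNT"]
```

Driving the exclusions from metadata means a newly added column is compared by default, and only columns matching a known non-deterministic pattern are skipped.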
On Databricks, the largest cost drivers were addressed through targeted rewrites: joining against narrow staging dimensions instead of wide mart tables, applying incremental predicates to restrict MERGE targets, and splitting a single wide model into parallel sub-models where the dependency graph allowed it.
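As an illustration of restricting a MERGE target with an incremental predicate, the sketch below builds a Delta-style `MERGE` statement as a string. The table names, key column, and date window are hypothetical, and the function is a simplification rather than the SQL that dbt generates.

```python
def build_merge_sql(target, source, key_columns, target_predicate=None):
    """Build a MERGE statement, optionally restricting the target rows scanned.

    target_predicate is a condition on the target alias `t`. It is only safe
    when every target row that a source row could match satisfies it;
    otherwise unmatched source rows are inserted as duplicates.
    """
    conditions = [f"t.{col} = s.{col}" for col in key_columns]
    if target_predicate:
        conditions.append(f"({target_predicate})")
    on_clause = "\n   AND ".join(conditions)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"    ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


sql = build_merge_sql(
    target="mart.claim_summary",
    source="staging.claim_summary_delta",
    key_columns=["claim_id"],
    target_predicate="t.load_date >= '2024-01-01'",
)
print(sql)
```

The extra condition lets the engine skip target files outside the window instead of scanning the whole table, which is where the cost saving comes from.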
If you're about to run a migration of this scale, there are a few questions worth asking your delivery team in week one:
Cost optimization starts well before the bill arrives. Delivery risk starts well before the first failed validation run. In large-scale migrations, both are shaped by the decisions made before the first model moves: dependency mapping, execution strategy, validation design, data layout, and workload sizing.
Twelve weeks. 847 models migrated. Roughly 77% lower cost per run, with additional upside expected on Serverless.
R1’s Snowflake-to-Databricks migration succeeded because the team did the hard alignment work upfront: architecture, execution sequencing, validation, and engineering standards. That discipline made the effort more than a workload migration. It became a way to improve the operating model itself, with pipelines treated as production software: versioned, tested, measured, and continuously optimized.