Ask a data engineer to draw the architecture of their stack, and you’ll often see the same two things: OLTP systems on the left, a data warehouse or lakehouse on the right. Transactions on one side, analytics on the other.
What’s missing? The middle.
Operational teams, such as support agents, fraud analysts, ops managers, and field service coordinators, don’t need analytics on stale data. They need the current state of the world, right now, to act on. Sub-minute freshness. Fast queries. Data from multiple source systems integrated into a single operational view.
The Operational Data Store (ODS) was designed specifically for this. The concept is decades old, but the need has never gone away. Most modern stacks just don’t name it or build it cleanly – and they pay the price in fragile workarounds.
Lakebase on Databricks is a natural fit for a modern ODS.
The book Building the Operational Data Store (W. H. Inmon, Claudia Imhoff, and Greg Battas) defines ODS as “a subject-oriented, integrated, current, volatile collection of data used to support tactical decision-making.”
The key tenets from this definition are:

- Subject-oriented: organized around core business entities (customer, order, account), not around individual applications.
- Integrated: data from multiple source systems reconciled into a single operational view.
- Current: it reflects the state of the world now, not deep history.
- Volatile: records are updated in place as the source systems change.
- Tactical: it supports immediate operational decisions, not long-range analysis.

An ODS is not a data warehouse (it holds current state, not full history), not a replacement for the source OLTP systems, and not a reporting mart for trend analysis.
The ODS is a current-state data product layer. It’s an operational integration store that sits between transactional systems and the analytical layer.
Figure 1: ODS Positioning
A practical routing test: if the question is about the current state of a specific entity and the answer drives an immediate action, it belongs in the ODS. If it’s about trends, aggregates, or patterns across time or population, it belongs in the lakehouse.
| Ask the ODS (Lakebase) | Ask the Data Warehouse / Lakehouse |
|---|---|
| **Operational – current state** | |
| The operations command center needs the current fulfillment status of order #12345 across order, payment, and shipment systems. | Supply chain leadership dashboard: What was our order fulfillment rate last quarter? |
| The fraud console shows that the customer’s account is currently flagged for fraud. | Risk analytics team: Which fraud patterns are most prevalent this year? |
| The support dashboard looks for open support cases for customer X. | Customer success leadership quarterly review: How do support resolution times trend by customer tier? |
| Financial workflows need to know a customer’s current risk score before extending credit. | Fraud data science team: How has fraud model accuracy changed month-over-month? |
| **Operational – inventory and field** | |
| The store manager wonders what the current sellable inventory is for product Y at store X (on-hand, reserved, and in-transit). | Merchandising Director: Which products have the highest stockout frequency by region? |
| The customer support rep needs to know which technicians have worked on work order Z in the last week. | Service delivery dashboard: Which regions have the highest SLA breach rates? |
| **Cross-system integration** | |
| Daily, the account manager checks the combined view of her customers across CRM, billing, and support. | Growth analytics team: How does retention differ between customers active in both CRM and ecommerce? |
| While resolving a payment issue, the support escalation team looks for a matching payment record in the payments system against a sales order. | FinOps dashboard: What percentage of orders have mismatched payment records by channel over the last year? |
| The relationship manager needs a consolidated account status across all product lines for a customer. | Product marketing monthly review: Which product line combinations have the highest cross-sell success rate? |
Most organizations have an ODS need, though few build it intentionally. Instead, they end up with a collection of fragile workarounds:

- Point-to-point pipelines copying data between operational apps.
- Ad hoc caches and app-local replicas fed by hand-rolled jobs.
- Warehouse or lakehouse tables abused as an app backend for current-state lookups.
- Reporting queries aimed directly at production OLTP databases.
The result is operational pain that compounds:

- Inconsistent data definitions across apps, because each team defines “customer status” differently.
- Duplicated ETL logic – the same joins written in five different places.
- Production OLTP systems under load from reporting queries, because there is no single authoritative current-state view.
You end up with the ODS pattern anyway, just without naming it, designing it, or governing it.
Figure 2: ODS Anti-Pattern
Lakebase is a managed Postgres service inside the Databricks platform. That combination of capabilities maps cleanly onto ODS requirements:
| ODS Need | Lakebase Capability |
|---|---|
| Low-latency operational queries | Standard Postgres interface, millisecond response |
| Current-state entity storage | Upserts, transactional writes, ACID semantics |
| Relational integration across entities | Relational joins across Postgres tables |
| Operational app connectivity | Standard Postgres drivers – no special client needed |
| CDC/streaming ingestion | Kafka/Debezium events flow through streaming ETL (Structured Streaming or Lakeflow Spark Declarative Pipelines) into the lakehouse (or into Lakebase directly), then sync into Lakebase via Synced Tables |
| Lakehouse integration | Native syncing from Unity Catalog via Synced Tables |
| Safe schema evolution | Branching lets you test schema changes against a copy of the data before applying them to production |
The branching capability deserves emphasis. In a live operational system, schema changes are risky. Lakebase branches let you fork the database, test a migration, and promote it confidently – without any downtime to the operational apps reading the main branch.
Here’s what a well-designed Lakebase ODS architecture looks like end-to-end:
The lakehouse retains the full history and remains the system for heavy analytical computation. Lakebase serves the operational ‘present’.
Figure 3: Reference Architecture
The Synced Tables feature is what makes Lakebase particularly compelling for ODS design. It lets you create a Postgres table in Lakebase that automatically mirrors a Unity Catalog table. No pipelines to maintain, no reverse ETL logic to write.
From an ODS perspective, this is the key primitive because it creates a clean, governed split:

- The lakehouse owns transformation, history, and governance: curated gold tables are built once, under Unity Catalog.
- Lakebase owns serving: those tables appear as read-only Postgres tables, sitting alongside the writable native tables the apps own.
Some important things to know about Synced Tables:

- Synced tables are read-only in Postgres; changes happen upstream in the lakehouse (app writes belong in separate native tables).
- Sync can run as a one-time snapshot, on a triggered/scheduled basis, or continuously – choose the mode per table based on its freshness tier.
- Freshness is bounded by the sync mode, so surface the sync timestamp to consumers rather than letting them assume real-time data.
The ODS superpower is query-time joins between synced curated data and native writable operational tables.
Logic computed once in the lakehouse surfaces instantly at query time in the operational context, without duplicating that logic in every app.
Another useful pattern is to bring in two synced tables from two different OLTP domains and join them in Lakebase at query time.
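A minimal sketch of that query-time join, using SQLite in place of Lakebase Postgres purely for illustration: `customer_risk` plays the role of a synced (lakehouse-computed) table and `support_case` a native, app-written table. All table and column names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for a synced table: risk scores computed once in the lakehouse.
conn.execute(
    "CREATE TABLE customer_risk (customer_id TEXT PRIMARY KEY, risk_score REAL)"
)
conn.executemany(
    "INSERT INTO customer_risk VALUES (?, ?)",
    [("cust-1", 0.92), ("cust-2", 0.12)],
)

# Stand-in for a native table: open cases written by the support app.
conn.execute(
    "CREATE TABLE support_case (case_id TEXT PRIMARY KEY, customer_id TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO support_case VALUES (?, ?, ?)",
    [("C-1", "cust-1", "open"), ("C-2", "cust-2", "open")],
)

# Query-time join: the app sees lakehouse intelligence next to its own state,
# without re-implementing the risk logic anywhere in the app.
rows = conn.execute("""
    SELECT c.case_id, c.customer_id, r.risk_score
    FROM support_case c
    JOIN customer_risk r USING (customer_id)
    WHERE c.status = 'open'
    ORDER BY r.risk_score DESC
""").fetchall()
print(rows)   # [('C-1', 'cust-1', 0.92), ('C-2', 'cust-2', 0.12)]
```

The same join shape works unchanged in Postgres; only the connection layer differs.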
Figure 4: Synced Tables Zoom-In
These are the read-only backbone of the ODS, synced from gold-layer Unity Catalog tables – for example, customer 360 profiles, current risk scores and fraud flags, consolidated account status, and current inventory positions.
For operational context that needs recent – but not full – history: for example, case events from the last few days, recent payment attempts, or technician visits on a work order in the last week.
Keep these scoped. Long historical analysis belongs in the lakehouse.
These are app-owned: writable Postgres tables for state the operational apps themselves produce, such as case annotations, work-queue assignments, and manual overrides.
Writes to native tables should be idempotent so retries are safe. Use INSERT ... ON CONFLICT DO UPDATE (Postgres upsert / merge) with a stable primary key.
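A minimal sketch of the idempotent-write pattern, using SQLite (whose `ON CONFLICT ... DO UPDATE` syntax matches Postgres) in place of Lakebase; the `case_status` schema is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE case_status (
        case_id    TEXT PRIMARY KEY,   -- stable business key
        status     TEXT NOT NULL,
        updated_at TEXT NOT NULL
    )
""")

# Upsert keyed on the stable primary key: a replayed event converges to
# the same row instead of failing or creating a duplicate.
UPSERT = """
    INSERT INTO case_status (case_id, status, updated_at)
    VALUES (?, ?, ?)
    ON CONFLICT (case_id) DO UPDATE SET
        status     = excluded.status,
        updated_at = excluded.updated_at
"""

conn.execute(UPSERT, ("C-1", "open", "2024-01-01T10:00:00Z"))
conn.execute(UPSERT, ("C-1", "open", "2024-01-01T10:00:00Z"))      # retry: same effect
conn.execute(UPSERT, ("C-1", "resolved", "2024-01-01T11:00:00Z"))  # later update wins

rows = conn.execute("SELECT case_id, status FROM case_status").fetchall()
print(rows)   # [('C-1', 'resolved')]
```

Because the write is a pure function of the key and payload, the producer can safely retry on timeouts without deduplication logic.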
Rather than hard deleting rows, use a deleted_at timestamp or is_active flag. This keeps audit trails intact for operational workflows.
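The soft-delete convention can be sketched like this (SQLite again standing in for Postgres; the `case_note` schema is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE case_note (
        note_id    INTEGER PRIMARY KEY,
        body       TEXT NOT NULL,
        deleted_at TEXT              -- NULL while the note is live
    )
""")
conn.execute("INSERT INTO case_note (note_id, body) VALUES (1, 'first note')")
conn.execute("INSERT INTO case_note (note_id, body) VALUES (2, 'second note')")

# 'Delete' by stamping deleted_at instead of removing the row.
conn.execute(
    "UPDATE case_note SET deleted_at = '2024-01-02T09:00:00Z' WHERE note_id = 1"
)

live = conn.execute(
    "SELECT note_id FROM case_note WHERE deleted_at IS NULL"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM case_note").fetchone()[0]
print(live, total)   # [(2,)] 2 -- the audit trail keeps both rows
```

Operational queries filter on `deleted_at IS NULL`; audit and replay workflows can still see every row.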
Make freshness explicit and queryable. Every synced table should carry a column identifying when the source record was committed in the upstream system, and a way to see when Synced Tables last refreshed the row into Lakebase. Apps and dashboards can then surface warnings when data exceeds its freshness budget.
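One way to sketch that staleness check in the app layer – thresholds, timestamps, and field names are all illustrative:

```python
from datetime import datetime, timedelta, timezone

def staleness(source_committed_at: datetime, synced_at: datetime,
              now: datetime, budget: timedelta) -> dict:
    """Report end-to-end lag for a row and flag it if the freshness budget is blown."""
    lag = now - source_committed_at              # age of the fact itself
    sync_lag = synced_at - source_committed_at   # time spent in the sync pipeline
    return {"lag": lag, "sync_lag": sync_lag, "stale": lag > budget}

report = staleness(
    source_committed_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    synced_at=datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc),
    now=datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc),
    budget=timedelta(minutes=2),
)
print(report["stale"])   # True: the row is 5 minutes old against a 2-minute budget
```

Separating `lag` from `sync_lag` also tells you whether staleness comes from the pipeline or from the source system itself.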
Index for your operational query patterns – not analytical scan patterns:

- customer_id lookups (support cockpit)
- case_status + assigned_to (work-queue filtering)
- risk_score + flagged_at (fraud triage ordering)

Don’t copy analytical index strategies. Tune for the app.
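Sketching the index side with SQLite DDL standing in for Postgres (the table and index names are illustrative, matching the patterns above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE support_case (
        case_id     TEXT PRIMARY KEY,
        customer_id TEXT,
        case_status TEXT,
        assigned_to TEXT
    )
""")

# Point-lookup and work-queue indexes, matching how the apps actually query.
conn.execute("CREATE INDEX ix_case_customer ON support_case (customer_id)")
conn.execute("CREATE INDEX ix_case_queue ON support_case (case_status, assigned_to)")

names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'ix_%' ORDER BY name"
)]
print(names)   # ['ix_case_customer', 'ix_case_queue']
```

The composite `(case_status, assigned_to)` index serves the work-queue filter in one scan; a wide analytical index on every column would only slow down the operational writes.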
Different parts of your ODS will have different freshness requirements. Inspired by the ODS class framework from Corporate Information Factory by W. H. Inmon, Claudia Imhoff, and Ryan Sousa, here’s how those tiers map to Lakebase:
| Freshness tier | Latency | Pattern | Lakebase fit |
|---|---|---|---|
| Real-time | Seconds or lower | Near-synchronous with source; minimal transformation | Zerobus into the lakehouse + continuous Synced Tables, or streaming writes into Lakebase |
| Near-real-time | Minutes | Store-and-forward; meaningful integration possible | Scheduled/triggered Synced Tables |
| Batch | Hours / daily | Overnight processing; rich integration and transformation | Scheduled Synced Tables; daily pipeline outputs |
| Feedback | Irregular / event-driven | Lakehouse-derived scores and segments pushed back to the ODS | Gold-layer Unity Catalog tables synced into Lakebase |
Lakebase is especially strong for real-time and near-real-time serving (where freshness is critical) and for the feedback tier (where the lakehouse enriches the ODS with computed intelligence).
Most real-world implementations span multiple tiers: fraud signals at real-time, support enrichment at near-real-time, reference dimensions at batch.
The ODS is easy to abuse once teams realize it’s fast and queryable. Avoid:

- Long-horizon analytical queries and heavy aggregations – those belong in the lakehouse.
- Accumulating full history “just in case” – keep recent-history tables tightly scoped.
- Treating the ODS as the system of record – the source OLTP systems and the lakehouse keep those roles.

With Lakebase specifically: don’t try to write to synced tables – they are read-only by design, and app writes belong in native tables.
The ODS never went away. We just stopped drawing it on architecture diagrams – and replaced it with a tangle of point-to-point pipelines, ad hoc caches, and abused warehouse tables.
Lakebase makes it straightforward to build this layer cleanly inside the Databricks ecosystem: sources stream into the lakehouse, curated current-state tables sync into Lakebase, and operational apps query Lakebase over standard Postgres.
That’s a much better pattern than forcing your warehouse to act like an app database or letting operational dashboards hammer production OLTP.
If your team is wrestling with operational analytics, support tooling, fraud workflows, or real-time ops visibility, you don’t need a new concept. You need the concept the industry already figured out, rebuilt cleanly on a modern stack.
Build the ODS. Use Lakebase.