One Platform for Ops + Analytics: Lakebase in a CPG Retail Lakehouse

Rishabh-Pandey
Databricks MVP

We run a large-scale retail analytics platform in the CPG domain on AWS Databricks. For a long time we lived with an architectural compromise most data teams know well: a lakehouse for analytics, and a separate operational database for everything else. The operational layer handled the things that needed to be fast and transactional, such as pipeline job statuses, data quality alerts, and user audit logs. But keeping that data accessible to our analytics layer meant ETL sync jobs, latency gaps, and two separate governance models to maintain. Every time an analyst needed to correlate a data quality event with a downstream metric, they had to fight the architecture to do it.

Lakebase changed the equation. We moved our operational tables to Lakebase Provisioned (serverless Postgres running natively within Databricks), and those tables were automatically materialized as Delta tables in the lakehouse. The sync problem disappeared because there was no longer a sync job to run. Ops and analytics teams now query the same data under the same Unity Catalog policies, with no coordination overhead between them.

The latency win was the goal. The cost win was the bonus: with scale-to-zero, we are billed only during active pipeline runs, not for infrastructure sitting idle in between. For a customer-facing engagement, reducing infrastructure cost without reducing capability is exactly the kind of outcome that builds long-term trust.

For any architect working in CPG or retail on AWS: if operational and analytical data sprawl is slowing you down, Lakebase is the most elegant solution we've found to that problem.
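To make the "correlate a data quality event with a downstream metric" case concrete, here is a minimal sketch of that query once both tables sit in one place. It is illustrative only: an in-memory SQLite database stands in for the shared catalog so the example runs anywhere, and the table names (`dq_alerts`, `daily_sales_metrics`), column names, and sample rows are all made up for the example rather than taken from our platform.

```python
import sqlite3

# In-memory SQLite stands in for the shared catalog. All table names, column
# names, and rows below are hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dq_alerts (alert_id INTEGER, dataset TEXT, alert_date TEXT, rule TEXT);
CREATE TABLE daily_sales_metrics (dataset TEXT, metric_date TEXT, net_sales REAL);
INSERT INTO dq_alerts VALUES
  (1, 'pos_sales', '2024-03-02', 'null_store_id');
INSERT INTO daily_sales_metrics VALUES
  ('pos_sales', '2024-03-01', 1000.0),
  ('pos_sales', '2024-03-02', 640.0),
  ('pos_sales', '2024-03-03', 990.0);
""")


def metrics_with_alerts(connection):
    """Return each daily metric with any data quality rule that fired that day."""
    query = """
        SELECT m.metric_date, m.net_sales, a.rule
        FROM daily_sales_metrics AS m
        LEFT JOIN dq_alerts AS a
          ON a.dataset = m.dataset AND a.alert_date = m.metric_date
        ORDER BY m.metric_date
    """
    return connection.execute(query).fetchall()


rows = metrics_with_alerts(conn)
for metric_date, net_sales, rule in rows:
    # Days without an alert show "-" in the rule column.
    print(metric_date, net_sales, rule or "-")
```

The point of the sketch is the shape of the query: one join between an operational table and an analytical table, with no export or sync step in between. On Databricks the same join would be written against tables registered in Unity Catalog rather than SQLite.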

Rishabh Pandey