A Quick Word Before We Dive In

In Part-1 we introduced the Databricks Lakebase architecture — essentially a PostgreSQL-compatible OLTP layer that sits next to Delta tables inside the Databricks Lakehouse. If that's new to you, start here to learn how to spin it up, connect a psql client, and load a starter dataset into its Postgres-compatible front-end. With the environment in place, it's time to answer the next logical question:

“How does Lakebase behave under real OLTP pressure, and how does that compare to a well‑known managed Postgres?”

This article walks through the methodology, command lines, and metrics captured during benchmarking.

 

Why Benchmarking Matters

Anecdotes and marketing slides are helpful, but nothing beats an empirical workload run under controlled conditions. Benchmarks reveal:

| Trade-off | What we learn from benchmarks |
|---|---|
| Latency vs. throughput | When does response time rise as you chase higher TPS? |
| Scalability limits | Does performance collapse once the buffer cache is cold? |
| Operational complexity | How do connection limits, poolers and locking behave at high concurrency? |

Benchmarking surfaces trade-offs and behavioural differences so you can decide what matters for your application. With that in mind, this article records what we observed when running the same pgbench script against:

  • Databricks Lakebase
  • AWS Aurora (PostgreSQL engine)

About the Benchmarking Tool - pgbench

You can benchmark a Postgres‑compatible engine in many ways: custom micro‑services, JVM stress tests, and so on. For this study we use pgbench, the canonical tool that ships with PostgreSQL itself:

  • Generates a mix of single‑row selects, updates, and account transfers.
  • Lets you plug in a custom script to better mimic your schema (ours can be found in the repo below; a minimal sketch follows this list).
  • Produces TPS and latency histograms that are easy to parse and visualise.
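
The exact custom_test.sql used in this benchmark lives in the repo linked below. Purely as a hypothetical sketch of the shape such a script takes (assuming a simple accounts(id, balance) table with 4 million rows and an index on id, which is not necessarily the real schema), it could look like this:

# Hypothetical sketch only; the real custom_test.sql is in the repo below.
# Assumes a table accounts(id bigint PRIMARY KEY, balance bigint) with 4 M rows.
cat > custom_test.sql <<'SQL'
-- pick a random key for each transaction
\set id random(1, 4000000)
-- indexed point look-up
SELECT balance FROM accounts WHERE id = :id;
-- indexed single-row update
UPDATE accounts SET balance = balance + 1 WHERE id = :id;
SQL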

Benchmarking

Let's dive into benchmarking Databricks Lakebase against AWS Aurora.

Repository & Reproducibility

The complete procedure for reproducing the Databricks Lakebase benchmark is available in the GitHub repo: https://github.com/dediggibyte/diggi_lakebase

Benchmark repo — README
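
To follow along on your own driver VM, cloning the repo is the starting point (its internal layout beyond the README isn't reproduced here):

# Fetch the benchmark scripts and docs; the README walks through the full setup.
git clone https://github.com/dediggibyte/diggi_lakebase.git
cd diggi_lakebase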

Environment

Hardware / Configuration:

| Dimension | Lakebase | Aurora DSQL (PostgreSQL) |
|---|---|---|
| Compute | 1 CU (Capacity Unit), 16 GB RAM | 1 router + 8 shards, db.r8g.large (8 vCPU each) |
| Storage | Delta cache (NVMe SSD) | gp3 100 GiB, 3k IOPS |
| Region | us-east-2 (Ohio) | us-east-2 (Ohio) |
| Client VM | c7g.xlarge, same AZ | c7g.xlarge, same AZ |

Executions

Databricks Lakebase - 240s Run:
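# Flag summary for the runs below:
#   -n  skip pgbench's built-in vacuum step (we drive a custom script, not the default tables)
#   -f  transaction script to execute (custom_test.sql)
#   -T  run duration in seconds
#   -c  number of concurrent client sessions
#   -j  worker threads on the driver VM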

pgbench -n \
  -h "$LAKEBASE_HOST" -p "$LAKEBASE_PORT" -U "$PGUSER" \
  -f custom_test.sql \
  -T 240 \
  -c 180 \
  -j 6 \
  "$PGDATABASE"

Environment variables exported — Lakebase (240s run)
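
If the screenshot doesn't render for you, the exports behind this command look roughly like the sketch below. Every value is a placeholder; substitute your own Lakebase connection details and credentials:

# Placeholder values only: copy the real endpoint, port, user and database
# name from your Lakebase instance's connection details.
export LAKEBASE_HOST="<your-lakebase-instance-host>"
export LAKEBASE_PORT="5432"
export PGUSER="<your-databricks-user>"
export PGPASSWORD="<your-token-or-password>"
export PGDATABASE="<your-database>"
export PGSSLMODE="require"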

Databricks Lakebase - 180s Run:

pgbench -n \
  -h "$LAKEBASE_HOST" -p "$LAKEBASE_PORT" -U "$PGUSER" \
  -f custom_test.sql \
  -T 180 \
  -c 180 \
  -j 6 \
  "$PGDATABASE"

Environment variables exported — Lakebase (180s run)

AWS Aurora (PGSQL) - 240s Run:

pgbench -n \
  -h "$AURORA_HOST" -p "$AURORA_PORT" -U "$PGUSER" \
  -f custom_test.sql \
  -T 240 \
  -c 180 \
  -j 6 \
  "$PGDATABASE"

Environment variables exported — Aurora (240s run)

AWS Aurora (PGSQL) - 180s Run:

pgbench -n \
  -h "$AURORA_HOST" -p "$AURORA_PORT" -U "$PGUSER" \
  -f custom_test.sql \
  -T 180 \
  -c 180 \
  -j 6 \
  "$PGDATABASE"

Environment variables exported — Aurora (180s run)
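
Since the four invocations differ only in host and duration, a small wrapper loop keeps the runs consistent and preserves the raw output for later parsing. A sketch, reusing the variables above:

# Run both durations against one engine and keep the raw pgbench output.
# Shown for Lakebase; swap in AURORA_HOST / AURORA_PORT for the Aurora runs.
for secs in 180 240; do
  pgbench -n \
    -h "$LAKEBASE_HOST" -p "$LAKEBASE_PORT" -U "$PGUSER" \
    -f custom_test.sql -T "$secs" -c 180 -j 6 \
    "$PGDATABASE" | tee "lakebase_${secs}s.log"
done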

Interpreting the pgbench Summary

| Field | Example | Take-away |
|---|---|---|
| Scaling factor | 1 | pgbench's built-in scale setting; the custom script instead runs against the 4 M rows loaded into each database, so the reported factor stays at 1. |
| Clients | 180 | Simultaneous sessions hitting the server. |
| Threads | 6 | Worker threads on the benchmark driver; keep ≤ driver CPU cores. |
| Duration | 240 s | Timed, steady-state window after a 2 s ramp-up. |
| Transactions processed | 373 785 | Total committed transactions; divided by the elapsed run time (pgbench excludes the initial connection time) this yields the TPS figure. |
| Latency average | 103.60 ms | Mean client-perceived response time. |
| Failed transactions | 0 (0 %) | Count of deadlocks or serialisation failures; none occurred. |
| Initial connection time | 24 952 ms | One-off cost of opening 180 connections. |
| TPS | 1737 | The headline throughput number. |
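
A quick way to pull those headline fields out of a saved run, e.g. the log captured by the loop above (the field labels match pgbench 14+ output; older versions word them slightly differently):

# Extract the headline numbers from a saved log, e.g. the 240 s Lakebase run.
grep -E 'latency average|initial connection time|^tps' lakebase_240s.log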

Workload details

| Parameter | Value |
|---|---|
| Tool | pgbench 16.9 |
| Script | custom_test.sql - random look-ups + indexed updates |
| Dataset | 4 000 000 rows (scale ~100) |
| Concurrency | 180 clients (both engines) |
| Threads | 6 (-j 6, matches vCPU of driver VM) |
| Durations | 180 s and 240 s runs |
| Repeats | 3 runs each; medians reported |
| Failures | 0 % in every run |
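
The actual seeding procedure is in the repo (and Part-1 covers loading a starter dataset). Purely as an illustrative sketch, reusing the hypothetical accounts table from the script above, loading 4 million rows can be as simple as:

# Hypothetical seeding sketch; the real load script lives in the repo.
psql "host=$LAKEBASE_HOST port=$LAKEBASE_PORT user=$PGUSER dbname=$PGDATABASE" <<'SQL'
CREATE TABLE IF NOT EXISTS accounts (
  id      bigint PRIMARY KEY,
  balance bigint NOT NULL DEFAULT 0
);
INSERT INTO accounts (id)
SELECT g FROM generate_series(1, 4000000) AS g;
SQL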

Results at-a-Glance (4 Million‑Row Dataset, 180 Clients)

| Engine | Run length | TPS (median) | Avg latency | Txns processed |
|---|---|---|---|---|
| Lakebase | 180 s | 1731 | 103.97 ms | 267 613 |
| Lakebase | 240 s | 1737 | 103.60 ms | 373 785 |
| Aurora PostgreSQL | 180 s | 1509 | 119.28 ms | 241 034 |
| Aurora PostgreSQL | 240 s | 1508 | 119.37 ms | 331 148 |

Key take-aways:

  • Flat lines: Both engines kept TPS almost flat between the 180 s and 240 s runs, indicating the buffer cache stayed warm.
  • Latency delta: Lakebase averaged ~15 ms lower latency per transaction at the same concurrency.
  • Clean runs: Zero failed or aborted transactions across all tests.

What Else is Observed?

  1. Region affinity: Our first Lakebase attempt used a driver VM in another AZ; TPS cratered by ~50 %. Lesson: keep client and database in the same AZ for OLTP benchmarks (a quick round-trip check follows this list).
  2. Data‑volume resilience: A pilot with only 1M rows clocked 1880 TPS on Lakebase. Bumping to 4 M rows shaved off ~8 % — a healthy sign.
  3. Connection spikes: Spooling up 180 new sessions took 20–25 s on both engines. Harmless for steady workloads; something to watch for burst‑and‑idle patterns.
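
A cheap way to catch the region-affinity issue before burning a full run is to time a trivial query from the driver VM:

# Rough network sanity check: a cross-AZ or cross-region client shows a
# noticeably higher wall-clock time for this trivial round-trip.
time psql "host=$LAKEBASE_HOST port=$LAKEBASE_PORT user=$PGUSER dbname=$PGDATABASE" \
  -c "SELECT 1;" > /dev/null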

Deep‑Dive: Parameter Tuning

| Knob | Databricks Lakebase | AWS Aurora | Why it matters |
|---|---|---|---|
| shared_buffers | Ignored | 75 % RAM | Aurora benefits from a large shared cache; Lakebase handles buffering internally. |
| work_mem | 4 MB | 32-64 MB | Impacts join & sort spilling; not hit in our micro-benchmark. |
| max_connections | 1024 hard cap | 500 per router | Dictates pooler settings. |
| Autovacuum | Auto | Auto | Neither engine needed vacuum tweaks for this workload. |
| Connection pooling | Advised | Advised | Smooths bursty client behaviour. |
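
These knobs can be inspected on either engine with a read-only query against pg_settings (whether a given setting is actually honoured is engine-specific, as the table notes):

# Read-only check of the settings discussed above; safe to run on both engines.
psql "host=$LAKEBASE_HOST port=$LAKEBASE_PORT user=$PGUSER dbname=$PGDATABASE" -c "
  SELECT name, setting, unit
  FROM pg_settings
  WHERE name IN ('shared_buffers', 'work_mem', 'max_connections', 'autovacuum');"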

Observability Shortcuts

Lakebase UI

Monitor ▶︎ Lakebase shows live TPS, P95 latency, active connections, and storage utilisation %.

Databricks Lakebase Metrics

Aurora

CloudWatch metrics (DatabaseConnections, SelectLatency, CommitLatency) plus pg_stat_statements for top queries.

AWS CloudWatch Metrics
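
For the pg_stat_statements piece, a query along these lines surfaces the heaviest statements (column names as in PostgreSQL 13+; the extension must be enabled on the cluster):

# Top five statements by cumulative execution time.
psql "host=$AURORA_HOST port=$AURORA_PORT user=$PGUSER dbname=$PGDATABASE" -c "
  SELECT calls,
         round(total_exec_time::numeric, 1) AS total_ms,
         round(mean_exec_time::numeric, 2)  AS mean_ms,
         left(query, 60)                    AS query
  FROM pg_stat_statements
  ORDER BY total_exec_time DESC
  LIMIT 5;"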

Conclusion - Key Takeaways and What’s Next

In this post we ran a head‑to‑head pgbench benchmark on a 4 million‑row dataset—same script, same client count—against Databricks Lakebase and AWS Aurora (PostgreSQL). From seeding data to reading the latency histogram, a few things stood out:

  • Identical workload, distinct personalities: Lakebase’s vectorised execution path edged out Aurora on average latency (~15 ms per transaction) while both engines held steady throughput around 1.5–1.7 k TPS with zero failures.
  • Topology still matters: Keeping the driver VM in the same AZ as the database doubled Lakebase TPS versus an earlier cross‑AZ trial — a reminder that network round‑trips still rule OLTP.
  • Good defaults get you far: Out‑of‑the‑box settings (no shared_buffers tuning, no custom autovacuum) were enough to clear enterprise‑grade throughput on both platforms.
  • Connection spikes are the new cold start: Spooling up 180 sessions took ~20–25 s for both engines. If your workload bursts from zero, a pooler is mandatory.
  • Schema awareness pays dividends: Lakebase lost only ~8 % TPS when scaling from 1 M to 4 M rows, underscoring the value of tight indexing over brute‑force hardware.

Caveats & Future Work

  • Lakebase: Cross‑region DR, backup limits, and fail‑over speeds are still being hardened.
  • Chaos testing: An induced Aurora Limitless router fail‑over recovered in < 30 s; a forced Lakebase database restart recovered in ~ 20 s (smaller footprint, but worth retesting at GA).
  • Next stop - Part 3: We'll put a price-tag on these TPS numbers, dive into reserved-instance math, and see how each database autoscales (and bills) when the workload starts and stops. Stay tuned!
 
Disclaimer:
These results reflect each engine’s default configuration. Feedback is welcome — send your ideas and we’ll happily rerun the tests with any community‑driven tweaks.