Databricks Community

antoalphi · ‎04-08-2026

In Databricks Lakeflow Connect for MySQL (currently in public preview), Databricks recommends limiting each ingestion pipeline to around 250 tables, with validated testing up to 1 TB of snapshot data.

However, in real-world enterprise scenarios, customers often have significantly larger environments: for example, thousands of tables (e.g., 6,000–7,000) and data volumes of multiple terabytes.

To accommodate this, we are required to create multiple ingestion pipelines. Since each pipeline typically provisions its own compute resources (clusters), this can lead to:

Increased infrastructure costs due to multiple clusters running in parallel
Higher operational overhead in managing multiple pipelines
Customer dissatisfaction due to perceived inefficiency and cost escalation
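
For a sense of scale, here is a rough back-of-the-envelope sketch in Python. It assumes the roughly 250-table recommendation is treated as a hard cap per pipeline and ignores the 1 TB snapshot figure; the function name is only illustrative.

```python
import math

# Recommended table count per ingestion pipeline, as quoted above.
TABLES_PER_PIPELINE = 250


def pipelines_needed(table_count: int, per_pipeline: int = TABLES_PER_PIPELINE) -> int:
    """Minimum number of pipelines if each holds at most `per_pipeline` tables."""
    return math.ceil(table_count / per_pipeline)


for table_count in (6000, 7000):
    print(table_count, "tables ->", pipelines_needed(table_count), "pipelines")
```

So an environment of 6,000 to 7,000 tables would need somewhere around 24 to 28 pipelines on table count alone, before data volume is considered.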

This raises an important challenge:
How can we design a scalable ingestion strategy that handles large table volumes and data sizes efficiently, while minimizing compute cost and avoiding unnecessary cluster proliferation?

Sumit_7 · ‎04-08-2026

@antoalphi I think you have already answered this in your first line: Public Preview, meaning it is not fully developed for general, real-world use. It therefore comes with limitations or bugs that should be addressed by General Availability. Still, you can consider the following points to handle this effectively:

Group tables - split pipelines by schema / domain / size (not randomly)
Use incremental (CDC) instead of full snapshot - reduces compute drastically
Orchestrate pipelines (Databricks Workflows) - run sequentially or staggered to avoid many clusters at once
Use serverless pipelines (where supported) - reduces cluster management overhead
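
As a rough illustration of the grouping and staggering points above, here is a minimal planning sketch in plain Python. It does not call any Databricks API. It assumes you already have an inventory of (schema, table, size in GB), for instance estimated from MySQL's information_schema.TABLES, and it treats 250 tables and 1 TB (taken here as 1000 GB) as per-pipeline budgets, which is my reading of the guidance rather than a documented hard limit. All names (PipelinePlan, plan_pipelines, waves) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PipelinePlan:
    """One planned ingestion pipeline (hypothetical planning object)."""
    schema: str
    tables: list = field(default_factory=list)
    size_gb: float = 0.0


def plan_pipelines(tables, max_tables=250, max_gb=1000.0):
    """Group (schema, table, size_gb) rows into per-schema pipeline plans.

    Tables never cross schema boundaries. Within a schema, tables are placed
    largest first into the first plan that still has room for them.
    A single table larger than max_gb still gets a plan of its own.
    """
    by_schema = defaultdict(list)
    for schema, name, size_gb in tables:
        by_schema[schema].append((name, size_gb))

    plans = []
    for schema in sorted(by_schema):
        schema_plans = []
        largest_first = sorted(by_schema[schema], key=lambda t: t[1], reverse=True)
        for name, size_gb in largest_first:
            for plan in schema_plans:
                fits_count = len(plan.tables) < max_tables
                fits_size = plan.size_gb + size_gb <= max_gb
                if fits_count and fits_size:
                    break
            else:
                # No existing plan has room: open a new one for this schema.
                plan = PipelinePlan(schema)
                schema_plans.append(plan)
            plan.tables.append(name)
            plan.size_gb += size_gb
        plans.extend(schema_plans)
    return plans


def waves(plans, max_concurrent):
    """Split plans into consecutive waves of at most max_concurrent pipelines."""
    return [plans[i:i + max_concurrent] for i in range(0, len(plans), max_concurrent)]


# Made-up inventory: 600 small tables in one schema, two large tables in another.
inventory = [("sales", f"t{i}", 1.0) for i in range(600)]
inventory += [("hr", "big", 900.0), ("hr", "small", 200.0)]

plans = plan_pipelines(inventory)
for wave_no, wave in enumerate(waves(plans, max_concurrent=2), start=1):
    print(wave_no, [(p.schema, len(p.tables), p.size_gb) for p in wave])
```

Each wave could then map to one step of a sequential job, so that only max_concurrent pipelines hold compute at the same time. How the pipelines are actually created and triggered depends on your workspace setup and is left out here.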

Hope this helps, thanks.
