Migrating large on-premises ETL workflows to Databricks often goes wrong when teams try to "lift and shift" legacy logic directly into Spark. Poor data layout, many tiny files, and inefficient partitioning quickly degrade performance, so restructuring the data and adopting Delta Lake early is crucial. Many teams also underestimate the need to redesign pipelines for distributed processing rather than porting step-by-step ETL as-is. Cluster sizing, cost control, and missing orchestration features (dependencies, retries, alerts) are other common pain points, and security mapping and schema evolution issues can cause outright failures. The keys are to optimize data structures, modernize transformations, establish proper workflow orchestration, and test with realistic data volumes.
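To make the orchestration point concrete, here is a minimal pure-Python sketch of the three features that legacy cron-style ETL usually lacks: dependency-aware execution, retries with backoff, and alerting on final failure. This is an illustration of the pattern, not the Databricks Workflows API; all function and parameter names (`run_with_retries`, `run_pipeline`, `max_retries`, `base_delay`, `alert`) are hypothetical, and in practice a platform scheduler such as Databricks Workflows or Airflow handles this for you.

```python
import time


def run_with_retries(task, name, max_retries=2, base_delay=1.0, alert=print):
    """Run a task callable, retrying with exponential backoff; alert on final failure."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_retries:
                # Last attempt exhausted: raise after notifying, so upstream can react.
                alert(f"ALERT: task '{name}' failed after {attempt + 1} attempts: {exc}")
                raise
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying


def run_pipeline(tasks, deps, **retry_kwargs):
    """Run tasks in dict order (assumed topological); skip tasks whose upstream failed.

    tasks: mapping of task name -> zero-arg callable
    deps:  mapping of task name -> list of upstream task names
    Returns (succeeded, failed) sets of task names.
    """
    succeeded, failed = set(), set()
    for name, task in tasks.items():
        if any(upstream in failed for upstream in deps.get(name, [])):
            failed.add(name)  # propagate failure downstream instead of running blindly
            continue
        try:
            run_with_retries(task, name, **retry_kwargs)
            succeeded.add(name)
        except Exception:
            failed.add(name)
    return succeeded, failed
```

The useful property to replicate in any real scheduler is the skip-on-upstream-failure behavior: a failed transform should stop the load step rather than letting it run against stale or partial data.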