What are the key data engineering problems that Databricks solves?

harrisriaz
New Contributor

What problems does Databricks address from a typical data engineering perspective, and how does it compare with other cloud data engineering tools?

-werners-
Esteemed Contributor III

Can't speak for others, but for me it is a combination of things:

  • autoscaling
  • tuned Spark clusters
  • extra features like Databricks SQL, MLflow, etc.
  • frequent updates
  • reliability

I am probably forgetting a few things, but another great asset (this is mainly Spark-related, not specific to Databricks) is that you can use Python/Scala instead of SQL.

Don't get me wrong: SQL is an excellent tool, but when things get complicated I prefer a general-purpose programming language over SQL.
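
To illustrate that preference, here is a minimal sketch in plain Python (no Spark required) of what a general-purpose language makes easy: small, parameterized transformation steps that can be tested on their own and composed into a pipeline. All names here (`drop_missing_amount`, `add_amount_eur`, `keep_at_least`, `run_pipeline`, the `orders` data and the exchange rate) are hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_missing_amount(rows: List[Row]) -> List[Row]:
    """Remove rows that have no amount."""
    return [r for r in rows if r.get("amount") is not None]


def add_amount_eur(rate: float) -> Step:
    """Build a step that adds a converted amount using the given rate."""
    def step(rows: List[Row]) -> List[Row]:
        return [{**r, "amount_eur": round(r["amount"] * rate, 2)} for r in rows]
    return step


def keep_at_least(min_eur: float) -> Step:
    """Build a step that keeps rows at or above a threshold."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r["amount_eur"] >= min_eur]
    return step


def run_pipeline(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each step in order, feeding the output of one into the next."""
    return reduce(lambda acc, step: step(acc), steps, rows)


# Hypothetical input data and rate, for illustration only.
orders = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 10.0},
]
pipeline = [drop_missing_amount, add_amount_eur(0.5), keep_at_least(10.0)]
result = run_pipeline(orders, pipeline)
print(result)
```

Each step is an ordinary function, so it can be unit tested, reused, or reordered without touching the others. In PySpark the same pattern can be expressed by chaining functions with `DataFrame.transform`, whereas in pure SQL this kind of logic tends to end up as nested subqueries or long chains of CTEs.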

Rheiman
Contributor II

Annoying things Databricks solves:

  1. Sane data movement (fast parallelized compute, table versioning and history)
  2. Environment management (Spark, Delta, and Java are installed out of the box)
  3. Cost and job monitoring (Overwatch)

I've only worked with it for 6 months, but it's really a platform you can build internal practices on with little overhead.
