06-28-2022 10:26 PM
What problems does Databricks address from a typical data engineering perspective, and how does it compare with other cloud DE tools?
- Labels: Data Engineering
06-29-2022 12:42 AM
Can't speak for others, but for me it is a combination of things:
- autoscaling
- tuned spark clusters
- extra features like Databricks SQL, MLflow, etc.
- frequent updates
- reliability
I've probably forgotten a few things. Another great asset (this is mainly down to Spark, not Databricks specifically) is that you can use Python or Scala instead of SQL.
Don't get me wrong: SQL is an excellent tool, but when things get complicated I prefer a general-purpose programming language over SQL.
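To illustrate that last point, here is a minimal sketch in plain Python (not Spark, and nothing Databricks-specific). The helpers `drop_missing`, `rename`, and `run_pipeline` and the column names are hypothetical, made up for this example. The idea is that cleaning rules can be generated from a list and composed in a loop, where SQL would typically need each condition written out by hand.

```python
from functools import reduce
from typing import Callable, Dict, List

# Rows are plain dicts here; in Spark these would be DataFrame rows (assumption
# for illustration only).
Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_missing(column: str) -> Step:
    """Build a step that removes rows where `column` is missing or None."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r.get(column) is not None]
    return step


def rename(old: str, new: str) -> Step:
    """Build a step that renames the key `old` to `new` in every row."""
    def step(rows: List[Row]) -> List[Row]:
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]
    return step


def run_pipeline(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each step in order, feeding the output of one into the next."""
    return reduce(lambda acc, s: s(acc), steps, rows)


# The list of required columns drives the pipeline; adding a column is a
# one-word change instead of another hand-written WHERE clause.
required = ["id", "amount"]
steps = [drop_missing(c) for c in required] + [rename("amount", "amount_eur")]

rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": None, "amount": 5.0},
]
cleaned = run_pipeline(rows, steps)
```

Each step is an ordinary function, so it can be unit tested and reused on its own, which is the part that tends to get painful with large SQL scripts.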
06-29-2022 10:44 PM
Annoying things Databricks solves:
- Sane Data Movement (Fast Parallelized Compute, Table Versioning and History)
- Environment Management (Spark, Delta, and Java are installed out of the box)
- Cost and Job Monitoring (Overwatch)
I've only worked with it for 6 months, but it's really a platform you can build internal practices on with little overhead.