10-24-2022 05:18 AM
Besides the CI/CD link I posted above, there is not a lot to be found.
A lot of companies deploy their scripts using jars. When using that approach you can apply software engineering practices to your scripts.
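As a rough illustration of what that buys you (shown in Python for brevity; the same idea applies to code packaged in a jar): once the logic lives in a plain function inside a package instead of a notebook cell, it can be unit tested, reviewed, and versioned like any other code. This is only a sketch. The function name, the fields, and the business rule are hypothetical, and plain dicts stand in for DataFrame rows.

```python
from typing import Dict, Iterable, List


def add_order_total(rows: Iterable[Dict]) -> List[Dict]:
    """Return copies of the rows with a computed 'total' field.

    Hypothetical business rule: total = quantity * unit_price.
    """
    return [{**row, "total": row["quantity"] * row["unit_price"]} for row in rows]


def test_add_order_total() -> None:
    rows = [{"quantity": 2, "unit_price": 5.0}]
    result = add_order_total(rows)
    assert result == [{"quantity": 2, "unit_price": 5.0, "total": 10.0}]
    # The input rows are copied, not mutated.
    assert "total" not in rows[0]


if __name__ == "__main__":
    test_add_order_total()
```

The test can run in a CI pipeline without a cluster, which is the main point of moving logic out of the notebook.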
That being said: Databricks is focusing heavily on notebooks (and Python) and slowly steering away from the 'old' Spark way of working.
It would be awesome to have a SE notebook framework, something that is lacking right now.
There is dbx, and Delta Live Tables (Python/SQL only), but those are far from ideal.
What IS very valuable, though, are 'git for data' frameworks like lakeFS or Nessie. With these you can always use production data for dev/QA and commit your changes to branches (which you can then merge or not).
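To make the branch idea concrete, here is a toy in-memory model of that workflow. It is NOT the lakeFS or Nessie API; the class and method names are made up for illustration, and the merge is deliberately naive (no conflict detection).

```python
import copy


class ToyDataRepo:
    """In-memory stand-in for a branchable data store (not a real lakeFS/Nessie client)."""

    def __init__(self, main_tables: dict):
        self.branches = {"main": copy.deepcopy(main_tables)}

    def create_branch(self, name: str, source: str = "main") -> None:
        # Real tools typically branch via metadata instead of copying data;
        # a deep copy just keeps this toy simple.
        self.branches[name] = copy.deepcopy(self.branches[source])

    def write(self, branch: str, table: str, rows: list) -> None:
        # Replace the table contents on one branch only.
        self.branches[branch][table] = list(rows)

    def merge(self, source: str, target: str = "main") -> None:
        # Naive merge: tables from the source branch overwrite the target's.
        self.branches[target].update(copy.deepcopy(self.branches[source]))

    def delete_branch(self, name: str) -> None:
        # Discarding a branch leaves every other branch untouched.
        del self.branches[name]


repo = ToyDataRepo({"orders": [{"id": 1}]})
repo.create_branch("dev")                      # dev starts from production data
repo.write("dev", "orders", [{"id": 1}, {"id": 2}])
assert repo.branches["main"]["orders"] == [{"id": 1}]  # main is unaffected
repo.merge("dev")                              # or repo.delete_branch("dev") to discard
```

The point is that dev/QA work happens against a full copy of production state, and nothing reaches main until you decide to merge.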