08-23-2022 12:17 AM
How to Develop Locally on Databricks with your Favorite IDE
dbx is a Databricks Labs project that lets you develop code locally in your favorite IDE, such as VS Code, PyCharm, IntelliJ, or Eclipse, and then submit it to Databricks interactive and job compute clusters (AWS | Azure | GCP).
dbx is an extension of the Databricks CLI. It makes it easy to manage multiple execution environments and deployment configurations, and it provides pre-built templates for integration with popular CI tools such as GitHub Actions, Azure DevOps, and GitLab.
Databricks also has an official extension for VS Code that executes locally written code against job or all-purpose clusters. In addition, there is an official Databricks Driver for SQLTools in VS Code that lets you browse SQL objects and run SQL queries in Databricks workspaces from VS Code.
Let us know in the comments if you have had a chance to test out dbx or our VS Code plugins for local IDE development!
08-23-2022 12:32 AM
I use it for local development of our libraries. It works fine, but I have not yet used it to submit to clusters.
08-30-2022 07:58 AM
I've found "dbx" really interesting. In particular, it makes interactions with Databricks from the local environment very smooth. I love how the dbx documentation describes the entire development process: it's the first time I've seen support for good engineering practices in the development phase.
There's one use case that dbx is not helping me with. I would like to develop a model locally using only pyspark while accessing the data on DBFS. I'm willing to use "dbutils", but to run it locally I need 'databricks-connect', which doesn't support Databricks Runtime 11, the latest one. Is there any other way to use dbutils locally?
08-31-2022 09:56 AM
Hi @Matias Marenchino, unfortunately you cannot run dbutils locally, but you can use dbx execute against an interactive cluster for a more interactive development experience.
01-03-2023 06:51 AM
We could build a helper function that detects whether we are running on generic pyspark or on a Databricks cluster. That way, when the Databricks dbutils aren't available, we'd have a stand-in that would allow us to work disconnected until our code is ready to deploy to a cluster. @Isaac Gritz
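A rough sketch of what such a helper might look like. It assumes that Databricks clusters set the DATABRICKS_RUNTIME_VERSION environment variable, and it uses a hypothetical LocalFs class that mimics only a small subset of dbutils.fs on the local disk. The names on_databricks, LocalFs, get_fs, and the ./dbfs_local folder are made up for illustration and are not part of dbx or dbutils.

```python
import os
from pathlib import Path


def on_databricks() -> bool:
    # Assumption: Databricks clusters export DATABRICKS_RUNTIME_VERSION.
    return "DATABRICKS_RUNTIME_VERSION" in os.environ


class LocalFs:
    """Hypothetical local stand-in for a small subset of dbutils.fs."""

    def __init__(self, root):
        self.root = Path(root)

    def _resolve(self, path):
        # Map "dbfs:/foo/bar" or "/foo/bar" onto a folder under the local root.
        if path.startswith("dbfs:"):
            path = path[len("dbfs:"):]
        return self.root / path.lstrip("/")

    def mkdirs(self, path):
        self._resolve(path).mkdir(parents=True, exist_ok=True)
        return True

    def put(self, path, contents, overwrite=False):
        target = self._resolve(path)
        if target.exists() and not overwrite:
            raise FileExistsError(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(contents)
        return True

    def head(self, path, max_bytes=65536):
        # Return up to max_bytes from the start of the file as text.
        with open(self._resolve(path), "rb") as handle:
            return handle.read(max_bytes).decode("utf-8", errors="replace")

    def ls(self, path):
        # Simplification: returns sorted names, not the FileInfo
        # objects that the real dbutils.fs.ls returns.
        return sorted(entry.name for entry in self._resolve(path).iterdir())


def get_fs(dbutils=None, local_root="./dbfs_local"):
    # On a cluster, pass in the notebook's dbutils; elsewhere use the stand-in.
    if on_databricks() and dbutils is not None:
        return dbutils.fs
    return LocalFs(local_root)
```

Locally you would call fs = get_fs() and work against the ./dbfs_local folder. On a cluster you would call get_fs(dbutils) so the same code goes through the real dbutils.fs. Only the calls that behave the same in both places are safe to share, so code that inspects the result of ls would need extra care.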
11-18-2022 06:51 AM
dbx is great for deployment, but hopefully Spark Connect will be released as soon as possible.
01-04-2023 01:04 PM
I'm actually not a fan of dbx. I prefer the AWS Glue interactive sessions way of using the IDE. It's exactly like the web notebook experience. I can see the reason why dbx exists, but I'd still like to use a regular notebook experience in my IDE.