11-28-2025 09:05 PM
Hi everyone,
I'm trying to set up a smooth local-development workflow for Databricks and would love to hear how others are doing it.
I do most of my development in Cursor (VS Code-based editor) because the AI agents make coding much faster.
After development, I push code to Git, then open Databricks and pull the repo, and only then can I run and test the code inside a Databricks notebook or job.
This back-and-forth is slow; I'd like to run and test directly from my local IDE if possible.
I saw that the Databricks docs mention a VS Code extension, but:
Cursor doesn't seem to allow me to install this extension.
Not sure if Cursor supports Databricks extensions at all.
Has anyone successfully used the Databricks VS Code extension inside Cursor?
I also tried Databricks Connect. Tutorials show connecting to a Personal Compute cluster, but:
In my organization, compute is owned by a principal / service account.
I'm added as a user, but when I list clusters through Databricks Connect, I don't see any clusters.
So the connect step fails.
I'm not sure whether this is a permissions issue, or whether Databricks Connect only works with Personal Compute.
How are you all developing locally and executing code in Databricks?
Do you run code locally against DBFS / clusters, or do you push to repos and test in notebooks?
Does Databricks Connect work with shared or service-principal-owned clusters?
Or only with Personal Compute?
Is there any known workaround to make the VS Code extension work in Cursor?
Is there any other method Iām missing for local development + remote execution?
Any advice, examples, or even your workflow setups would be super helpful.
Thanks!
11-30-2025 08:24 AM - edited 11-30-2025 08:25 AM
Hi @adhi_databricks ,
Since Cursor is based on open-source VS Code, you should be able to install the Databricks extension.
Check the video below. It shows how you can set up Cursor to work with Databricks locally.
11-30-2025 12:32 PM
@adhi_databricks: I want to add my perspective on pure local development (without Databricks Connect).
I wanted to set up a local development environment without connecting to a Databricks workspace or cloud storage: develop PySpark code in VS Code using local Spark and GenAI, and only afterwards deploy the code to a Databricks workspace for Data Engineering.
We faced the following challenges, whether or not we used GenAI.
1) Notebook Architecture
- We import notebooks using %run, and linting support in notebooks is minimal to none. We hit runtime errors for syntax, and sometimes indentation issues, that should have been caught during a formatting and linting phase.
- Preparing coverage reports, running code-quality checks with tools like Sonar, and executing test cases is challenging.
- Linting, testing, code-quality analysis, and coverage checks must be done before the code is pushed to the Databricks workspace in a test or prod environment.
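One workaround for the %run linting gap is to move shared helpers into plain .py modules and import them; plain imports are understood by linters, formatters, and test runners alike, and Databricks Repos support importing workspace files directly. The module and function names here are made up for illustration:

```python
# utils.py -- a plain module committed next to the notebook in the repo,
# replacing a `%run ./utils_notebook` include (names are illustrative).

def add_ingest_metadata(rows, source):
    """Tag each record (a dict) with the source system it came from."""
    return [{**row, "_source": source} for row in rows]

# In the notebook / job, instead of `%run ./utils_notebook`:
#   from utils import add_ingest_metadata
#   tagged = add_ingest_metadata(records, "crm")
```

Because the helper is a normal module, Black/Ruff/pytest all see it locally, so syntax and indentation problems surface before the code ever reaches a cluster.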
2) Strong dependencies on cloud storage, Delta, and the meta-store (Hive or UC)
- It is difficult or impossible to mock cloud storage, Delta, and the meta-store.
- We are unable to test all parts of the code because of these dependencies.
We were able to overcome some of these issues with the following approach:
1) Create a pure Python package that holds the framework or reusable code, i.e. code that does not change often. This is formatted, linted, tested, and built into a wheel package. It is a common package with its own lifecycle, and developers install it locally in a Python venv.
2) Create the features/changes to be introduced in the current release/sprint. These will eventually become a wheel too, but for now can live as separate modules in VS Code, importing what they need from the common package. Code formatting, linting, and unit testing can be automated for these changes, with GenAI help.
3) Dependency handling: a Delta Lake/Hive abstraction implemented using Docker. This is time-consuming but possible. When developers start VS Code, the Docker container can start as well, making the dependencies ready. It is not a smooth setup and still has issues.
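For the Delta part specifically, Docker may not even be needed: the open-source `delta-spark` package can enable Delta on a purely local Spark session (Docker remains useful mainly for a standalone Hive metastore). A sketch, assuming `pyspark` and a matching `delta-spark` are installed in the local venv; the warehouse path is an arbitrary local folder:

```python
# Sketch of a local Delta-enabled Spark session, no workspace or cloud
# storage involved. Assumes open-source pyspark + delta-spark are installed.

def local_delta_session(warehouse_dir="/tmp/local-warehouse"):
    """Build a local[*] Spark session with Delta Lake enabled."""
    from delta import configure_spark_with_delta_pip
    from pyspark.sql import SparkSession

    builder = (
        SparkSession.builder.appName("local-dev")
        .master("local[*]")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .config("spark.sql.warehouse.dir", warehouse_dir)
    )
    # configure_spark_with_delta_pip pulls in the matching Delta JARs
    return configure_spark_with_delta_pip(builder).getOrCreate()
```

Tables created through such a session behave like Delta tables, so most read/write code paths can be exercised locally before anything is pushed to the workspace.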
Hope this helps