
Set up Databricks Connect on VSCode and PyCharm

seefoods
Valued Contributor

Hello guys,

Does anyone know the best practices to set up Databricks Connect for PyCharm and VSCode using Docker, a Justfile, and a .env file?


Cordially, 
Seefoods

1 REPLY

Gecofer
Contributor

Hi @seefoods!

I've worked with Databricks Connect and VSCode in different projects, and although your question mentions Docker, Justfile and .env, the "best practices" really depend on what you're trying to do. Here's what has worked best for me:

1.- Databricks Connect → for local development inside VSCode

I mainly use Databricks Connect to:

  • run Spark code locally
  • get autocompletion and type hints
  • debug transformations
  • quickly test code before sending it to a Databricks cluster

It's perfect for the development loop inside VSCode.
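As a minimal sketch of that loop (assuming databricks-connect is installed and your cluster/profile is already configured; the sample table name is just an example):

from databricks.connect import DatabricksSession

# Spark session whose queries execute remotely on your Databricks cluster
spark = DatabricksSession.builder.getOrCreate()

# Develop and debug locally against real data
df = spark.read.table("samples.nyctaxi.trips")
df.filter(df.trip_distance > 10).show(5)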

2.- Databricks VSCode extension → for running real workloads on Databricks

For me, this is the cleanest workflow when you want to run things on the platform:

  • run scripts/notebooks on a cluster
  • browse catalogs, tables, repos
  • switch profiles
  • see job logs and outputs

So the combination works really well:

Databricks Connect → local dev experience

VSCode extension → execute remotely on Databricks

3.- Use .databrickscfg for VSCode authentication

(See the Databricks docs: "What is the Databricks extension for Visual Studio Code?")

The Databricks VSCode extension expects credentials in ~/.databrickscfg:

[DEFAULT]
host = https://<workspace-url>
token = <PAT> 

Or multiple profiles:

[dev] 
host = https://dev-workspace 
token = <token-dev> 

[prod] 
host = https://prod-workspace 
token = <token-prod> 

VSCode reads this automatically. So for the plugin, .databrickscfg is definitely the best practice.
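A nice side effect is that Databricks Connect and the Python SDK read the same file, so you can switch environments by profile name. A sketch (assuming databricks-connect and databricks-sdk are installed, and the [dev] profile above exists):

from databricks.connect import DatabricksSession
from databricks.sdk import WorkspaceClient

# Databricks Connect session bound to the "dev" profile
spark = DatabricksSession.builder.profile("dev").getOrCreate()

# SDK client using the same profile, e.g. to list clusters
w = WorkspaceClient(profile="dev")
for c in w.clusters.list():
    print(c.cluster_name)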

4.- When building external services (FastAPI, Uvicorn, etc.), .env is useful

For microservices calling Databricks Model Serving, I do use .env, for example:

DATABRICKS_HOST=https://<workspace-url> 
DATABRICKS_TOKEN=<token> 
MODEL_SERVING_ENDPOINT=/serving-endpoints/my-model/invocations

Two options to load the .env:

1) Use --env-file directly with Uvicorn (supported natively). This works perfectly inside a Justfile:

run: 
  uvicorn app.main:app --host 0.0.0.0 --port 8080 --env-file .env 
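With that recipe in place (assuming just is installed and the Justfile sits at the project root), starting the service is simply:

just run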
2) Load .env inside your FastAPI app:

from dotenv import load_dotenv
import os

load_dotenv()  # loads the variables from .env into os.environ

host = os.getenv("DATABRICKS_HOST")
token = os.getenv("DATABRICKS_TOKEN")
endpoint = os.getenv("MODEL_SERVING_ENDPOINT")
In this case, your Justfile stays minimal:

run:
  uvicorn app.main:app --host 0.0.0.0 --port 8080

And your service reads the .env programmatically.
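To close the loop, here is a sketch of how such a service might call the Model Serving endpoint using those variables (the payload shape depends on your model's signature; dataframe_records with the hypothetical fields feature_a/feature_b is just one of the accepted formats):

import os
import requests

host = os.getenv("DATABRICKS_HOST")
token = os.getenv("DATABRICKS_TOKEN")
endpoint = os.getenv("MODEL_SERVING_ENDPOINT")

# POST to https://<workspace-url>/serving-endpoints/my-model/invocations
response = requests.post(
    f"{host}{endpoint}",
    headers={"Authorization": f"Bearer {token}"},
    json={"dataframe_records": [{"feature_a": 1.0, "feature_b": 2.0}]},
    timeout=30,
)
response.raise_for_status()
print(response.json())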

 

At the end of the day, both approaches are valid; in my experience, which one you use really depends on the specific use case. For local development and running things directly on Databricks, I prefer the VSCode plugin + .databrickscfg. For external services (FastAPI, Uvicorn, Justfile deployments), .env works perfectly.

So depending on what you want to build, you will naturally choose one approach or the other.

 

Hope it helps! 

Gema 👩‍💻