Problem in VS Code Extension

DatabricksEngi1
Contributor

Until a few days ago, I was working with Databricks Connect using the VS Code extension, and everything worked perfectly.

In my .databrickscfg file, I had authentication configured like this:

[name]
host:
token:

When I ran my code, everything worked fine - it connected to Serverless by default.

Then, I added a new authentication profile in .databrickscfg because I wanted to use a specific cluster instead of Serverless:

[name]
host:
token:
cluster_id:

Since then, I haven’t been able to run any code from VS Code.
I keep getting the following error:

[CONNECT_URL_NOT_SET] Cannot create a Spark Connect session because the Spark Connect remote URL has not been set. Please define the remote URL by setting either the 'spark.remote' option or the 'SPARK_REMOTE' environment variable.

Even after reverting .databrickscfg to the previous working configuration, it still doesn’t work.

I also tried setting the SPARK_REMOTE environment variable (which I hadn’t done before), but now the code starts running and then fails after about a minute with a “maximum retries exceeded” message.

Has anyone else encountered this issue or knows how to fix it?
Thanks in advance.

K_Anudeep
Databricks Employee

Hello @DatabricksEngi1 ,

Which DBR version and Databricks Connect version are you using?

CONNECT_URL_NOT_SET occurs when a Spark Connect session is created without a connect URL being specified. I suspect you have run into Databricks Connect’s config-resolution rules, or the .databrickscfg file is somehow malformed. I tried this internally, and it works fine for me.
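
To see what might be feeding the config-resolution rules mentioned above, one option is to list the connection-related environment variables visible to the Python process that VS Code launches. This is only a minimal sketch: SPARK_REMOTE comes from the error message in this thread, while the other variable names are assumed to be the ones the Databricks tooling reads, and connection_env is a hypothetical helper name.

```python
import os

# Assumption: these are the variables Databricks Connect / the Databricks SDK
# may consult. Only SPARK_REMOTE is named in this thread.
CANDIDATE_VARS = (
    "SPARK_REMOTE",
    "DATABRICKS_CONFIG_PROFILE",
    "DATABRICKS_CONFIG_FILE",
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "DATABRICKS_CLUSTER_ID",
    "DATABRICKS_SERVERLESS_COMPUTE_ID",
)

# Values that can carry a token are masked before they are returned.
SECRET_VARS = {"DATABRICKS_TOKEN", "SPARK_REMOTE"}


def connection_env(environ=None):
    """Return the candidate variables that are set, with secrets masked."""
    env = os.environ if environ is None else environ
    found = {}
    for name in CANDIDATE_VARS:
        value = env.get(name)
        if value:
            found[name] = "***" if name in SECRET_VARS else value
    return found


if __name__ == "__main__":
    for name, value in connection_env().items():
        print(f"{name}={value}")
```

Running this from the same interpreter and terminal that the extension uses shows whether a leftover variable (for example a SPARK_REMOTE set while experimenting) is still present alongside the profile.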

Anudeep

DatabricksEngi1
Contributor

Thank you for your answer!

I’m currently on Databricks Connect 15.4.10, and the DBR of the cluster I tried to run on is 17.0.

dkushari
Databricks Employee

Hi @DatabricksEngi1 - Please make sure you have a separate Python venv set up for each Python version that you use with Databricks Connect. I have also provided step-by-step instructions for debugging the issue, clearing the cache, and so on [read the files and instructions carefully before running them]. Below is the code I tested, switching between serverless and classic compute, followed by my profile entry for your reference. Since cluster_id is already in the profile, I do not need to specify it in the code -

from databricks.connect import DatabricksSession

# Option 1: Use cluster_id from .databrickscfg automatically
# Since your [fielddemo] profile has cluster_id configured, just use the profile
spark = DatabricksSession.builder.profile("fielddemo").getOrCreate()

# Option 2: Use serverless compute
# spark = DatabricksSession.builder.profile("fielddemo").serverless().getOrCreate()

df = spark.read.table("dkushari_uc.dais2025.customers_iceberg")
df.show(5)

[fielddemo]
host             = https://workspace.cloud.databricks.com/
token            = dapiXXXXX
jobs-api-version = 2.0
cluster_id       = XXXX-12345-xxxxxx
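
A profile entry like the one above can be sanity-checked before starting a session. The sketch below uses only the Python standard library to parse .databrickscfg and report which compute each profile appears to point at. It is an illustration, not how Databricks Connect itself resolves configuration: describe_profiles is a hypothetical helper, and the serverless_compute_id field name is an assumption that does not appear in this thread.

```python
import configparser
from pathlib import Path


def describe_profiles(cfg_text):
    """Map each profile name to the compute its fields suggest."""
    # interpolation=None keeps '%' characters in tokens from being
    # treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    names = list(parser.sections())
    if parser.defaults():
        # Include [DEFAULT] only when it actually has entries.
        names.insert(0, parser.default_section)

    summary = {}
    for name in names:
        section = parser[name]
        if section.get("cluster_id"):
            summary[name] = "cluster " + section["cluster_id"]
        elif section.get("serverless_compute_id"):  # assumed field name
            summary[name] = "serverless"
        else:
            summary[name] = "no compute set"
    return summary


if __name__ == "__main__":
    path = Path.home() / ".databrickscfg"
    if path.exists():
        for name, compute in describe_profiles(path.read_text()).items():
            print(f"[{name}] -> {compute}")
```

If the file is malformed (for example, the same profile name appears twice), configparser raises an error, which is a useful hint in its own right.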

 


DatabricksEngi1
Contributor

Thank you very much!

I cloned the repo into a new folder, recreated the venv, and reinstalled Databricks Connect. It works now.