Problem in VS Code Extension

DatabricksEngi1
New Contributor III

Until a few days ago, I was working with Databricks Connect using the VS Code extension, and everything worked perfectly.

In my .databrickscfg file, I had authentication configured like this:

[name]
host:
token:

When I ran my code, everything worked fine - it connected to Serverless by default.

Then, I added a new authentication profile in .databrickscfg because I wanted to use a specific cluster instead of Serverless:

[name]
host:
token:
cluster_id:

Since then, I haven’t been able to run any code from VS Code.
I keep getting the following error:

[CONNECT_URL_NOT_SET] Cannot create a Spark Connect session because the Spark Connect remote URL has not been set. Please define the remote URL by setting either the 'spark.remote' option or the 'SPARK_REMOTE' environment variable.

Even after reverting .databrickscfg to the previous working configuration, it still doesn’t work.

I also tried setting the SPARK_REMOTE environment variable (which I hadn’t done before), but now the code starts running and then fails after about a minute with a “maximum retries exceeded” message.

Has anyone else encountered this issue or knows how to fix it?
Thanks in advance.
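
For reference, here is a minimal sketch of what a SPARK_REMOTE value for a classic cluster is expected to look like. The connection-string format is taken from the Databricks Connect documentation as I understand it, not from this thread, and `build_spark_remote` is a hypothetical helper name. The thread does not establish what caused the "maximum retries exceeded" failure, so treat this only as a way to check the shape of the value.

```python
from urllib.parse import urlparse


def build_spark_remote(host: str, token: str, cluster_id: str) -> str:
    """Assemble a Spark Connect connection string for a Databricks cluster.

    Assumed format (from the Databricks Connect docs, not from this thread):
    sc://<workspace-host>:443/;token=<token>;x-databricks-cluster-id=<cluster-id>
    """
    # Accept either a bare hostname or a full https:// URL.
    parsed = urlparse(host if "://" in host else f"https://{host}")
    hostname = parsed.hostname
    if not hostname:
        raise ValueError(f"Cannot extract a hostname from {host!r}")
    return f"sc://{hostname}:443/;token={token};x-databricks-cluster-id={cluster_id}"
```

Calling `build_spark_remote(host, token, cluster_id)` with the values from your profile gives a string you can compare against what you exported. Serverless compute is normally selected through the session builder or profile rather than through this string.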

1 ACCEPTED SOLUTION

dkushari
Databricks Employee

Hi @DatabricksEngi1 - Please make sure you have a separate Python venv for each Python version that you use with Databricks Connect. I have also provided step-by-step instructions for debugging the issue, clearing the cache, and so on [read the files and instructions carefully before running them]. Below is the code I tested, switching between serverless and classic compute, followed by my profile entry for reference. Because cluster_id is already set in the profile, I do not need to specify it in the code:

from databricks.connect import DatabricksSession

# Option 1: Use cluster_id from .databrickscfg automatically
# Since your [fielddemo] profile has cluster_id configured, just use the profile
spark = DatabricksSession.builder.profile("fielddemo").getOrCreate()

# Option 2: Use serverless compute
# spark = DatabricksSession.builder.profile("fielddemo").serverless().getOrCreate()

df = spark.read.table("dkushari_uc.dais2025.customers_iceberg")
df.show(5)

[fielddemo]
host             = https://workspace.cloud.databricks.com/
token            = dapiXXXXX
jobs-api-version = 2.0
cluster_id       = XXXX-12345-xxxxxx
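
Before creating a session, it can help to confirm that the profile you think you are using really contains the fields shown above. The following is a minimal sketch, assuming .databrickscfg is a plain INI file that Python's standard configparser can read; `read_profile` and `missing_keys` are hypothetical helper names, not part of Databricks Connect.

```python
import configparser
from pathlib import Path


def read_profile(profile: str, cfg_path: Path = Path.home() / ".databrickscfg") -> dict:
    """Return the key/value pairs of one profile in an INI-style config file."""
    # interpolation=None keeps '%' characters in tokens literal;
    # a dummy default_section lets a [DEFAULT] profile be read like any other.
    parser = configparser.ConfigParser(interpolation=None, default_section="__unused__")
    # parser.read() silently skips missing files, so check its return value.
    if not parser.read(cfg_path):
        raise FileNotFoundError(f"Could not read {cfg_path}")
    if not parser.has_section(profile):
        raise KeyError(f"Profile [{profile}] not found; available: {parser.sections()}")
    return dict(parser[profile])


def missing_keys(settings: dict, required=("host", "token")) -> list:
    """List required keys that are absent or empty in the profile."""
    return [key for key in required if not settings.get(key, "").strip()]
```

For example, `missing_keys(read_profile("fielddemo"))` should return an empty list for the profile above, and `read_profile("fielddemo").get("cluster_id")` shows whether a cluster is configured.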

 


4 REPLIES

K_Anudeep
Databricks Employee

Hello @DatabricksEngi1 ,

Which DBR version and Databricks Connect version are you using?

CONNECT_URL_NOT_SET is raised when a Spark Connect session is created without a connect URL being specified. I suspect you have either run into Databricks Connect's config-resolution rules or your .databrickscfg file is malformed. I tried to reproduce this internally, and it works fine for me.
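
One way to see what the config resolution might be picking up is to list the relevant environment variables in the shell or VS Code terminal that runs your code. This is a minimal sketch; the variable list is my assumption about what the Databricks SDK and Databricks Connect consult, and `connection_env` and `mask` are hypothetical helper names.

```python
import os

# Environment variables that, to my understanding, can influence how a
# connection is resolved (assumed list, not exhaustive).
CANDIDATE_VARS = (
    "SPARK_REMOTE",
    "DATABRICKS_CONFIG_PROFILE",
    "DATABRICKS_CONFIG_FILE",
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "DATABRICKS_CLUSTER_ID",
    "DATABRICKS_SERVERLESS_COMPUTE_ID",
)


def mask(value: str, keep: int = 4) -> str:
    """Hide all but the first few characters of a potentially secret value."""
    return value[:keep] + "***" if len(value) > keep else "***"


def connection_env(environ=os.environ) -> dict:
    """Return the candidate variables that are set, with their values masked."""
    return {name: mask(environ[name]) for name in CANDIDATE_VARS if name in environ}


if __name__ == "__main__":
    for name, shown in connection_env().items():
        print(f"{name} = {shown}")
```

A leftover variable from an earlier experiment (for example a SPARK_REMOTE export) would show up here even after .databrickscfg has been reverted.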

Anudeep

 

Thank you for your answer!

I’m currently on Databricks Connect 15.4.10, and the DBR of the cluster I tried to run on is 17.0.

Thank you very much!

I cloned the repo to a new folder and reinstalled the venv and databricks connect. It works.
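
The thread does not say what was wrong with the original environment. One thing worth checking in a venv that misbehaves is which distributions are installed, since the Databricks Connect installation instructions, as I understand them, ask you to uninstall a standalone pyspark first. A minimal sketch using only the standard library (the helper names are hypothetical):

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> dict:
    """Collect the package versions relevant to a Databricks Connect setup."""
    return {name: installed_version(name) for name in ("databricks-connect", "pyspark")}


def has_conflict(report: dict) -> bool:
    """True when databricks-connect and a separate pyspark are both installed."""
    return report.get("databricks-connect") is not None and report.get("pyspark") is not None
```

Running `has_conflict(environment_report())` inside the interpreter that VS Code uses shows whether both packages are present in that venv.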