Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Hello, everyone. Is there a way to connect to a Databricks cluster through the SSH interpreter in your IDE? I know about Databricks Connect, but I want to execute the entire code on the cluster.

BorislavBlagoev
Valued Contributor III
33 REPLIES

Prabakar
Databricks Employee

Hi @Borislav Blagoev, unfortunately it is not possible to connect to the cluster other than through Databricks Connect.

BorislavBlagoev
Valued Contributor III

Is it possible to execute the entire code on the Databricks cluster instead of only the Spark code?

Prabakar
Databricks Employee

For Spark jobs, you can use Databricks Connect.

To run SQL commands from Python code on Databricks clusters and Databricks SQL endpoints, you can use the Databricks SQL Connector for Python.
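As a rough illustration of the SQL Connector route, the sketch below runs one statement through a DB-API style connection. `open_databricks_connection` and `run_sql` are hypothetical helper names, the hostname, HTTP path, and access token are placeholders you would supply yourself, and it assumes the `databricks-sql-connector` package is installed. Only the SQL statement runs on Databricks; the surrounding Python still runs locally.

```python
def open_databricks_connection(server_hostname, http_path, access_token):
    """Open a connection with the Databricks SQL Connector for Python.

    Assumes `pip install databricks-sql-connector`. The import is deferred so
    the rest of this sketch can be exercised without the package installed.
    """
    from databricks import sql  # third-party package, not part of the stdlib

    return sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        access_token=access_token,
    )


def run_sql(connection, query):
    """Execute one SQL statement on a DB-API style connection and return all rows."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Always release the cursor, even if the statement fails.
        cursor.close()
```

Usage would look roughly like `rows = run_sql(open_databricks_connection(host, path, token), "SELECT 1")`, where `host`, `path`, and `token` come from your own workspace settings.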

BorislavBlagoev
Valued Contributor III

I want to execute Python code as well. The entire code (Spark, SQL, Python).

-werners-
Esteemed Contributor III

Hm, I think plain Python code will run with Databricks Connect (if it is a Python program you are writing), and Spark SQL can be done with spark.sql(...).

Is that what you want to do?

Unfortunately, only the Spark code is executed on the cluster!

-werners-
Esteemed Contributor III

dang, not even the spark.sql("...")?

As I mentioned earlier, only Spark code is executed on the cluster with Databricks Connect. We have an internal feature request to access the Python REPL from the local IDE through Databricks Connect.

I don't know why, but when I try to open that link I get this error: "Unable to sign in". I tried with the same email as here.

@Werner Stinckens, you can execute spark.sql("...") on the cluster, but I want to execute something like this, for example:

collection = [1, 2, 3, 4, 5]
total = 0
for x in collection:
    total += x

stupid example!
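To make the split concrete, here is a minimal sketch of where each piece would run under Databricks Connect as described in this thread. `local_sum` and `cluster_sum` are hypothetical names, and `spark` is assumed to be a SparkSession already configured through Databricks Connect. The loop is ordinary Python, so it runs in the local process; only the DataFrame version is submitted to the cluster as a Spark job.

```python
def local_sum(values):
    """Plain Python loop: with Databricks Connect this runs on the local machine."""
    total = 0
    for x in values:
        total += x
    return total


def cluster_sum(spark, values):
    """Same result expressed as a Spark job, so the cluster does the work.

    `spark` is assumed to be a SparkSession obtained via Databricks Connect.
    """
    # One-column DataFrame built from the local values.
    df = spark.createDataFrame([(x,) for x in values], ["value"])
    # The aggregation runs on the cluster; collect() brings the single row back.
    return df.agg({"value": "sum"}).collect()[0][0]
```

Rewriting logic as Spark operations is a workaround, not a way to run arbitrary Python remotely, which is the limitation discussed in this thread.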

Hi @Borislav Blagoev, you won't be able to access it. As I mentioned in the previous comment, it's an internal feature request and is only available to Databricks employees.

Oh, OK! I didn't understand that, sorry!

Hi @Borislav Blagoev,

Have you checked the list of limitations for Databricks Connect? The docs are here: https://docs.databricks.com/dev-tools/databricks-connect.html#limitations

Limitations

The following Databricks features and third-party platforms are unsupported:

  1. Structured Streaming.
  2. Running arbitrary code that is not a part of a Spark job on the remote cluster.
  3. Native Scala, Python, and R APIs for Delta table operations (for example, DeltaTable.forPath) are not supported. However, the SQL API (spark.sql(...)) with Delta Lake operations and the Spark API (for example, spark.read.load) on Delta tables are both supported.
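The third limitation can be illustrated with a short sketch of the two supported routes. The helper names, the path, and the table name are hypothetical, and `spark` is assumed to be a SparkSession from Databricks Connect; `DESCRIBE HISTORY` is used here only as one example of a Delta Lake SQL operation.

```python
def read_delta_table(spark, path):
    """Read a Delta table through the Spark API, which the docs list as supported."""
    return spark.read.format("delta").load(path)


def delta_history(spark, table_name):
    """Run a Delta Lake operation through the SQL API, also listed as supported.

    The native Python API (for example DeltaTable.forPath) is what the
    limitations above describe as unsupported, so it is avoided here.
    """
    return spark.sql(f"DESCRIBE HISTORY {table_name}")
```

Both helpers only build and submit Spark work, so they stay within what the limitations list allows.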

Yes, that's why I want to use something other than Databricks Connect!
