cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

What's the best way to develop Apache Spark Jobs from an IDE (such as IntelliJ/Pycharm)?

Anonymous
Not applicable

A number of people like developing locally using an IDE and then deploying. What are the recommended ways to do that with Databricks jobs?

1 REPLY 1

Anonymous
Not applicable

The Databricks Runtime and Apache Spark use the same base API. One can create Spark jobs that run locally and have them run on Databricks with all available Databricks features.

It is required that one uses SparkSession.builder.getOrCreate() to create the SparkSession. The SparkSession is created in the Databricks environment and is treated as a singleton.

In addition, one can also test using databricks connect. Databricks connect replaces Apache Spark/Pyspark on your local machine and allows for your local machine to execute jobs on a Databricks cluster. https://docs.databricks.com/dev-tools/databricks-connect.html

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group