cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

How to instantiate Databricks spark context in a python script?

ae20cg
New Contributor III

I want to run a block of code in a script and not in a notebook on databricks, however I cannot properly instantiate the spark context without some error.

I have tried ` SparkContext.getOrCreate()`, but this does not work.

Is there a simple way to do this I am missing?

17 REPLIES 17

If you hit an issue related to an already existing Spark context, you can only have one SparkContext instance in a single JVM. In such cases, you can try the following approach

from pyspark import SparkContext
from pyspark.sql import SparkSession

# Check if a Spark context already exists
try:
    sc = SparkContext.getOrCreate()
    spark = SparkSession(sc)
    print("Using existing Spark context.")
except Exception as e:
    print("No existing Spark context found. Creating a new one.")
    sc = SparkContext()
    spark = SparkSession(sc)

 

Mich Talebzadeh | Technologist | Data | Generative AI | Financial Fraud
London
United Kingdom

view my Linkedin profile



https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my knowledge but of course cannot be guaranteed . It is essential to note that, as with any advice, quote "one test result is worth one-thousand expert opinions (Werner Von Braun)".

Thanks for your responses.
I did try creating a spark context and feed it to create a spark session constructor, but I get an error from databricks which states I should not initialize a new context despite using SparkContext.getOrCreate() Method.
Tried the following as well:

  1. SparkSession.getActiveSession() (returns null)

  2. getOrCreate method on SparkContext and SparkSession ( Asks for Master URL and app name, after which it states the databricks error: Should not intialize session or context in databricks if one already exists.)

ayush007
New Contributor II

Is there some solution for this.We got struck where a cluster having unity catalog is not able to get spark context.This is not allowing to use distributed nature of spark in databricks.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group