Data Engineering

How to instantiate Databricks spark context in a python script?

ae20cg
New Contributor III

I want to run a block of code in a script rather than in a notebook on Databricks; however, I cannot instantiate the Spark context without getting an error.

I have tried `SparkContext.getOrCreate()`, but this does not work.

Is there a simple way to do this that I am missing?

17 REPLIES

Hi @MichTalebzadeh ,

Once again, thanks for your replies.
My Databricks cluster does come with Streamlit preinstalled, and I have been running the script the way you mentioned.
For the time being I am going to try alternatives to Spark, and I will also try with Spark session isolation disabled.

I appreciate you taking the time to respond to this issue.

Only one SparkContext instance can exist in a single JVM. If you hit an error about an already existing Spark context, you can try the following approach, which reuses the existing context instead of creating a second one:

from pyspark import SparkContext
from pyspark.sql import SparkSession

# getOrCreate() returns the SparkContext already running in this JVM,
# or starts one if none exists, so no try/except is needed to tell
# the two cases apart and a second context is never created.
sc = SparkContext.getOrCreate()

# Obtain the session through the builder, which attaches to that context
# rather than calling the SparkSession constructor directly.
spark = SparkSession.builder.getOrCreate()

 

Mich Talebzadeh | Technologist | Data | Generative AI | Financial Fraud
London
United Kingdom


https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun).

Thanks for your responses.
I did try creating a Spark context and passing it to the SparkSession constructor, but I get an error from Databricks stating that I should not initialize a new context, even though I am using the SparkContext.getOrCreate() method.
I tried the following as well:

  1. SparkSession.getActiveSession() (returns None)

  2. The getOrCreate method on both SparkContext and SparkSession (it asks for a master URL and app name, after which it raises the Databricks error: should not initialize a session or context in Databricks if one already exists).
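
For anyone debugging the same thing, the two attempts above can be combined into one small helper that reports which path produced the session. This is only a sketch: `get_spark` is a hypothetical name, not something from this thread or from Databricks, and it assumes PySpark 3.0 or later, where `SparkSession.getActiveSession()` is available. Whether the builder fallback avoids the Databricks error will depend on how the script is launched and on the cluster configuration, which this thread does not settle.

```python
from typing import Any, Tuple


def get_spark() -> Tuple[Any, str]:
    """Return (session, source), where source is "active" or "builder".

    Hypothetical helper for illustration; not part of PySpark or Databricks.
    """
    # Imported inside the function so the script still loads where PySpark is absent.
    from pyspark.sql import SparkSession

    # getActiveSession() returns None when no session is active in this thread.
    active = SparkSession.getActiveSession()
    if active is not None:
        return active, "active"

    # The builder reuses an existing SparkContext if there is one.
    # No master URL or app name is set here (assumption: the environment supplies them).
    return SparkSession.builder.getOrCreate(), "builder"
```

Calling `spark, source = get_spark()` and printing `source` shows whether the script picked up an existing session or had to build one, which may help narrow down where the error comes from.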