Re: How to instantiate Databricks spark context in...

MichTalebzadeh · 03-13-2024

If you hit an error about an already existing Spark context, note that only one SparkContext instance can exist in a single JVM. In that case, you can try the following approach:

from pyspark import SparkContext
from pyspark.sql import SparkSession

# getOrCreate() returns the SparkContext already running in this JVM and
# creates a new one only if none exists, so no try/except is needed.
sc = SparkContext.getOrCreate()

# Go through the builder so an existing session is reused instead of
# constructing a second one around the context.
spark = SparkSession.builder.getOrCreate()
print(f"Using Spark context for application: {sc.appName}")

Mich Talebzadeh | Technologist | Data | Generative AI | Financial Fraud
London
United Kingdom


https://en.everybodywiki.com/Mich_Talebzadeh

Disclaimer: The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun).

Spartan101 · 03-13-2024

Thanks for your responses.
I did try creating a Spark context and passing it to the SparkSession constructor, but I get an error from Databricks stating that I should not initialize a new context, despite using the SparkContext.getOrCreate() method.
I tried the following as well:

SparkSession.getActiveSession() (returns null)
The getOrCreate method on SparkContext and SparkSession (asks for a Master URL and app name, after which it raises the Databricks error: should not initialize a session or context in Databricks if one already exists)
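
For reference, the two attempts above can be combined into one helper that never calls a SparkContext or SparkSession constructor directly. This is only a minimal sketch, not a confirmed fix for the Databricks error: the name get_spark and its session_cls parameter are hypothetical (the parameter exists so the helper can be exercised without a cluster), and it assumes PySpark 3.0 or later, where SparkSession.getActiveSession() is available.

```python
def get_spark(session_cls=None):
    """Return the active Spark session, or get/create one via the builder.

    session_cls is a hypothetical hook for testing; by default the real
    pyspark.sql.SparkSession class is used.
    """
    if session_cls is None:
        # Imported lazily so the helper can be defined without PySpark installed.
        from pyspark.sql import SparkSession as session_cls

    # getActiveSession() returns None when no session is active in this thread.
    active = session_cls.getActiveSession()
    if active is not None:
        return active

    # Fall back to the builder, which reuses an existing session if there is
    # one instead of constructing a new context.
    return session_cls.builder.getOrCreate()
```

In a Databricks notebook the spark variable is already defined, so a helper like this is mainly of interest in imported modules, where it would be called as spark = get_spark().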

ayush007 · 10-03-2024

Is there a solution for this? We got stuck where a cluster with Unity Catalog is not able to get the Spark context. This prevents us from using the distributed nature of Spark in Databricks.