FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.

Andrei_Radulesc
Contributor III

I'm trying to get rid of the warning below:

/databricks/spark/python/pyspark/sql/context.py:117: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.

In my setup, I have a front-end notebook that gets parameters from the user and needs to create a dataframe based on those parameters. The code that creates the dataframe is shared with other notebooks, and is implemented in a regular Python file:

from pyspark.sql import DataFrame, SparkSession, SQLContext

def get_df(days_old: int = None) -> DataFrame:
    sc = SparkSession.builder.getOrCreate()
    sqlc = SQLContext(sc)
    df = sqlc.table("prod.some_schema.some_table")
    return df

If I were to call sqlc.table() directly in the notebook, I would not have to create the Spark session and the SQL context. But if I call it from within a regular Python file, I have to get the Spark session and SQL context, and I can't figure out how to do that without triggering this FutureWarning.

Andrei_Radulesc
Contributor III

It just seems to me that I already call SparkSession.builder.getOrCreate(), and I still get the warning.

Andrei_Radulesc
Contributor III

That fixes it. Thanks. I need to do

spark = SparkSession.builder.getOrCreate()

df = spark.table("prod.some_schema.some_table")

instead of

sc = SparkSession.builder.getOrCreate()

sqlc = SQLContext(sc)

df = sqlc.table("prod.some_schema.some_table")
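
For anyone adapting this in a shared module, here is a minimal sketch of what the whole helper might look like once SQLContext is dropped. The spark and table_name parameters are my own additions for illustration (they are not in the original function), and the unused days_old argument is left out. It assumes pyspark is available whenever no session is passed in.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Needed only for the type hints below; not imported at runtime.
    from pyspark.sql import DataFrame, SparkSession


def get_df(
    spark: Optional[SparkSession] = None,  # hypothetical parameter
    table_name: str = "prod.some_schema.some_table",  # hypothetical parameter
) -> DataFrame:
    """Read a table through the SparkSession, without building a SQLContext."""
    if spark is None:
        # Deferred import: pyspark is only required when no session is passed in.
        from pyspark.sql import SparkSession

        # Returns the active session if one exists, otherwise creates one.
        spark = SparkSession.builder.getOrCreate()
    # SparkSession.table() is used in place of SQLContext.table(), so the
    # deprecated SQLContext constructor is never called.
    return spark.table(table_name)
```

From the notebook, either get_df() or get_df(spark) should work. Passing the session explicitly also makes the helper easier to exercise with a stand-in object in unit tests.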
