I'm trying to get rid of the warning below:
/databricks/spark/python/pyspark/sql/context.py:117: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.
In my setup, I have a front-end notebook that takes parameters from the user and needs to create a DataFrame based on them. The code that creates the DataFrame is shared with other notebooks and lives in a regular Python file:
from pyspark.sql import DataFrame, SparkSession, SQLContext

def get_df(days_old: int = None) -> DataFrame:
    # SparkSession.builder.getOrCreate() returns the session Databricks already created
    sc = SparkSession.builder.getOrCreate()
    # wrapping it in SQLContext is what emits the FutureWarning
    sqlc = SQLContext(sc)
    df = sqlc.table(f"prod.some_schema.some_table")
    return df
If I were to call sqlc.table() directly in the notebook, I would not have to create the Spark session or the SQL context myself, since Databricks provides them. But when I call it from a regular Python file, I have to obtain the Spark session and SQL context explicitly, and I can't figure out how to do that without triggering this FutureWarning.
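The closest I have gotten is a sketch that drops SQLContext entirely and reads the table through the SparkSession itself (the table name is the same placeholder as above, and the spark variable name is just my choice); I am not sure this is the recommended pattern on Databricks:

from pyspark.sql import DataFrame, SparkSession

def get_df(days_old: int = None) -> DataFrame:
    # getOrCreate() attaches to the session the notebook is already running in
    spark = SparkSession.builder.getOrCreate()
    # read the table via the SparkSession, so no SQLContext is ever constructed
    return spark.table("prod.some_schema.some_table")

Since the warning is raised when SQLContext is constructed, I assume this avoids it, but is SparkSession.table() actually a drop-in replacement for SQLContext.table() here, or am I losing something by skipping the SQLContext?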