02-22-2023 04:30 AM
I'm trying to get rid of the warning below:
/databricks/spark/python/pyspark/sql/context.py:117: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.
In my setup, I have a front-end notebook that gets parameters from the user and needs to create a dataframe based on those parameters. The code that creates the dataframe is shared with other notebooks and is implemented in a regular Python file:
def get_df(days_old: int = None) -> DataFrame:
    sc = SparkSession.builder.getOrCreate()
    sqlc = SQLContext(sc)
    df = sqlc.table(f"prod.some_schema.some_table")
    return df
If I called sqlc.table() directly in the notebook, I would not have to create the Spark session and the SQL context myself. But when I call it from a regular Python file, I have to get the Spark session and SQL context, and I can't figure out how to do that without triggering this FutureWarning.
02-22-2023 04:32 AM
It seems to me that I already call SparkSession.builder.getOrCreate(), and I still get the warning.
03-17-2023 01:26 PM
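That fixes it. Thanks. I need to call table() on the Spark session itself:
spark = SparkSession.builder.getOrCreate()
df = spark.table("prod.some_schema.some_table")
instead of wrapping the session in a SQLContext, which is the part that triggers the deprecation warning:
sc = SparkSession.builder.getOrCreate()
sqlc = SQLContext(sc)
df = sqlc.table(f"prod.some_schema.some_table")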
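For anyone landing here later, here is a minimal sketch of how the shared helper might look once SQLContext is dropped. As far as I can tell, the FutureWarning is emitted when SQLContext is constructed, not by getOrCreate(), so reading the table through the session avoids it. The optional spark and table_name parameters and the TABLE_NAME constant are my own additions for illustration (they are not in the original function), and the unused days_old parameter is left out.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Needed only for the type hints below, never at runtime.
    from pyspark.sql import DataFrame, SparkSession

# Table name taken from the thread; the constant itself is an assumption.
TABLE_NAME = "prod.some_schema.some_table"


def get_df(
    spark: Optional[SparkSession] = None,
    table_name: str = TABLE_NAME,
) -> DataFrame:
    """Return the table as a DataFrame without constructing a SQLContext."""
    if spark is None:
        # Imported here so the module can be loaded without pyspark present.
        from pyspark.sql import SparkSession

        # On Databricks this returns the session the notebook already uses.
        spark = SparkSession.builder.getOrCreate()
    # SparkSession.table() replaces the deprecated SQLContext.table().
    return spark.table(table_name)
```

From the notebook you can call get_df() with no arguments, or pass the notebook's own session as get_df(spark) if you prefer to be explicit. Accepting the session as a parameter also makes the helper easy to exercise with a stand-in object in unit tests.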