BUG - withColumns in PySpark doesn't handle an empty dictionary
10-30-2025 10:58 PM
Today, while reading a delta load, my notebook failed, and I want to report a bug. The withColumns command does not tolerate an empty dictionary and raises the following error in PySpark.
from collections import namedtuple
from pyspark.sql.functions import col

# Each tuple maps a source column to a new column name and the logic used to flatten it
flat_tuple = namedtuple("flat_tuple", ["old_col", "new_col", "logic"])
flat_tuples = [
    flat_tuple("Coordinates", "Coordinates", extract_coordinates_udf(col("Coordinates")["coordinates"])),
    flat_tuple("CreatedById", "CreatedById", col("CreatedById")["$oid"]),
    flat_tuple("CreationDate", "CreationDate", col("CreationDate")["$date"]["$numberLong"]),
    flat_tuple("Names", "Names", col("Names")[0]["LanguageValue"]),
    flat_tuple("Location", "LocationCoordinates", extract_coordinates_udf(col("Location")["coordinates"])),
    flat_tuple("Location", "LocationType", col("Location")["type"]),
    flat_tuple("_id", "sectorId", col("_id")["$oid"]),
]

# Only keep entries whose source column exists in the dataframe; the dict can end up empty
final_flat_cols = {tup.new_col: tup.logic for tup in flat_tuples if tup.old_col in df.columns}
df = df.withColumns(final_flat_cols)
-- Output
AssertionError: [Trace ID: 00-68d8e7cacb471da60efe65d0ef17703d-a3b270f251715df4-00]
This case is handled in normal PySpark, and I don't want to write a special if-else clause to check the dataframe's columns before calling withColumns. It would be great if it could be handled internally.
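For reference, here is a minimal sketch of the behaviour I mean, using a throwaway dataframe (the column names are just for illustration):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
toy_df = spark.createDataFrame([(1, "a")], ["id", "name"])

# In open-source PySpark an empty mapping is simply a no-op and the columns are unchanged
print(toy_df.withColumns({}).columns)  # ['id', 'name']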
Currently, I'm using the following to handle this:
flat_col_lst = [tup.logic.alias(tup.new_col) for tup in flat_tuples if tup.old_col in df.columns]
df = df.select('*', *flat_col_lst)
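The explicit guard I'd rather not have to write everywhere would look something like this:

# The if-else I'd like to avoid: only call withColumns when the mapping is non-empty
if final_flat_cols:
    df = df.withColumns(final_flat_cols)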
10-31-2025 03:54 AM
Hello @Dhruv-22,
I have tested this internally, and it appears to be a bug in the new Serverless environment version 4.
As a workaround, you can switch the version to 3 as shown below, re-run the above code, and it should work.
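After switching the environment and restarting the session, you can also quickly confirm which runtime the new session is actually using, for example:

# Sanity check that the fresh session picked up the selected environment
# (the exact version string depends on the environment you chose)
print(spark.version)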
10-31-2025 05:07 AM
Hey @K_Anudeep
I tried Environment Versions 3, 2, and 1, but I still got the same error. Attached is a screenshot with version 3.
10-31-2025 05:15 AM
Hey @Dhruv-22
Did you apply the version and create a new session (or clear the existing one) before running it? It should work on environment version 3, as shown in my repro below.
10-31-2025 05:33 AM
Yeah, I created a new session. I tried it 3-4 times.
10-31-2025 07:48 AM
Sure! Let me try once again and get back to you.
11-01-2025 06:18 AM
Hey @K_Anudeep, did you get anything?