Updating dazfuller's suggestion, but including code for one level of partitioning. Of course, if you have deeper partitions, you will have to write a function and make a recursive call to reach the final directory containing the parquet files. Parquet wil...
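For the deeper-partition case, the recursive call could look roughly like the sketch below. It is not the code from the post: it assumes the data sits on a locally mounted path that pathlib can list, and find_parquet_dirs is a made-up name. On other storage the listing call would need to be swapped for whatever your environment provides.

```python
from pathlib import Path


def find_parquet_dirs(root):
    """Recursively collect directories under root that directly contain .parquet files.

    Hypothetical helper; assumes root is a path on a local or mounted filesystem.
    """
    root = Path(root)
    found = []
    # A directory qualifies if at least one immediate child is a parquet file.
    if any(p.is_file() and p.suffix == ".parquet" for p in root.iterdir()):
        found.append(root)
    # Descend into partition subdirectories (e.g. year=2023/month=01),
    # skipping metadata directories whose names start with "_" or ".".
    for child in sorted(root.iterdir()):
        if child.is_dir() and not child.name.startswith(("_", ".")):
            found.extend(find_parquet_dirs(child))
    return found
```

The function returns the leaf directories in sorted order, so each one can then be read or processed in turn regardless of how many partition levels sit above it.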
A temp view is a pointer. The information for a temp view is stored in the Spark catalog. You can drop a temp view with spark.catalog.dropTempView("view_name"). You could also drop a temp view in a SQL cell with DROP TABLE "temp_view_name". Here is some code...
My point was that you are asking for column names from what you consider to be the "first row", and I am telling you that at scale, or if the data volume grows, what you consider to be the "first row" may no longer actually be the "first row" unless ...
Unless the dataframe is sorted, "first row" is not guaranteed to be consistent.
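To make that concrete, here is a minimal sketch of pinning down the "first row" with an explicit sort. It assumes df is a PySpark DataFrame; first_row_sorted and the column names passed to it are hypothetical, not something from the thread.

```python
def first_row_sorted(df, *order_cols):
    """Return the first row of df after an explicit sort.

    Assumes df is a PySpark DataFrame (uses DataFrame.orderBy and DataFrame.first).
    The result is only repeatable if order_cols uniquely identify a row.
    """
    if not order_cols:
        raise ValueError("pass at least one column to sort by")
    # orderBy fixes the row order; first() returns the top row, or None if df is empty.
    return df.orderBy(*order_cols).first()
```

Without the orderBy, the row that first() returns depends on how the data happens to be partitioned and read, which is why it can change as the data grows.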
I can see about working up some code to do this; it is probably fairly straightforward.
However, if you can specify header = True when reading the data, then this prob...