02-17-2025 02:37 AM - edited 02-17-2025 02:38 AM
Thank you for the response. I didn't understand the command you mentioned.
Here is the context in which I'm facing this error:
I have a folder on ADLS Gen2 with many subfolders, laid out as year/month/date/HH_MM_SS.mf4.
The file sizes range from 1 GB to 14 GB or more.
I hit the error when I tried to convert the binary content to a DataFrame.
Command:
mf4_df = spark.read.format("binaryFile") \
    .option("pathGlobFilter", "*.mf4") \
    .option("recursiveFileLookup", "true") \
    .load("/mnt/adls_data/")
Result : mf4_df:pyspark.sql.connect.dataframe.DataFrame
path:string
modificationTime:timestamp
length:long
content:binary
Then I used a custom library ("from asammdf import MDF") to convert the binary content to a DataFrame.
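For reference, here is a minimal sketch of what that conversion step might look like. It is not my exact code. The helper names `fits_in_binary_column` and `mf4_bytes_to_pandas` are made up for illustration. The size check assumes that one `content` value cannot be larger than a JVM byte array (about 2 GiB), which as far as I know is also the default read limit of the binaryFile source.

```python
import io

# Assumption: a single binary value is backed by a JVM byte array, which is
# indexed by a signed 32-bit int, so it cannot hold more than 2**31 - 1 bytes.
MAX_BINARY_BYTES = 2**31 - 1


def fits_in_binary_column(length: int) -> bool:
    """Hypothetical helper: True if a file of `length` bytes can fit in `content`."""
    return 0 <= length <= MAX_BINARY_BYTES


def mf4_bytes_to_pandas(content: bytes):
    """Hypothetical helper: parse the raw bytes of one .mf4 file into a pandas DataFrame."""
    # Imported lazily so the size check above works without asammdf installed.
    from asammdf import MDF

    mdf = MDF(io.BytesIO(content))  # MDF accepts a file-like object
    try:
        return mdf.to_dataframe()
    finally:
        mdf.close()


# Sizes taken from the range described in the post.
print(fits_in_binary_column(1 * 1024**3))   # 1 GiB file
print(fits_in_binary_column(14 * 1024**3))  # 14 GiB file
```

The `length` column from the schema above could be passed to the size check before `content` is ever read, for example after selecting only `path` and `length`.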
Thanks!