Parallel processing of json files in databricks pyspark
How can we read files from Azure Blob Storage and process them in parallel in Databricks using PySpark? As of now we are reading all 10 files one at a time into a dataframe and flattening it.
Thanks & Regards,
Sujata
- 6510 Views
- 5 replies
- 1 kudos
Latest Reply
You first have to mount your blob storage to Databricks; I assume that is already done. Then a single read over the directory handles all the files in parallel:

spark.read.json("/mnt/dbfs/<ENTER PATH OF JSON DIR HERE>/*.json")

See https://spark.apache.org/docs/latest/sql-data-sources-json.html