Parallel processing of json files in databricks pyspark
How can we read files from Azure Blob Storage and process them in parallel in Databricks using PySpark? As of now we are reading all 10 files one at a time into a dataframe and flattening it.
Thanks & Regards,
Sujata
- 6510 Views
- 5 replies
- 1 kudos
Latest Reply
You first have to mount your blob storage to Databricks; I assume that is already done. Then a single read over the directory handles all the files in parallel:

spark.read.json("/mnt/dbfs/<ENTER PATH OF JSON DIR HERE>/*.json")

See https://spark.apache.org/docs/latest/sql-data-sources-json.html