Topics with Label: Nested array struct dataframe

by Sameer_876675 • New Contributor III

12-07-2022 4:22:17 AM

5277 Views
3 replies
2 kudos

How to efficiently process a 100GiB JSON nested file and store it in Delta?

Hi, I'm a fairly new user and I am using Azure Databricks to process a ~1000GiB nested JSON file containing insurance policy data. I uploaded the JSON file to Azure Data Lake Gen2 storage and read the JSON file into a dataframe. df=spark.read.option("...

Data Engineering

Latest Reply

Annapurna_Hiriy
Databricks Employee

01-31-2023 8:20:49 AM

2 kudos

Hi Sameer, please refer to the following documents on how to work with nested JSON:
https://docs.databricks.com/optimizations/semi-structured.html
https://learn.microsoft.com/en-us/azure/databricks/kb/_static/notebooks/scala/nested-json-to-dataframe.html

by SatheeshSathees • New Contributor

08-19-2020 11:31:33 AM

7475 Views
1 reply
0 kudos

How to dynamically explode an array type column in PySpark or Scala

Hi, I have a Parquet file with complex column types with nested structs and arrays. I am using the script from the link below to flatten my Parquet file. https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema I am able ...

Data Engineering

Latest Reply

shyam_9
Databricks Employee

09-18-2020 12:39:35 PM

0 kudos

Hello, please check out the docs and notebook below, which have similar examples:
https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema
https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/transform-comple...
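
To make concrete what "exploding" array columns and flattening nested structs does to a single record, here is a minimal plain-Python sketch. It is not Spark code and not the script from the linked docs; the function names (`flatten_record`, `_flatten_value`) and the sample field names are hypothetical. In Spark the equivalent result is usually built from `explode` / `explode_outer` on array columns plus selecting the fields of struct columns, repeated until no complex types remain.

```python
from itertools import product
from typing import Any, Dict, List


def _flatten_value(name: str, value: Any) -> List[Dict[str, Any]]:
    """Return the partial flat rows contributed by one field."""
    if isinstance(value, dict):
        # Struct: recurse and prefix child columns with the parent name.
        return flatten_record(value, prefix=f"{name}_")
    if isinstance(value, list):
        if not value:
            # Mimics explode_outer: keep the row and emit a null.
            # Without a schema we can only emit one null column here.
            return [{name: None}]
        rows: List[Dict[str, Any]] = []
        for item in value:
            # Array: every element produces its own row(s).
            rows.extend(_flatten_value(name, item))
        return rows
    # Scalar: a single column, a single row.
    return [{name: value}]


def flatten_record(record: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Flatten one nested record into a list of flat rows.

    Dicts become prefixed columns; lists yield one row per element.
    Independent arrays combine as a cross product, which is also what
    exploding two array columns one after the other produces.
    """
    per_field = [
        _flatten_value(f"{prefix}{key}", value) for key, value in record.items()
    ]
    rows: List[Dict[str, Any]] = []
    for combo in product(*per_field):
        merged: Dict[str, Any] = {}
        for part in combo:
            merged.update(part)
        rows.append(merged)
    return rows


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    sample = {
        "id": 1,
        "policy": {"holder": "A", "claims": [{"amt": 10}, {"amt": 20}]},
        "tags": ["x", "y"],
    }
    for row in flatten_record(sample):
        print(row)
```

Because nothing here depends on fixed column names, the same recursive idea carries over to walking a DataFrame schema dynamically. Note that the row count multiplies with every array that is exploded, which matters a lot for very large inputs.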
