Data Engineering

Forum Posts

Sameer_876675
by New Contributor III
  • 2600 Views
  • 3 replies
  • 2 kudos

How to efficiently process a 100GiB JSON nested file and store it in Delta?

Hi, I'm a fairly new user and I am using Azure Databricks to process a ~1000GiB nested JSON file containing insurance policy data. I uploaded the JSON file to Azure Data Lake Gen2 storage and read it into a dataframe. df=spark.read.option("...

Cluster Summary OOM Error
Latest Reply
Annapurna_Hiriy
New Contributor III
  • 2 kudos

Hi Sameer, please refer to the following documents on how to work with nested JSON: https://docs.databricks.com/optimizations/semi-structured.html and https://learn.microsoft.com/en-us/azure/databricks/kb/_static/notebooks/scala/nested-json-to-dataframe.html

2 More Replies
SatheeshSathees
by New Contributor
  • 5547 Views
  • 1 reply
  • 0 kudos

How to dynamically explode an array-type column in PySpark or Scala

Hi, I have a parquet file with complex column types, including nested structs and arrays. I am using the script from the link below to flatten my parquet file. https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema I am able ...

Latest Reply
shyam_9
Valued Contributor
  • 0 kudos

Hello, please check out the docs and notebook below, which have similar examples: https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema and https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/transform-comple...
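
To illustrate what "dynamically" flattening means here, below is a minimal sketch in plain Python, not Spark and not the script from the linked docs. It walks a nested record without knowing its shape in advance: dicts (structs) become columns named parent_child, and lists (arrays) produce one output row per element, roughly like an outer explode. The function name flatten and the sample field names are made up for illustration. In Spark the same walk would normally be driven by the DataFrame schema rather than by the data itself.

```python
from itertools import product
from typing import Any, Dict, List


def flatten(record: Any, prefix: str = "") -> List[Dict[str, Any]]:
    """Return a list of flat rows for one nested record (hypothetical helper).

    dicts contribute columns named parent_child;
    lists contribute one output row per element.
    """
    if isinstance(record, dict):
        # Flatten every field on its own, then combine the per-field row sets.
        per_field = [
            flatten(value, f"{prefix}_{key}" if prefix else key)
            for key, value in record.items()
        ]
        rows = []
        for combo in product(*per_field):
            merged: Dict[str, Any] = {}
            for part in combo:
                merged.update(part)
            rows.append(merged)
        return rows
    if isinstance(record, list):
        if not record:
            # Keep the row and emit a null column for an empty array.
            return [{prefix: None}]
        rows = []
        for item in record:
            rows.extend(flatten(item, prefix))
        return rows
    # Scalar leaf: a single row with a single column.
    return [{prefix: record}]


# Hypothetical sample record, only for demonstration.
sample = {
    "id": 1,
    "policy": {"holder": "a", "claims": [{"amt": 10}, {"amt": 20}]},
    "tags": [],
}

for row in flatten(sample):
    print(row)
```

Note that two sibling arrays in the same record are combined as a cross product, which is also what chaining one explode per array column does, so row counts can grow quickly on wide nested data.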
