Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks External table row maximum size

Mano99
New Contributor II

Hi Databricks Team/ Community,

We have created Databricks external tables on top of ADLS Gen2, as both Parquet and Delta tables. We are loading nested JSON structures into a table, and a few columns contain very large nested JSON data. I'm getting a "results too large" error, but transformations and other operations work fine; only displaying the data fails.
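
Roughly, our setup looks like the sketch below (the storage account, container, table, and column names are placeholders rather than our real ones):

```python
# Rough sketch of the setup; paths and names are placeholders.
# (`spark` and `display` are provided by the Databricks notebook environment.)

# The source files contain deeply nested JSON; a few fields can be very large per row.
raw_df = spark.read.json("abfss://raw@<storageaccount>.dfs.core.windows.net/events/")

# External Delta table on ADLS Gen2 (we also keep a Parquet variant).
(raw_df.write.format("delta")
    .mode("append")
    .option("path", "abfss://curated@<storageaccount>.dfs.core.windows.net/events_delta/")
    .saveAsTable("bronze.events_external"))

# Transformations run fine, but displaying the table fails with "results too large".
display(spark.table("bronze.events_external"))
```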

What I want to know is: what is the maximum size (in MB or GB) per row that Databricks can accept or store?
I saw some references on Google and from AI tools saying up to 2.5 GB. Is that true? If anyone knows the exact number, please share it here, and leave a comment on the issue above so I can understand it better.

Thanks & Regards,
Manohar G 

 

1 ACCEPTED SOLUTION

Accepted Solutions

dennis65
New Contributor II

Databricks/Spark can generally store rows up to around 2-2.5 GB; that practical ceiling comes from the JVM's roughly 2 GB limit on a single array, which Spark's internal row, string, and binary representations rely on. However, the "results too large" error you're seeing is a limitation on the driver node's ability to *display* large result sets, especially with huge nested JSON columns. To resolve this, avoid displaying the entire table directly; instead, use `.limit()`, filter for specific rows, project only the necessary columns, sample the data, or write the data to a file for external analysis (see the sketch below). The storage limit is separate from the display limitation.
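
A minimal sketch of those workarounds, assuming a table like the one in the question (table and column names are hypothetical):

```python
# Minimal sketch of the suggested workarounds; table and column names are hypothetical.
df = spark.table("bronze.events_external")

# Project only the columns you need and cap the number of rows returned to the driver.
display(df.select("event_id", "event_type", "event_ts").limit(100))

# Filter down to the specific rows you care about before displaying.
display(df.filter(df.event_type == "order_created").limit(100))

# Or inspect a small random sample instead of the whole table.
display(df.sample(fraction=0.001, seed=42).limit(100))

# For full inspection of the huge JSON columns, write the data out and analyze it externally.
df.write.mode("overwrite").json(
    "abfss://scratch@<storageaccount>.dfs.core.windows.net/events_debug/"
)
```

The common thread is that only a small, explicitly bounded result is ever collected back to the driver for display; the full rows stay in storage.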


2 REPLIES


Mano99
New Contributor II

Hi Dennis, just to confirm once more: the 2-2.5 GB limit is per row, right?
