Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Cannot reserve additional contiguous bytes in the vectorized reader (requested xxxxxxxxx bytes).

shan_chandra
Databricks Employee

I got the error below when running a streaming workload from a source Delta table:

Caused by: java.lang.RuntimeException: Cannot reserve additional contiguous bytes in the vectorized reader (requested xxxxxxxxx bytes). As a workaround, you can reduce the vectorized reader batch size, or disable the vectorized reader, or disable spark.sql.sources.bucketing.enabled if you read from bucket table. For Parquet file format, refer to spark.sql.parquet.columnarReaderBatchSize (default 4096) and spark.sql.parquet.enableVectorizedReader; for ORC file format, refer to spark.sql.orc.columnarReaderBatchSize (default 4096) and spark.sql.orc.enableVectorizedReader

Could you please let us know how to mitigate this issue?

1 ACCEPTED SOLUTION

shan_chandra
Databricks Employee

This happens because the Delta/Parquet source has one or more of the following:

  1. a huge number of columns
  2. huge strings in one or more columns
  3. huge arrays/maps, possibly nested in each other

Each of these inflates the memory a single column batch needs, so the vectorized reader cannot reserve a large enough contiguous block. To mitigate the issue, reduce spark.sql.parquet.columnarReaderBatchSize from its default value of 4096 so that each batch reserves less memory.
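A minimal PySpark sketch of the workarounds named in the error message, assuming an existing SparkSession called `spark`; the batch size of 1024 is an illustrative starting point, not a recommended value, so tune it for your data:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Reduce the Parquet vectorized reader batch size from the default 4096.
# Smaller batches reserve less contiguous memory per column batch.
spark.conf.set("spark.sql.parquet.columnarReaderBatchSize", "1024")

# If reducing the batch size is not enough, disable the vectorized
# reader entirely; this avoids the large contiguous allocation at
# some cost in scan performance.
# spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")
```

The equivalent ORC settings from the error message are `spark.sql.orc.columnarReaderBatchSize` and `spark.sql.orc.enableVectorizedReader`.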


