Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

youssefmrini
by Databricks Employee
  • 1920 Views
  • 1 replies
  • 2 kudos
Latest Reply
youssefmrini
Databricks Employee
  • 2 kudos

Clone can now be used to create and incrementally update Delta tables that mirror Apache Parquet and Apache Iceberg tables. You can update your source Parquet table and incrementally apply the changes to its cloned Delta table with the clone comman...

ivanychev
by Contributor II
  • 14082 Views
  • 14 replies
  • 5 kudos

toPandas() causes IndexOutOfBoundsException in Apache Arrow

Using DBR 10.0. When calling toPandas(), the worker fails with IndexOutOfBoundsException. It seems like ArrowWriter.sizeInBytes (which looks like a proprietary method, since I can't find it in OSS) calls Arrow's getBufferSizeFor, which fails with this err...

Latest Reply
vikas_ahlawat
New Contributor II
  • 5 kudos

I am also facing the same issue. I have set `spark.sql.execution.arrow.pyspark.enabled` to `false`, but the error persists. Any idea what's going on? Please help me out. org.apache.spark.SparkException: Job aborted ...

13 More Replies
sarvesh
by Contributor III
  • 3319 Views
  • 5 replies
  • 8 kudos

Catch rejected data (rows) while reading with Apache Spark

I work with Spark-Scala and I receive data in different formats (.csv/.xlsx/.txt etc.). When I try to read/write this data from different sources to any database, many records get rejected due to various issues like (special characters, data type ...

Latest Reply
-werners-
Esteemed Contributor III
  • 8 kudos

Or maybe schema evolution on Delta Lake is enough, in combination with Hubert's answer.

4 More Replies
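
For the rejected-rows question above, one general pattern is to validate each row against an expected schema and route failures to a separate collection instead of dropping them. The sketch below shows that idea in plain Python with the standard `csv` module. It is not Spark's API and not the answer given in the thread; the `SCHEMA` mapping, the `split_rows` helper, and the sample data are all hypothetical. Spark's CSV and JSON readers offer a comparable permissive mode that keeps malformed records in a corrupt-record column.

```python
import csv
import io

# Hypothetical schema: column name -> converter that raises ValueError on bad input.
SCHEMA = {"id": int, "name": str, "amount": float}


def split_rows(text):
    """Parse CSV text and return (accepted, rejected) lists.

    accepted holds dicts with converted values; rejected holds
    (line_number, raw_row, reason) tuples so no row is silently lost.
    """
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            # DictReader stores surplus fields under the key None and fills
            # missing fields with None, so both cases are detectable here.
            if None in row or any(row.get(col) is None for col in SCHEMA):
                raise ValueError("wrong number of columns")
            accepted.append({col: cast(row[col]) for col, cast in SCHEMA.items()})
        except ValueError as exc:
            rejected.append((line_number, row, str(exc)))
    return accepted, rejected


# Hypothetical sample: one bad integer, one short row, two valid rows.
SAMPLE = "id,name,amount\n1,alice,10.5\nx2,bob,3.0\n3,carol\n4,dave,7\n"
accepted, rejected = split_rows(SAMPLE)

for line_number, raw, reason in rejected:
    print(f"line {line_number} rejected: {reason}")
```

Keeping the line number and the reason alongside each rejected row makes it possible to write the rejects to a separate table or file for later inspection.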