Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

youssefmrini
by Honored Contributor III
  • 1494 Views
  • 1 reply
  • 2 kudos
Latest Reply
youssefmrini
Honored Contributor III
  • 2 kudos

Clone can now be used to create and incrementally update Delta tables that mirror Apache Parquet and Apache Iceberg tables. You can update your source Parquet table and incrementally apply the changes to its cloned Delta table with the clone comman...

ivanychev
by Contributor
  • 6663 Views
  • 16 replies
  • 5 kudos

toPandas() causes IndexOutOfBoundsException in Apache Arrow

Using DBR 10.0. When calling toPandas(), the worker fails with IndexOutOfBoundsException. It seems like ArrowWriter.sizeInBytes (which looks like a proprietary method, since I can't find it in OSS) calls Arrow's getBufferSizeFor, which fails with this err...

Latest Reply
vikas_ahlawat
New Contributor II
  • 5 kudos

I am also facing the same issue. I have set `spark.sql.execution.arrow.pyspark.enabled` to `false`, but the error persists. Any idea what's going on? Please help me out. ... org.apache.spark.SparkException: Job aborted ...

15 More Replies
sarvesh
by Contributor III
  • 2769 Views
  • 5 replies
  • 8 kudos

Catch rejected data (rows) while reading with Apache Spark

I work with Spark-Scala and receive data in different formats (.csv/.xlsx/.txt, etc.). When I try to read/write this data from different sources to any database, many records get rejected due to various issues like (special characters, data type ...

Latest Reply
-werners-
Esteemed Contributor III
  • 8 kudos

Or maybe schema evolution on Delta Lake is enough, in combination with Hubert's answer.

4 More Replies
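
For readers skimming the rejected-rows question above, here is a minimal plain-Python sketch of the general idea: parse each row against an expected schema and route the failures to a separate reject list with a reason. This is not Spark code and not the approach from the thread's replies. The `SCHEMA` mapping, the `split_rows` helper, and the sample data are all hypothetical names made up for illustration. In Spark itself, the CSV and JSON readers offer a `PERMISSIVE` mode with a `columnNameOfCorruptRecord` option that serves a similar purpose.

```python
import csv
import io

# Hypothetical schema for illustration: column name -> converter function.
SCHEMA = {"id": int, "name": str, "amount": float}


def split_rows(text):
    """Parse CSV text and return (accepted, rejected).

    accepted: list of dicts with values converted according to SCHEMA.
    rejected: list of (line_number, raw_row, reason) tuples.
    """
    accepted, rejected = [], []
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row in reader:
        line_no = reader.line_num  # 1-based line number in the input text
        if len(row) != len(header):
            rejected.append((line_no, row, "wrong column count"))
            continue
        try:
            record = {col: SCHEMA[col](val) for col, val in zip(header, row)}
        except (KeyError, ValueError) as exc:
            # Unknown column or a value that does not fit the expected type.
            rejected.append((line_no, row, f"{type(exc).__name__}: {exc}"))
            continue
        accepted.append(record)
    return accepted, rejected


# Made-up sample: line 3 has a bad id, line 4 is missing a column.
sample = "id,name,amount\n1,alice,10.5\nx2,bob,3.0\n3,carol\n4,dave,7\n"
good, bad = split_rows(sample)
```

Keeping the raw row and a reason next to each rejected record makes it possible to write the rejects to their own table or file and inspect them later, rather than losing them silently.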