Databricks Runtime: 12.2 LTS, Spark: 3.3.2, Delta Lake: 2.2.0
A target table with schema [c1: integer, c2: integer] lets me MERGE in data with schema [c1: integer, c2: double]. I expected this to throw an exception (as a plain Spark write/INSERT does), but instead the merge silently stored the data, casting the c2 values down to the target's integer type.
from pyspark.sql.types import StructType, StructField, IntegerType, DoubleType
from delta.tables import DeltaTable
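# For completeness (a reconstruction, since the actual creation code isn't
# shown): based on the output below, the target table was presumably created
# along these lines.
target_schema = StructType([StructField("c1", IntegerType(), False), StructField("c2", IntegerType(), False)])
spark.createDataFrame([(1, 1), (2, 1), (3, 5)], schema=target_schema) \
    .write.format("delta").saveAsTable("default.test_datatype_misalignment")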
# Source data
schema = StructType([StructField("c1", IntegerType(), False), StructField("c2", DoubleType(), False)])
df_source = spark.createDataFrame([(4, 1.4), (5, 5.0), (6, 3.5)], schema=schema)
# write source to target table using merge
target_table = DeltaTable.forName(spark, "default.test_datatype_misalignment")
merge = target_table.alias("target").merge(df_source.alias("source"), "target.c1 = source.c1")
merge.whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
spark.table("default.test_datatype_misalignment").show()
# OUTPUT
#+---+---+
#| c1| c2|
#+---+---+
#| 1| 1|
#| 2| 1|
#| 3| 5|
#| 4| 1|
#| 5| 5|
#| 6| 3|
#+---+---+
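# Sanity check: the table schema still reports c2 as integer; only the
# incoming double values were cast on the way in.
spark.table("default.test_datatype_misalignment").printSchema()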
# write source to target table using insert
df_source.write.format("delta").mode("append").saveAsTable("default.test_datatype_misalignment")
# OUTPUT
#AnalysisException: Failed to merge fields 'c2' and 'c2'. Failed to merge incompatible data types IntegerType and DoubleType
I'm expecting an exception to be raised regardless of the write command. Why is this not the case?
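For now, the only way I see to get the failure I expect is a hand-rolled schema check before the merge; a minimal sketch (this guard is my own, not a Delta API):

# Fail fast if any shared column's data type differs between source and target.
target_types = {f.name: f.dataType for f in spark.table("default.test_datatype_misalignment").schema.fields}
for field in df_source.schema.fields:
    if field.name in target_types and field.dataType != target_types[field.name]:
        raise TypeError(f"Type mismatch for '{field.name}': source {field.dataType}, target {target_types[field.name]}")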