Databricks Community

kll · ‎07-06-2023

I get an,

`AnalysisException Failed to merge incompatible data types LongType and StringType

when attempting to run the below command,

`df.write.format("delta").saveAsTable("schema.k_adhoc.df", mode="overwrite")`

I am casting the column before saving:

```

# Cast datatype

df = df.withColumn("h_id", df4["h_id"].cast(StringType()))

df.show()

```

The schema of the `df` dataframe is:

```

|-- id: string (nullable = false)

|-- year: integer (nullable = true)

|-- month: integer (nullable = true)

|-- month_year: string (nullable = true)

|-- mm-yyyy: date (nullable = true)

|-- h_id: string (nullable = true)

```

--------------------------------------------------------------------------- AnalysisException Traceback (most recent call last) File <command-3186243284832184>:2 1 # Save Table ----> 2 df.write.format("delta").saveAsTable("schema.k_adhoc.df", mode="overwrite") File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs) 46 start = time.perf_counter() 47 try: ---> 48 res = func(*args, **kwargs) 49 logger.log_success( 50 module_name, class_name, function_name, time.perf_counter() - start, signature 51 ) 52 return res File /databricks/spark/python/pyspark/sql/readwriter.py:1520, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options) 1518 if format is not None: 1519 self.format(format) -> 1520 self._jwrite.saveAsTable(name) File /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args) 1315 command = proto.CALL_COMMAND_NAME +\ 1316 self.command_header +\ 1317 args_command +\ 1318 proto.END_COMMAND_PART 1320 answer = self.gateway_client.send_command(command) -> 1321 return_value = get_return_value( 1322 answer, self.gateway_client, self.target_id, self.name) 1324 for temp_arg in temp_args: 1325 temp_arg._detach() File /databricks/spark/python/pyspark/errors/exceptions.py:234, in capture_sql_exception.<locals>.deco(*a, **kw) 230 converted = convert_exception(e.java_exception) 231 if not isinstance(converted, UnknownException): 232 # Hide where the exception came from that shows a non-Pythonic 233 # JVM exception message. --> 234 raise converted from None 235 else: 236 raise AnalysisException: Failed to merge fields 'h_id' and 'h_id'. Failed to merge incompatible data types LongType and StringType

```

Priyanka_Biswas · ‎07-06-2023

Hi @kll

This error is because of schema validation. The delta table performs schema validation of every column, and the source dataframe column data types must match the column data types in the target table. If they don’t match, an exception is raised.

For reference-

https://docs.databricks.com/delta/delta-batch.html#schema-validation-1

Could you please refer below 2 article and see if it helps ?

Ajay-Pandey · ‎07-07-2023

Hi @kll

First, drop your table and delete the underlying files by using dbutils.fs.rm('path') and then try to run the write command then it will work.

Ajay Kumar Pandey

Anonymous · ‎07-13-2023

Hi @kll

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.

Cheers!

dream · ‎07-13-2023

Use the overwrite schema option:

(df.write
    .mode('overwrite')
    .format('delta')
    .option('overwriteSchema', 'true')
    .saveAsTable('schema.k_adhoc.df'))

Lakshay · ‎07-13-2023

The issue seems to be because the job is trying to merge columns with different schema. Could you please make sure that the schema matches for the columns.

Databricks Community

AnalysisException : when attempting to save a spark DataFrame as delta table

Connect with Databricks Users in Your Area

Databricks Learning Festival (Virtual): 15 January - 31 January 2025

Share Your Feedback in Our Community Survey

Databricks Named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems

Milestone: DatabricksTV Reaches 100 Videos!

Announcing the new Meta Llama 3.3 model on Databricks