07-06-2023 04:29 PM - edited 07-06-2023 04:32 PM
I get an,
`AnalysisException Failed to merge incompatible data types LongType and StringType
when attempting to run the below command,
`df.write.format("delta").saveAsTable("schema.k_adhoc.df", mode="overwrite")`
I am casting the column before saving:
```
The schema of the `df` dataframe is:
```
|-- id: string (nullable = false)
|-- year: integer (nullable = true)
|-- month: integer (nullable = true)
|-- month_year: string (nullable = true)
|-- mm-yyyy: date (nullable = true)
|-- h_id: string (nullable = true)
```
```
--------------------------------------------------------------------------- AnalysisException Traceback (most recent call last) File <command-3186243284832184>:2 1 # Save Table ----> 2 df.write.format("delta").saveAsTable("schema.k_adhoc.df", mode="overwrite") File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs) 46 start = time.perf_counter() 47 try: ---> 48 res = func(*args, **kwargs) 49 logger.log_success( 50 module_name, class_name, function_name, time.perf_counter() - start, signature 51 ) 52 return res File /databricks/spark/python/pyspark/sql/readwriter.py:1520, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options) 1518 if format is not None: 1519 self.format(format) -> 1520 self._jwrite.saveAsTable(name) File /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args) 1315 command = proto.CALL_COMMAND_NAME +\ 1316 self.command_header +\ 1317 args_command +\ 1318 proto.END_COMMAND_PART 1320 answer = self.gateway_client.send_command(command) -> 1321 return_value = get_return_value( 1322 answer, self.gateway_client, self.target_id, self.name) 1324 for temp_arg in temp_args: 1325 temp_arg._detach() File /databricks/spark/python/pyspark/errors/exceptions.py:234, in capture_sql_exception.<locals>.deco(*a, **kw) 230 converted = convert_exception(e.java_exception) 231 if not isinstance(converted, UnknownException): 232 # Hide where the exception came from that shows a non-Pythonic 233 # JVM exception message. --> 234 raise converted from None 235 else: 236 raise AnalysisException: Failed to merge fields 'h_id' and 'h_id'. Failed to merge incompatible data types LongType and StringType
```
07-06-2023 09:06 PM
Hi @kll
For reference-
https://docs.databricks.com/delta/delta-batch.html#schema-validation-1
07-07-2023 12:27 AM
Hi @kll
First, drop your table and delete the underlying files by using dbutils.fs.rm('path') and then try to run the write command then it will work.
07-13-2023 02:43 AM
Hi @kll
Hope everything is going great.
Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.
Cheers!
07-13-2023 04:37 AM
Use the overwrite schema option:
(df.write
.mode('overwrite')
.format('delta')
.option('overwriteSchema', 'true')
.saveAsTable('schema.k_adhoc.df'))
07-13-2023 08:29 AM
The issue seems to be because the job is trying to merge columns with different schema. Could you please make sure that the schema matches for the columns.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group