Merge Schema Error Message despite setting option to true

alexiswl
Contributor

Has anyone come across this error before:

```
A schema mismatch detected when writing to the Delta table (Table ID: d4b9c839-af0b-4b62-aab5-1072d3a0fa9d). To enable schema migration using DataFrameWriter or DataStreamWriter, please set: '.option("mergeSchema", "true")'
```

But the code used was

```python
df_ps.write.saveAsTable(
    "jobs.dracarys_ingested",
    mode="append",
    options={
        "mergeSchema": "true"
    }
)
```

Did I not use the option correctly?

Running on Databricks Runtime 13.1 (includes Apache Spark 3.4.0, Scala 2.12).

Full traceback here:

```pytb
---------------------------------------------------------------------------
AnalysisException                          Traceback (most recent call last)
File <command-1565101189640444>:52
     35 df_ps = spark.createDataFrame(
     36     pd.DataFrame(
     37         [
    (...)
     48     )
     49 )
     51 # Save to list of workflows ingested
---> 52 df_ps.write.saveAsTable(
     53     "jobs.dracarys_ingested",
     54     mode="append",
     55     options={
     56         "mergeSchema": "true"
     57     }
     58 )
     60 # Delete row from pending notebook
     61 sql(
     62     f"DELETE FROM jobs.dracarys_to_ingest "
     63     f"WHERE batch_id == {batch_id} AND wfr_id == '{wfr_id}' AND portal_run_id == '{portal_run_id}'"
     64 )

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---> 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/sql/readwriter.py:1576, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options)
   1574 if format is not None:
   1575     self.format(format)
-> 1576 self._jwrite.saveAsTable(name)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:191, in capture_sql_exception.<locals>.deco(*a, **kw)
    187 converted = convert_exception(e.java_exception)
    188 if not isinstance(converted, UnknownException):
    189     # Hide where the exception came from that shows a non-Pythonic
    190     # JVM exception message.
--> 191     raise converted from None
    192 else:
    193     raise

AnalysisException: A schema mismatch detected when writing to the Delta table (Table ID: d4b9c839-af0b-4b62-aab5-1072d3a0fa9d).
To enable schema migration using DataFrameWriter or DataStreamWriter, please set:
'.option("mergeSchema", "true")'.
For other operations, set the session configuration
spark.databricks.delta.schema.autoMerge.enabled to "true". See the documentation
specific to the operation for details.

Table schema:
root
-- batch_id: string (nullable = true)
-- wfr_id: string (nullable = true)
-- type_name: string (nullable = true)
-- wfr_step_name: void (nullable = true)
-- portal_run_id: string (nullable = true)
-- date_ingested: date (nullable = true)
-- dracarys_version: string (nullable = true)


Data schema:
root
-- batch_id: string (nullable = true)
-- wfr_id: string (nullable = true)
-- type_name: string (nullable = true)
-- wfr_step_name: string (nullable = true)
-- portal_run_id: string (nullable = true)
-- date_ingested: date (nullable = true)
-- dracarys_version: string (nullable = true)
```



 

1 ACCEPTED SOLUTION

alexiswl
Contributor

Don't worry team, I figured it out! 

Rather than passing the options kwarg, you need to chain the .option method onto the writer before calling saveAsTable, like below:

```python
df_ps.write.option(
    "mergeSchema", "true"
).saveAsTable(
    "jobs.dracarys_ingested",
    mode="append"
)
```
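
For completeness, the error message also names a session-level switch. A minimal sketch of that alternative, assuming you want schema evolution enabled for every Delta write in the session rather than for a single writer (the config key is taken straight from the traceback):

```python
# Session-wide alternative mentioned in the error message itself.
# This enables automatic schema evolution for all Delta operations in the
# current Spark session, not just this one write, so scope it deliberately.
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

df_ps.write.saveAsTable(
    "jobs.dracarys_ingested",
    mode="append"
)
```

Incidentally, the original call probably failed because saveAsTable collects writer options as **options keyword arguments (see its signature in the traceback), so passing options={...} just sets a single option literally named "options"; passing mergeSchema="true" directly as a keyword argument would likely have worked as well.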


3 REPLIES


Anonymous
Not applicable

Hi @alexiswl 

Share the wisdom! By marking the best answers, you help others in our community find valuable information quickly and efficiently.

Thanks!

alexiswl
Contributor

Hi Vidula,

I will accept the answer above. Thought it would be a bit odd to give myself kudos though?
