Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Merge Schema Error Message despite setting option to true

alexiswl
Contributor

Has anyone come across this error before:

```
A schema mismatch detected when writing to the Delta table (Table ID: d4b9c839-af0b-4b62-aab5-1072d3a0fa9d). To enable schema migration using DataFrameWriter or DataStreamWriter, please set: '.option("mergeSchema", "true")'
```

But the code used was

```python
df_ps.write.saveAsTable(
    "jobs.dracarys_ingested",
    mode="append",
    options={
        "mergeSchema": "true"
    }
)
```

Did I not use the option correctly?

Running on Databricks Runtime 13.1 (includes Apache Spark 3.4.0, Scala 2.12).

Full traceback here:

```pytb

---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
File <command-1565101189640444>:52
35 df_ps = spark.createDataFrame(
36 pd.DataFrame(
37 [
(...)
48 )
49 )
51 # Save to list of workflows ingested
---> 52 df_ps.write.saveAsTable(
53 "jobs.dracarys_ingested",
54 mode="append",
55 options={
56 "mergeSchema": "true"
57 }
58 )
60 # Delete row from pending notebook
61 sql(
62 f"DELETE FROM jobs.dracarys_to_ingest "
63 f"WHERE batch_id == {batch_id} AND wfr_id == '{wfr_id}' AND portal_run_id == '{portal_run_id}'"
64 )

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
46 start = time.perf_counter()
47 try:
---> 48 res = func(*args, **kwargs)
49 logger.log_success(
50 module_name, class_name, function_name, time.perf_counter() - start, signature
51 )
52 return res

File /databricks/spark/python/pyspark/sql/readwriter.py:1576, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options)
1574 if format is not None:
1575 self.format(format)
-> 1576 self._jwrite.saveAsTable(name)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:191, in capture_sql_exception.<locals>.deco(*a, **kw)
187 converted = convert_exception(e.java_exception)
188 if not isinstance(converted, UnknownException):
189 # Hide where the exception came from that shows a non-Pythonic
190 # JVM exception message.
--> 191 raise converted from None
192 else:
193 raise

AnalysisException: A schema mismatch detected when writing to the Delta table (Table ID: d4b9c839-af0b-4b62-aab5-1072d3a0fa9d).
To enable schema migration using DataFrameWriter or DataStreamWriter, please set:
'.option("mergeSchema", "true")'.
For other operations, set the session configuration
spark.databricks.delta.schema.autoMerge.enabled to "true". See the documentation
specific to the operation for details.

Table schema:
root
-- batch_id: string (nullable = true)
-- wfr_id: string (nullable = true)
-- type_name: string (nullable = true)
-- wfr_step_name: void (nullable = true)
-- portal_run_id: string (nullable = true)
-- date_ingested: date (nullable = true)
-- dracarys_version: string (nullable = true)


Data schema:
root
-- batch_id: string (nullable = true)
-- wfr_id: string (nullable = true)
-- type_name: string (nullable = true)
-- wfr_step_name: string (nullable = true)
-- portal_run_id: string (nullable = true)
-- date_ingested: date (nullable = true)
-- dracarys_version: string (nullable = true)


```



 

1 ACCEPTED SOLUTION


alexiswl
Contributor

Don't worry team, I figured it out! 

Rather than passing an `options` keyword argument, you need to call the `.option()` method on the writer before saving the table, like below:

```python
df_ps.write.option(
  "mergeSchema", "true"
).saveAsTable(
  "jobs.dracarys_ingested",
  mode="append"
)
```
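For anyone curious why the original call silently ignored the option: PySpark's `DataFrameWriter.saveAsTable` is declared as `saveAsTable(name, format=None, mode=None, partitionBy=None, **options)`, so a keyword literally named `options` is captured by `**options` as one option called `"options"` whose value is a dict, and `mergeSchema` never reaches the writer. A minimal sketch in plain Python (the `save_as_table` stub below is hypothetical and only mimics the keyword capture, it is not the PySpark source):

```python
# Hypothetical stub mimicking PySpark's DataFrameWriter.saveAsTable signature:
# saveAsTable(name, format=None, mode=None, partitionBy=None, **options)
def save_as_table(name, format=None, mode=None, partitionBy=None, **options):
    # Return what the writer would receive as its option map
    return options

# As written in the question: a single option literally named "options",
# whose value is a dict -- "mergeSchema" never reaches the writer.
bad = save_as_table("jobs.dracarys_ingested", mode="append",
                    options={"mergeSchema": "true"})
print(bad)   # {'options': {'mergeSchema': 'true'}}

# Passing mergeSchema directly as a keyword would also have worked, since
# **options unpacks it into an option the writer understands.
good = save_as_table("jobs.dracarys_ingested", mode="append",
                     mergeSchema="true")
print(good)  # {'mergeSchema': 'true'}
```

So `df_ps.write.saveAsTable("jobs.dracarys_ingested", mode="append", mergeSchema="true")` should work too; the `.option()` form above is just the more common idiom.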


3 REPLIES


Anonymous
Not applicable

Hi @alexiswl 

Share the wisdom! By marking the best answers, you help others in our community find valuable information quickly and efficiently.

Thanks!

Hi Vidula, 

I will accept the answer above. I thought it would be a bit odd to give myself kudos, though!
