
Schema change and OpenSearch

6502
New Contributor III

Let me be clear up front: schema changes and OpenSearch do not fit well together. That is not the problem here, though, because the data pushed to OpenSearch is processed first and always has the same schema. The problem is that Spark is reading a CDC feed, and that feed is subject to schema changes because the source table may change.

I tried to solve the issue by setting mergeSchema and schemaTrackingLocation. As far as I understand, Spark uses these settings for its checkpoint data.
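For reference, here is a minimal sketch of how these options are usually wired for a Delta CDC stream, assuming schemaTrackingLocation goes on the reader and that it must live inside the checkpoint directory of the streaming write (which is how the Databricks documentation describes it). The function names, the source table name, and the process_batch callback are hypothetical, and this is not necessarily a fix for the error below.

```python
def delta_cdc_stream_options(checkpoint_dir: str) -> dict:
    """Build reader and writer options for a Delta CDC stream with schema tracking."""
    base = checkpoint_dir.rstrip("/")
    reader = {
        # Read the change data feed of the source table
        "readChangeFeed": "true",
        # Set on the READ side; here it reuses the checkpoint path so that it
        # is contained within the writer's checkpoint directory
        "schemaTrackingLocation": base,
    }
    writer = {"checkpointLocation": base}
    return {"reader": reader, "writer": writer}


def start_stream(spark, source_table: str, checkpoint_dir: str, process_batch):
    """Start the stream; process_batch(df, batch_id) is a hypothetical sink callback."""
    opts = delta_cdc_stream_options(checkpoint_dir)
    return (
        spark.readStream.options(**opts["reader"])
        .table(source_table)
        .writeStream.options(**opts["writer"])
        .foreachBatch(process_batch)
        .start()
    )
```

With this layout the reader and the writer share one path, so each stream keeps its schema log next to its own checkpoint. The write to OpenSearch would happen inside process_batch.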

It does not work, though. The code keeps failing with:

com.databricks.sql.transaction.tahoe.DeltaStreamingColumnMappingSchemaIncompatibleException: Streaming read is not supported on tables with read-incompatible schema changes (e.g. rename or drop or datatype changes).
Please provide a 'schemaTrackingLocation' to enable non-additive schema evolution for Delta stream processing.
 
The error above is thrown when this schema change is detected. Note that the source table has delta.columnMapping enabled in id mode. This makes the diff larger, but the only real change is one new field, added in an additive way.
 
@@ -178,39 +249,64 @@
       "type": "integer",
       "nullable": true,
       "metadata": {
-        "comment": "Day extraction from `action_ts`"
+        "comment": "Day extraction from `action_ts`",
+        "delta.columnMapping.id": 27,
+        "delta.columnMapping.physicalName": "day"
       }
     },
     {
       "name": "merchant_shared_request_id",
       "type": "string",
       "nullable": true,
-      "metadata": {}
+      "metadata": {
+        "delta.columnMapping.id": 28,
+        "delta.columnMapping.physicalName": "merchant_shared_request_id"
+      }
     },
     {
       "name": "merchant_nsid",
       "type": "string",
       "nullable": true,
-      "metadata": {}
+      "metadata": {
+        "delta.columnMapping.id": 29,
+        "delta.columnMapping.physicalName": "merchant_nsid"
+      }
     },
     {
       "name": "refunded_on_behalf_of",
       "type": "string",
       "nullable": true,
-      "metadata": {}
-    },parse error: Invalid numeric literal at line 1, column 2833
-
+      "metadata": {
+        "delta.columnMapping.id": 30,
+        "delta.columnMapping.physicalName": "refunded_on_behalf_of"
+      }
+    },
     {
       "name": "payment_provider_to_merchant",
       "type": "string",
       "nullable": true,
-      "metadata": {}
+      "metadata": {
+        "delta.columnMapping.id": 31,
+        "delta.columnMapping.physicalName": "payment_provider_to_merchant"
+      }
     },
     {
       "name": "idempotency",
       "type": "string",
       "nullable": true,
-      "metadata": {}
+      "metadata": {
+        "delta.columnMapping.id": 34,
+        "delta.columnMapping.physicalName": "col-eeea8bdf-5e74-4088-8d9e-208fd9e55014"
+      }
+    },
+    {
+      "name": "payment_provider_operation_id",
+      "type": "string",
+      "nullable": true,
+      "metadata": {
+        "delta.columnMapping.id": 35,
+        "delta.columnMapping.physicalName": "col-a4b6a352-73cf-4af5-aae6-364c57d6a4cf"
+      }
     }
   ]
 }
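To check that the change really is additive once the column mapping noise is ignored, the two schema JSON documents can be compared at the logical level. This is only a sketch: the function names are hypothetical, it assumes the usual Delta schema layout with a top-level "fields" array, and it compares top-level fields only.

```python
import json

COLUMN_MAPPING_PREFIX = "delta.columnMapping."


def logical_fields(schema_json: str) -> dict:
    """Map field name -> (type, nullable, metadata without column mapping keys)."""
    schema = json.loads(schema_json)
    fields = {}
    for field in schema["fields"]:
        metadata = {
            key: value
            for key, value in field.get("metadata", {}).items()
            if not key.startswith(COLUMN_MAPPING_PREFIX)
        }
        # Serialize the type so that nested (struct/array/map) types compare as a whole.
        # Column mapping metadata inside nested types is NOT stripped here.
        type_repr = json.dumps(field["type"], sort_keys=True)
        fields[field["name"]] = (type_repr, field.get("nullable", True), metadata)
    return fields


def classify_change(old_json: str, new_json: str) -> dict:
    """Split a schema change into added, removed, and changed top-level fields."""
    old, new = logical_fields(old_json), logical_fields(new_json)
    return {
        "added": sorted(new.keys() - old.keys()),
        # A rename shows up as one removed plus one added name
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }
```

If "removed" and "changed" come back empty, the difference between the two versions is additive at the logical level, and everything else in the diff is column mapping metadata.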
 
I can handle the schema change manually, but it would be much better to handle it automatically.
Any ideas?