2 weeks ago
Hi,
I have a source table that is a Delta Live Tables (DLT) streaming table created using dlt.auto_cdc logic. I now want to create another streaming table that filters the records from that table per client, and it should also use auto CDC logic so that it loads incrementally. I tried doing this with a materialized view, but it refreshed fully instead of incrementally. So I want to create a client-specific table, but it gives me this error:
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 80319c3f-5654-4089-9a10-ecea0180cf09, runId = c6d93065-56a6-443d-b834-9e767e111e12] terminated with exception: [DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update (for example WRITE (Map(mode -> Overwrite, statsOnLoad -> false))) in the source table at version 3. This is currently not supported. If this is going to happen regularly and you are okay to skip changes, set the option 'skipChangeCommits' to 'true'. If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory or do a full refresh if you are using DLT. If you need to handle these changes, please switch to MVs. The source table can be found at path abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/2b764114-c9ed-459a-a54a-e68c77a6f6af. SQLSTATE: XXKST
=== Streaming Query ===
Identifier: taps.india.__materialization_mat_421df95e_f7b4_46a9_a8dd_ac09e7cf071b_customer_india11_temp_1 [id = 80319c3f-5654-4089-9a10-ecea0180cf09, runId = c6d93065-56a6-443d-b834-9e767e111e12]
Current Start Offsets: {DeltaSource[abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/2b764114-c9ed-459a-a54a-e68c77a6f6af]: {"sourceVersion":1,"reservoirId":"b31124cc-7c53-4865-bdb5-ec0300cd6ef8","reservoirVersion":3,"index":-1,"isStartingVersion":false}}
Current End Offsets: {DeltaSource[abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/2b764114-c9ed-459a-a54a-e68c77a6f6af]: {"sourceVersion":1,"reservoirId":"b31124cc-7c53-4865-bdb5-ec0300cd6ef8","reservoirVersion":3,"index":-1,"isStartingVersion":false}}
Current State: ACTIVE
Thread State: RUNNABLE
Logical Plan:
~WriteToMicroBatchDataSourceV1 DeltaSink[abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/aa64aa94-44ce-4803-b1e4-63ec3537bc95], 80319c3f-5654-4089-9a10-ecea0180cf09, [path=abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/aa64aa94-44ce-4803-b1e4-63ec3537bc95, queryName=taps.india.__materialization_mat_421df95e_f7b4_46a9_a8dd_ac09e7cf071b_customer_india11_temp_1, checkpointLocation=abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/aa64aa94-44ce-4803-b1e4-63ec3537bc95/_dlt_metadata/checkpoints/taps.india.customer_india11_temp/0], Append
+- ~CollectMetrics pipelines.expectations.taps.india.customer_india11_temp, [count(1) AS total#1880L, count(CASE WHEN false THEN 1 END) AS dropped#1881L, count(CASE WHEN false THEN 1 END) AS allowed#1882L], 186
+- ~Project [customer_id#1723, name#1724, email#1725, address#1726, event_time#1727, country#1728, _rescued_data#1729]
+- ~StreamingExecutionRelation DeltaSource[abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/2b764114-c9ed-459a-a54a-e68c77a6f6af], [__enzyme__row__id__#1722, customer_id#1723, name#1724, email#1725, address#1726, event_time#1727, country#1728, _rescued_data#1729]
at org.apache.spark.sql.execution.streaming.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:554)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:49)
at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:295)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:291)
at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:47)
at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:44)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:30)
at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:96)
at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:77)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:30)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags0(DatabricksSparkUsageLogger.scala:91)
at com.databricks.spark.util.DatabricksSparkUsageLogger.withAttributionTags(DatabricksSparkUsageLogger.scala:195)
at com.databricks.spark.util.UsageLogging.$anonfun$withAttributionTags$1(UsageLogger.scala:668)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:780)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:789)
at com.databricks.spark.util.UsageLogging.withAttributionTags(UsageLogger.scala:668)
at com.databricks.spark.util.UsageLogging.withAttributionTags$(UsageLogger.scala:666)
at org.apache.spark.sql.execution.streaming.StreamExecution.withAttributionTags(StreamExecution.scala:87)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:383)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.$anonfun$run$3(StreamExecution.scala:286)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:97)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.$anonfun$run$2(StreamExecution.scala:286)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:285)
com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update (for example WRITE (Map(mode -> Overwrite, statsOnLoad -> false))) in the source table at version 3. This is currently not supported. If this is going to happen regularly and you are okay to skip changes, set the option 'skipChangeCommits' to 'true'. If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory or do a full refresh if you are using DLT. If you need to handle these changes, please switch to MVs. The source table can be found at path abfss://unity-catalog-storage@dbstoragevyadqj5lvd744.dfs.core.windows.net/2998048117548069/__unitystorage/catalogs/78873f87-08d4-40bc-8256-611e6c893ef7/tables/2b764114-c9ed-459a-a54a-e68c77a6f6af.
at com.databricks.sql.transaction.tahoe.DeltaErrorsBase.deltaSourceIgnoreChangesError(DeltaErrors.scala:192)
at com.databricks.sql.transaction.tahoe.DeltaErrorsBase.deltaSourceIgnoreChangesError$(DeltaErrors.scala:186)
at com.databricks.sql.transaction.tahoe.DeltaErrors$.deltaSourceIgnoreChangesError(DeltaErrors.scala:3734)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.validateCommitAndDecideSkipping(DeltaSource.scala:1140)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.$anonfun$getFileChanges$3(DeltaSource.scala:838)
at com.databricks.sql.transaction.tahoe.storage.ClosableIterator.processAndClose(ClosableIterator.scala:33)
at com.databricks.sql.transaction.tahoe.storage.ClosableIterator.processAndClose$(ClosableIterator.scala:31)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource$$anon$3.processAndClose(DeltaSource.scala:1670)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.$anonfun$getFileChanges$2(DeltaSource.scala:833)
at com.databricks.sql.transaction.tahoe.storage.ClosableIterator$IteratorFlatMapCloseOp$$anon$2.<init>(ClosableIterator.scala:71)
at com.databricks.sql.transaction.tahoe.storage.ClosableIterator$IteratorFlatMapCloseOp$.flatMapWithClose$extension(ClosableIterator.scala:68)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.filterAndIndexDeltaLogs$1(DeltaSource.scala:828)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.$anonfun$getFileChanges$5(DeltaSource.scala:861)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:583)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.getFileChanges(DeltaSource.scala:854)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.getFileChangesWithRateLimit(DeltaSource.scala:305)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.getFileChangesWithRateLimit$(DeltaSource.scala:292)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.getFileChangesWithRateLimit(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.getNextOffsetFromPreviousOffset(DeltaSource.scala:469)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.getNextOffsetFromPreviousOffset$(DeltaSource.scala:453)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.com$databricks$sql$transaction$tahoe$sources$DeltaSourceEdge$$super$getNextOffsetFromPreviousOffset(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceEdge.getNextOffsetFromPreviousOffset(DeltaSourceEdge.scala:620)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceEdge.getNextOffsetFromPreviousOffset$(DeltaSourceEdge.scala:614)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.getNextOffsetFromPreviousOffset(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.$anonfun$latestOffsetInternal$1(DeltaSource.scala:1005)
at scala.Option.map(Option.scala:230)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.latestOffsetInternal(DeltaSource.scala:1005)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.initLastOffsetForTriggerAvailableNow(DeltaSource.scala:275)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.initLastOffsetForTriggerAvailableNow$(DeltaSource.scala:273)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.com$databricks$sql$transaction$tahoe$sources$DeltaSourceEdge$$super$initLastOffsetForTriggerAvailableNow(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceEdge.initLastOffsetForTriggerAvailableNow(DeltaSourceEdge.scala:846)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceEdge.initLastOffsetForTriggerAvailableNow$(DeltaSourceEdge.scala:844)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.initLastOffsetForTriggerAvailableNow(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.initForTriggerAvailableNowIfNeeded(DeltaSource.scala:269)
at com.databricks.sql.transaction.tahoe.sources.DeltaSourceBase.initForTriggerAvailableNowIfNeeded$(DeltaSource.scala:265)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.initForTriggerAvailableNowIfNeeded(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.$anonfun$latestOffset$1(DeltaSource.scala:997)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag(DeltaLogging.scala:325)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag$(DeltaLogging.scala:312)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.withOperationTypeTag(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$2(DeltaLogging.scala:178)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:418)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:416)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.recordFrameProfile(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$1(DeltaLogging.scala:177)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:508)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:613)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:636)
at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:49)
at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:295)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:291)
at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:47)
at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:44)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:30)
at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:96)
at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:77)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:30)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:608)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:517)
at com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:30)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:509)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:475)
at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:30)
at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:87)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:173)
at com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:78)
at com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:65)
at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:132)
at com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:537)
at com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:516)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.recordOperation(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperationInternal(DeltaLogging.scala:176)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:166)
at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:155)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.recordDeltaOperation(DeltaSource.scala:751)
at com.databricks.sql.transaction.tahoe.sources.DeltaSource.latestOffset(DeltaSource.scala:995)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$constructNextBatch$4(MicroBatchExecution.scala:1091)
at org.apache.spark.sql.execution.streaming.ProgressContext.reportTimeTaken(ProgressReporter.scala:328)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$constructNextBatch$2(MicroBatchExecution.scala:1089)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$constructNextBatch$1(MicroBatchExecution.scala:1072)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:1877)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.constructNextBatch(MicroBatchExecution.scala:1068)
at org.apache.spark.sql.execution.streaming.MultiBatchRollbackSupport.constructNextBatchWithRollbackHandling(MultiBatchRollbackSupport.scala:144)
at org.apache.spark.sql.execution.streaming.MultiBatchRollbackSupport.constructNextBatchWithRollbackHandling$(MultiBatchRollbackSupport.scala:132)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.constructNextBatchWithRollbackHandling(MicroBatchExecution.scala:78)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$executeOneBatch$4(MicroBatchExecution.scala:726)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.execution.streaming.ProgressContext.reportTimeTaken(ProgressReporter.scala:328)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$executeOneBatch$3(MicroBatchExecution.scala:705)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:49)
at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:295)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:291)
at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:47)
at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:44)
at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:30)
at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:96)
at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:77)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:30)
at com.databricks.spark.util.PublicDBLogging.withAttributionTags0(DatabricksSparkUsageLogger.scala:91)
at com.databricks.spark.util.DatabricksSparkUsageLogger.withAttributionTags(DatabricksSparkUsageLogger.scala:195)
at com.databricks.spark.util.UsageLogging.$anonfun$withAttributionTags$1(UsageLogger.scala:668)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:780)
at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:789)
at com.databricks.spark.util.UsageLogging.withAttributionTags(UsageLogger.scala:668)
at com.databricks.spark.util.UsageLogging.withAttributionTags$(UsageLogger.scala:666)
at org.apache.spark.sql.execution.streaming.StreamExecution.withAttributionTags(StreamExecution.scala:87)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.executeOneBatch(MicroBatchExecution.scala:699)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStreamWithListener$1(MicroBatchExecution.scala:660)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStreamWithListener$1$adapted(MicroBatchExecution.scala:660)
at org.apache.spark.sql.execution.streaming.ConcurrentExecutor.$anonfun$runOneBatch$4(TriggerExecutor.scala:675)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:157)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.IdentityClaim$.withClaim(IdentityClaim.scala:48)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.$anonfun$runWithCaptured$4(SparkThreadLocalForwardingThreadPoolExecutor.scala:113)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:112)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:89)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:154)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:157)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:840)
Wednesday
@tenzinpro have you looked into the documentation on incremental refresh for materialized views with streaming tables yet? https://docs.databricks.com/aws/en/optimizations/incremental-refresh .. there's a section in there which jumped out at me.
I'd suggest reading a little more of the article to see if anything you're doing violates an incremental-refresh requirement.
All the best,
BS
Friday
Hi @tenzinpro ,
This is an expected error: "[DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update".
As explained in the error: This is currently not supported. If this is going to happen regularly and you are okay to skip changes, set the option 'skipChangeCommits' to 'true'. If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory or do a full refresh if you are using DLT. If you need to handle these changes, please switch to MVs.
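For the 'skipChangeCommits' route mentioned in the error text, here is a minimal sketch of what the reader could look like. The helper name `read_client_rows`, the source table name `taps.india.customer_source`, and the filter value `India` are assumptions for illustration only; the `country` column and the target name `customer_india11_temp` are taken from the logical plan in the stack trace. Note that with this option, updates and deletes on the source are skipped, so they will not show up in the client table.

```python
def read_client_rows(spark, source_table, country):
    """Stream one client's rows from a Delta table, skipping change commits.

    `spark` is the active SparkSession; the names here are hypothetical.
    """
    # The value is embedded in a SQL filter string, so reject quotes up front.
    if "'" in country:
        raise ValueError("country must not contain a single quote")
    return (
        spark.readStream
        # Skip commits that update, delete, or overwrite existing rows
        # instead of failing with DELTA_SOURCE_TABLE_IGNORE_CHANGES.
        .option("skipChangeCommits", "true")
        .table(source_table)
        .where(f"country = '{country}'")
    )


# Inside a pipeline source file this would typically be wrapped as below.
# It is left commented out because `dlt` is only importable in a pipeline run.
#
# import dlt
#
# @dlt.table(name="customer_india11_temp")
# def customer_india11_temp():
#     return read_client_rows(spark, "taps.india.customer_source", "India")
```

This only fits if the client table is allowed to miss updates and deletes from the source; if those changes matter, one of the other options is the safer choice.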
I recommend one of the options below for this use case:
1. Define a materialized view (MV) instead of a streaming table (ST), or
2. Use skipChangeCommits to skip changes made on the source, such as updates and deletes, or
3. Read the source with change data feed (CDF) and follow the steps here:
https://community.databricks.com/t5/technical-blog/propagating-deletes-managing-data-removal-using-d...
Incremental refresh for materialized views, as suggested by @BS_THE_ANALYST, is the best way.
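For option 3, a rough sketch of reading the source through its change data feed and filtering per client might look like this. It assumes change data feed is enabled and readable on the source table in your runtime; the helper name `read_client_changes`, the table and view names, and the choice of `customer_id` as the key are assumptions (the column itself appears in the logical plan above). The linked blog post remains the reference for the actual steps.

```python
def read_client_changes(spark, source_table, country):
    """Stream one client's row-level changes from a Delta change data feed.

    `spark` is the active SparkSession; the names here are hypothetical.
    """
    # The value is embedded in a SQL filter string, so reject quotes up front.
    if "'" in country:
        raise ValueError("country must not contain a single quote")
    return (
        spark.readStream
        # Emit inserts, updates, and deletes as rows with a _change_type column.
        .option("readChangeFeed", "true")
        .table(source_table)
        # Keep the new image of each update; the old image is not needed.
        .where("_change_type != 'update_preimage'")
        .where(f"country = '{country}'")
    )


# In a pipeline, the change rows could then feed a second auto CDC flow.
# Left commented out because `dlt` is only importable in a pipeline run.
#
# import dlt
#
# @dlt.view(name="customer_india_changes")
# def customer_india_changes():
#     return read_client_changes(spark, "taps.india.customer_source", "India")
#
# dlt.create_streaming_table("customer_india")
# dlt.create_auto_cdc_flow(
#     target="customer_india",
#     source="customer_india_changes",
#     keys=["customer_id"],                      # assumed key column
#     sequence_by="_commit_version",
#     apply_as_deletes="_change_type = 'delete'",
#     except_column_list=["_change_type", "_commit_version", "_commit_timestamp"],
# )
```

Unlike skipChangeCommits, this keeps updates and deletes flowing into the client table, at the cost of a more involved pipeline definition.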