Delta table takes too long to write due to S3 full scan

ivanychev
Contributor II

DBR 14.3, Spark 3.5.0. We use the AWS Glue metastore.

On August 20th, some of our pipelines started timing out while writing to a Delta table: the driver spends many hours executing post-commit hooks. We write dataframes to Delta with `mode=overwrite`, `mergeSchema=true`, and `replaceWhere=<day partition>`.
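For reference, the write pattern looks roughly like this (a minimal sketch; the table name, columns, and partition value are hypothetical placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("2024-08-20", "a", 1)], ["day", "ac_key", "cnt"])

# Overwrite a single day partition, allowing schema evolution -- the same
# pattern as described above. Table name and columns are hypothetical.
(
    df.write.format("delta")
    .mode("overwrite")
    .option("mergeSchema", "true")
    .option("replaceWhere", "day = '2024-08-20'")
    .partitionBy("day")
    .saveAsTable("delta_prod.my_table")
)
```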

Adding `DO_NOT_UPDATE_STATS=true` to the table properties didn't help. Adding `'spark.databricks.hive.stats.autogather': 'false'` and `'spark.hadoop.hive.stats.autogather': 'false'` to the Spark options didn't help either.
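Roughly how those two attempts were applied (a sketch; the table name is a placeholder, and on Databricks the Spark options would normally go into the cluster's Spark config rather than being set at runtime):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Attempt 1: table property asking the Glue client not to update stats.
spark.sql(
    "ALTER TABLE delta_prod.my_table "
    "SET TBLPROPERTIES ('DO_NOT_UPDATE_STATS' = 'true')"
)

# Attempt 2: disable Hive stats autogathering. Shown as session-level calls
# for illustration; spark.hadoop.* settings normally belong in cluster config.
spark.conf.set("spark.databricks.hive.stats.autogather", "false")
spark.conf.set("spark.hadoop.hive.stats.autogather", "false")
```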

I opened the driver's thread dump and observed a curious stack trace (attached below).

# Question 1: Why does `updateCatalog` invoke `createTable`?

app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTable(ManagedCatalogSessionCatalog.scala:763) 
app//com.databricks.sql.DatabricksSessionCatalog.createTable(DatabricksSessionCatalog.scala:233)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.updateCatalog(CreateDeltaTableCommand.scala:873)

The open-source implementation of `updateCatalog` creates the table only if it doesn't already exist, but our table does exist.
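For context, a rough Python-style paraphrase of that open-source logic (illustration only; the real code is Scala, and the Databricks fork may behave differently, which is the point of this question):

```python
# Illustration only: paraphrase of the open-source (delta-io) updateCatalog
# decision; names here are hypothetical, not the actual API.
def update_catalog(catalog, table):
    if not catalog.table_exists(table.identifier):
        # Expected only when the table is first created.
        catalog.create_table(table)
    else:
        # Expected path for an existing table: refresh its metadata.
        catalog.alter_table(table)
```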

# Question 2: `updateTableStatsFast` takes all the time and scans the whole table

 

com.amazonaws.glue.shims.AwsGlueSparkHiveShims.updateTableStatsFast(AwsGlueSparkHiveShims.java:62) 
com.amazonaws.glue.catalog.metastore.GlueMetastoreClientDelegate.alterTable(GlueMetastoreClientDelegate.java:444)

How do I opt out of updating Glue stats? They are mostly useless, and in this particular case the update causes a full listing of the whole Delta table on S3 on every write.


# Observed stack trace

 

java.base@17.0.12/jdk.internal.misc.Unsafe.park(Native Method) - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6fdb721 
java.base@17.0.12/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
java.base@17.0.12/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
java.base@17.0.12/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465)
java.base@17.0.12/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436)
java.base@17.0.12/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1625)
app//org.spark_project.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
app//org.spark_project.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:306)
app//org.spark_project.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:223)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.super$borrowObject(LocalHiveClientImpl.scala:131)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.$anonfun$borrowObject$1(LocalHiveClientImpl.scala:131)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool$$Lambda$5925/0x00007fbcc5678f58.apply(Unknown Source)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:411)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:397)
app//com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.borrowObject(LocalHiveClientImpl.scala:129)
app//org.apache.spark.sql.hive.client.PoolingHiveClient.retain(PoolingHiveClient.scala:181)
app//org.apache.spark.sql.hive.HiveExternalCatalog.maybeSynchronized(HiveExternalCatalog.scala:114)
app//org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$1(HiveExternalCatalog.scala:154)
app//org.apache.spark.sql.hive.HiveExternalCatalog$$Lambda$5854/0x00007fbcc5655bc0.apply(Unknown Source)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:411)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:397)
app//com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
app//org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:153)
app//org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:333)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$databaseExists$1(ExternalCatalogWithListener.scala:93)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener$$Lambda$7640/0x00007fbcc5ae4868.apply$mcZ$sp(Unknown Source)
app//scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
app//org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:984)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$profile$1(ExternalCatalogWithListener.scala:54)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener$$Lambda$7544/0x00007fbcc5ab4478.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.profile(ExternalCatalogWithListener.scala:53)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.databaseExists(ExternalCatalogWithListener.scala:93)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.databaseExists(SessionCatalog.scala:837)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.requireDbExists(SessionCatalog.scala:766)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.createTable(SessionCatalog.scala:930)
app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTableInternal(ManagedCatalogSessionCatalog.scala:802)
app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTable(ManagedCatalogSessionCatalog.scala:763)
app//com.databricks.sql.DatabricksSessionCatalog.createTable(DatabricksSessionCatalog.scala:233)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.updateCatalog(CreateDeltaTableCommand.scala:873)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.runPostCommitUpdates(CreateDeltaTableCommand.scala:279)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.handleCommit(CreateDeltaTableCommand.scala:259)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.$anonfun$run$2(CreateDeltaTableCommand.scala:169)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand$$Lambda$8751/0x00007fbcc5c96140.apply(Unknown Source)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag(DeltaLogging.scala:225)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag$(DeltaLogging.scala:212)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.withOperationTypeTag(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$2(DeltaLogging.scala:164)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging$$Lambda$8219/0x00007fbcc5bd4c78.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordFrameProfile(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$1(DeltaLogging.scala:163)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging$$Lambda$8217/0x00007fbcc5bd46d8.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:573)
app//com.databricks.logging.UsageLogging$$Lambda$681/0x00007fbcc3eb28e8.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:669)
app//com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:687)
app//com.databricks.logging.UsageLogging$$Lambda$684/0x00007fbcc3eb3158.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
app//com.databricks.logging.UsageLogging$$Lambda$591/0x00007fbcc3e53418.apply(Unknown Source)
app//scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
app//com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:216)
app//com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
app//com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
app//com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:472)
app//com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)
app//com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:664)
app//com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:582)
app//com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:573)
app//com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:542)
app//com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:68)
app//com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:150)
app//com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:68)
app//com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:55)
app//com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:109)
app//com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:429)
app//com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:408)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordOperation(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperationInternal(DeltaLogging.scala:162)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:152)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:142)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordDeltaOperation(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.run(CreateDeltaTableCommand.scala:148)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.$anonfun$createDeltaTable$1(DeltaCatalog.scala:335)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$$Lambda$8054/0x00007fbcc5b93988.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.recordFrameProfile(DeltaCatalog.scala:117)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.com$databricks$sql$transaction$tahoe$catalog$DeltaCatalog$$createDeltaTable(DeltaCatalog.scala:158)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2.$anonfun$commitStagedChanges$1(DeltaCatalog.scala:1130)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2$$Lambda$8051/0x00007fbcc5b92aa0.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.recordFrameProfile(DeltaCatalog.scala:117)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2.commitStagedChanges(DeltaCatalog.scala:1089)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.$anonfun$writeToTable$2(WriteToDataSourceV2Exec.scala:674)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec$$Lambda$7718/0x00007fbcc5b11b18.apply(Unknown Source)
app//org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1546)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.$anonfun$writeToTable$1(WriteToDataSourceV2Exec.scala:661)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec$$Lambda$7717/0x00007fbcc5b11848.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable(WriteToDataSourceV2Exec.scala:679)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable$(WriteToDataSourceV2Exec.scala:655)
app//org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.writeToTable(WriteToDataSourceV2Exec.scala:210)
app//org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.run(WriteToDataSourceV2Exec.scala:268)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.$anonfun$result$2(V2CommandExec.scala:48)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec$$Lambda$7696/0x00007fbcc5affbe0.apply(Unknown Source)
app//org.apache.spark.sql.execution.SparkPlan.runCommandWithAetherOff(SparkPlan.scala:178)
app//org.apache.spark.sql.execution.SparkPlan.runCommandInAetherOrSpark(SparkPlan.scala:189)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.$anonfun$result$1(V2CommandExec.scala:48)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec$$Lambda$7695/0x00007fbcc5aff910.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:47) - locked org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec@21108331
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:45)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:56)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$4(QueryExecution.scala:358)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5795/0x00007fbcc563ea80.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:166)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:358)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5226/0x00007fbcc5472800.apply(Unknown Source)
app//org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$9(SQLExecution.scala:392)
app//org.apache.spark.sql.execution.SQLExecution$$$Lambda$5239/0x00007fbcc54785b8.apply(Unknown Source)
app//org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:700)
app//org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:277)
app//org.apache.spark.sql.execution.SQLExecution$$$Lambda$5228/0x00007fbcc5472da0.apply(Unknown Source)
app//org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1175)
app//org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:164)
app//org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:637)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:357)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5225/0x00007fbcc5472530.apply(Unknown Source)
app//org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1103)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:353)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5224/0x00007fbcc54703f0.apply(Unknown Source)
app//org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:312)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:350)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:334)
app//org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:505)
app//org.apache.spark.sql.catalyst.trees.TreeNode$$Lambda$3659/0x00007fbcc4def7d8.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83)
app//org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:505)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:343)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:339)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:481)
app//org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:334)
app//org.apache.spark.sql.execution.QueryExecution$$Lambda$3864/0x00007fbcc4eec870.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:400)
app//org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:334)
app//org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:271) - locked org.apache.spark.sql.execution.QueryExecution@febcbf6
app//org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:268)
app//org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:429)
app//org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:1040)
app//org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:746)
app//org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:677)
java.base@17.0.12/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base@17.0.12/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
java.base@17.0.12/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base@17.0.12/java.lang.reflect.Method.invoke(Method.java:569)
app//py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
app//py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
app//py4j.Gateway.invoke(Gateway.java:306)
app//py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
app//py4j.commands.CallCommand.execute(CallCommand.java:79)
app//py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
app//py4j.ClientServerConnection.run(ClientServerConnection.java:119)
java.base@17.0.12/java.lang.Thread.run(Thread.java:840)

 

Sergey

3 REPLIES

ivanychev
Contributor II

I also observe that both pipelines have "Metastore is down." events. However, the logs contain no stack traces describing what this actually means.

Sergey

ivanychev
Contributor II

We also observe that the `METASTORE_DOWN` event correlates with the following logs in `log4j` (all `<redacted_value>`s are unique):

```
24/08/21 12:27:02 INFO GenerateSymlinkManifest: Generated manifest partitions for s3://constructor-analytics-data/tables/delta_prod/query_item_pairs_from_qrl [379]:
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
... (many more identical lines, one per manifest partition)
```
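Those lines come from the symlink manifest post-commit hook. If no downstream engine (e.g. Presto/Athena) consumes the manifests, it may be worth checking whether the table has manifest generation enabled; a sketch, with a hypothetical table name (this is a speculative mitigation, not something from the original post):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Inspect whether automatic symlink manifest generation (the post-commit
# hook producing the log lines above) is enabled on the table.
spark.sql(
    "SHOW TBLPROPERTIES delta_prod.query_item_pairs_from_qrl"
).show(truncate=False)

# If nothing consumes the manifests, the hook can be switched off.
spark.sql(
    "ALTER TABLE delta_prod.query_item_pairs_from_qrl SET TBLPROPERTIES "
    "('delta.compatibility.symlinkFormatManifest.enabled' = 'false')"
)
```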

Sergey

ivanychev
Contributor II
(Accepted solution)

Setting `spark.databricks.delta.catalog.update.enabled=true` helped, but I still don't understand why the problem started occurring.

https://docs.databricks.com/en/archive/external-metastores/external-hive-metastore.html#external-apa...
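For anyone applying the workaround, a minimal sketch (set per session here; on a real cluster the setting would typically go into the cluster's Spark config instead):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The setting from the accepted answer.
spark.conf.set("spark.databricks.delta.catalog.update.enabled", "true")
```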

Sergey
