08-21-2024 11:02 AM
DBR 14.3, Spark 3.5.0. We use AWS Glue Metastore.
On August 20th, some of our pipelines started timing out while writing to a Delta table: the driver spends many hours executing post-commit hooks. We write dataframes to Delta with `mode=overwrite`, `mergeSchema=true`, and `replaceWhere=<day partition>`.
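For reference, a minimal sketch of the write pattern described above (the source and target table names and the concrete day value are hypothetical):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical source; in reality the dataframe comes from an upstream pipeline step.
df = spark.table("source_db.events").where(F.col("day") == "2024-08-20")

(
    df.write.format("delta")
    .mode("overwrite")
    .option("mergeSchema", "true")                 # allow schema evolution on write
    .option("replaceWhere", "day = '2024-08-20'")  # overwrite only one day partition
    .saveAsTable("analytics_db.daily_table")       # table registered in the Glue metastore
)
```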
Adding `DO_NOT_UPDATE_STATS=true` to the table properties didn't help. Setting `'spark.databricks.hive.stats.autogather': 'false'` and `'spark.hadoop.hive.stats.autogather': 'false'` in the options didn't help either.
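Roughly how we applied those two attempts (table name is hypothetical; note that `spark.hadoop.*` values may need to be set in the cluster Spark config rather than per-session to take effect):

```python
# Attempt 1: table property (had no effect on the post-commit hang)
spark.sql("""
    ALTER TABLE analytics_db.daily_table
    SET TBLPROPERTIES ('DO_NOT_UPDATE_STATS' = 'true')
""")

# Attempt 2: disable Hive stats autogather via Spark conf (also no effect)
spark.conf.set("spark.databricks.hive.stats.autogather", "false")
spark.conf.set("spark.hadoop.hive.stats.autogather", "false")
```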
I opened the driver's thread dump and observed a curious stack trace (attached below).
# Question 1: why does `updateCatalog` invoke `createTable`?
app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTable(ManagedCatalogSessionCatalog.scala:763)
app//com.databricks.sql.DatabricksSessionCatalog.createTable(DatabricksSessionCatalog.scala:233)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.updateCatalog(CreateDeltaTableCommand.scala:873)
The open-source implementation of `updateCatalog` creates the table only if it doesn't exist, but our table does exist.
# Question 2: `updateTableStatsFast` takes all the time and scans the whole table
com.amazonaws.glue.shims.AwsGlueSparkHiveShims.updateTableStatsFast(AwsGlueSparkHiveShims.java:62)
com.amazonaws.glue.catalog.metastore.GlueMetastoreClientDelegate.alterTable(GlueMetastoreClientDelegate.java:444)
How do I opt out of updating Glue stats? They are mostly useless for us, and in this particular case the update causes a full listing of the whole Delta table on S3 with every write.
# Observed stack trace
java.base@17.0.12/jdk.internal.misc.Unsafe.park(Native Method) - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6fdb721
java.base@17.0.12/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
java.base@17.0.12/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
java.base@17.0.12/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465)
java.base@17.0.12/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436)
java.base@17.0.12/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1625)
app//org.spark_project.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1323)
app//org.spark_project.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:306)
app//org.spark_project.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:223)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.super$borrowObject(LocalHiveClientImpl.scala:131)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.$anonfun$borrowObject$1(LocalHiveClientImpl.scala:131)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool$$Lambda$5925/0x00007fbcc5678f58.apply(Unknown Source)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:411)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:397)
app//com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
app//org.apache.spark.sql.hive.client.LocalHiveClientsPool.borrowObject(LocalHiveClientImpl.scala:129)
app//org.apache.spark.sql.hive.client.PoolingHiveClient.retain(PoolingHiveClient.scala:181)
app//org.apache.spark.sql.hive.HiveExternalCatalog.maybeSynchronized(HiveExternalCatalog.scala:114)
app//org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$1(HiveExternalCatalog.scala:154)
app//org.apache.spark.sql.hive.HiveExternalCatalog$$Lambda$5854/0x00007fbcc5655bc0.apply(Unknown Source)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:411)
app//com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:397)
app//com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
app//org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:153)
app//org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:333)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$databaseExists$1(ExternalCatalogWithListener.scala:93)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener$$Lambda$7640/0x00007fbcc5ae4868.apply$mcZ$sp(Unknown Source)
app//scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
app//org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:984)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$profile$1(ExternalCatalogWithListener.scala:54)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener$$Lambda$7544/0x00007fbcc5ab4478.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.profile(ExternalCatalogWithListener.scala:53)
app//org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.databaseExists(ExternalCatalogWithListener.scala:93)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.databaseExists(SessionCatalog.scala:837)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.requireDbExists(SessionCatalog.scala:766)
app//org.apache.spark.sql.catalyst.catalog.SessionCatalogImpl.createTable(SessionCatalog.scala:930)
app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTableInternal(ManagedCatalogSessionCatalog.scala:802)
app//com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTable(ManagedCatalogSessionCatalog.scala:763)
app//com.databricks.sql.DatabricksSessionCatalog.createTable(DatabricksSessionCatalog.scala:233)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.updateCatalog(CreateDeltaTableCommand.scala:873)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.runPostCommitUpdates(CreateDeltaTableCommand.scala:279)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.handleCommit(CreateDeltaTableCommand.scala:259)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.$anonfun$run$2(CreateDeltaTableCommand.scala:169)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand$$Lambda$8751/0x00007fbcc5c96140.apply(Unknown Source)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag(DeltaLogging.scala:225)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag$(DeltaLogging.scala:212)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.withOperationTypeTag(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$2(DeltaLogging.scala:164)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging$$Lambda$8219/0x00007fbcc5bd4c78.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordFrameProfile(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$1(DeltaLogging.scala:163)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging$$Lambda$8217/0x00007fbcc5bd46d8.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:573)
app//com.databricks.logging.UsageLogging$$Lambda$681/0x00007fbcc3eb28e8.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:669)
app//com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:687)
app//com.databricks.logging.UsageLogging$$Lambda$684/0x00007fbcc3eb3158.apply(Unknown Source)
app//com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
app//com.databricks.logging.UsageLogging$$Lambda$591/0x00007fbcc3e53418.apply(Unknown Source)
app//scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
app//com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:216)
app//com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
app//com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
app//com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:472)
app//com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:455)
app//com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:664)
app//com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:582)
app//com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:573)
app//com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:542)
app//com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:27)
app//com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:68)
app//com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:150)
app//com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:68)
app//com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:55)
app//com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:109)
app//com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:429)
app//com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:408)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordOperation(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperationInternal(DeltaLogging.scala:162)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:152)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:142)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordDeltaOperation(CreateDeltaTableCommand.scala:70)
app//com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.run(CreateDeltaTableCommand.scala:148)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.$anonfun$createDeltaTable$1(DeltaCatalog.scala:335)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$$Lambda$8054/0x00007fbcc5b93988.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.recordFrameProfile(DeltaCatalog.scala:117)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.com$databricks$sql$transaction$tahoe$catalog$DeltaCatalog$$createDeltaTable(DeltaCatalog.scala:158)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2.$anonfun$commitStagedChanges$1(DeltaCatalog.scala:1130)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2$$Lambda$8051/0x00007fbcc5b92aa0.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294)
app//com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog.recordFrameProfile(DeltaCatalog.scala:117)
app//com.databricks.sql.transaction.tahoe.catalog.DeltaCatalog$StagedDeltaTableV2.commitStagedChanges(DeltaCatalog.scala:1089)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.$anonfun$writeToTable$2(WriteToDataSourceV2Exec.scala:674)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec$$Lambda$7718/0x00007fbcc5b11b18.apply(Unknown Source)
app//org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1546)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.$anonfun$writeToTable$1(WriteToDataSourceV2Exec.scala:661)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec$$Lambda$7717/0x00007fbcc5b11848.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable(WriteToDataSourceV2Exec.scala:679)
app//org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable$(WriteToDataSourceV2Exec.scala:655)
app//org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.writeToTable(WriteToDataSourceV2Exec.scala:210)
app//org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.run(WriteToDataSourceV2Exec.scala:268)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.$anonfun$result$2(V2CommandExec.scala:48)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec$$Lambda$7696/0x00007fbcc5affbe0.apply(Unknown Source)
app//org.apache.spark.sql.execution.SparkPlan.runCommandWithAetherOff(SparkPlan.scala:178)
app//org.apache.spark.sql.execution.SparkPlan.runCommandInAetherOrSpark(SparkPlan.scala:189)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.$anonfun$result$1(V2CommandExec.scala:48)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec$$Lambda$7695/0x00007fbcc5aff910.apply(Unknown Source)
app//com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:47) - locked org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec@21108331
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:45)
app//org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:56)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$4(QueryExecution.scala:358)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5795/0x00007fbcc563ea80.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:166)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:358)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5226/0x00007fbcc5472800.apply(Unknown Source)
app//org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$9(SQLExecution.scala:392)
app//org.apache.spark.sql.execution.SQLExecution$$$Lambda$5239/0x00007fbcc54785b8.apply(Unknown Source)
app//org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:700)
app//org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:277)
app//org.apache.spark.sql.execution.SQLExecution$$$Lambda$5228/0x00007fbcc5472da0.apply(Unknown Source)
app//org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1175)
app//org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:164)
app//org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:637)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:357)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5225/0x00007fbcc5472530.apply(Unknown Source)
app//org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1103)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:353)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1$$Lambda$5224/0x00007fbcc54703f0.apply(Unknown Source)
app//org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:312)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:350)
app//org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:334)
app//org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:505)
app//org.apache.spark.sql.catalyst.trees.TreeNode$$Lambda$3659/0x00007fbcc4def7d8.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83)
app//org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:505)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:343)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:339)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:39)
app//org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:481)
app//org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:334)
app//org.apache.spark.sql.execution.QueryExecution$$Lambda$3864/0x00007fbcc4eec870.apply(Unknown Source)
app//org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:400)
app//org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:334)
app//org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:271) - locked org.apache.spark.sql.execution.QueryExecution@febcbf6
app//org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:268)
app//org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:429)
app//org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:1040)
app//org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:746)
app//org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:677)
java.base@17.0.12/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base@17.0.12/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
java.base@17.0.12/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base@17.0.12/java.lang.reflect.Method.invoke(Method.java:569)
app//py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
app//py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
app//py4j.Gateway.invoke(Gateway.java:306)
app//py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
app//py4j.commands.CallCommand.execute(CallCommand.java:79)
app//py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
app//py4j.ClientServerConnection.run(ClientServerConnection.java:119)
java.base@17.0.12/java.lang.Thread.run(Thread.java:840)
08-21-2024 12:07 PM
We also observe that the `METASTORE_DOWN` event correlates with the following logs in `log4j` (all `<redacted_value>`s are unique):
```
24/08/21 12:27:02 INFO GenerateSymlinkManifest: Generated manifest partitions for s3://constructor-analytics-data/tables/delta_prod/query_item_pairs_from_qrl [379]:
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
day=2024-07-01/ac_key=<redacted_value>
```
08-28-2024 04:16 AM
Setting `spark.databricks.delta.catalog.update.enabled=true` helped, but I still don't understand why the problem started occurring.
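For anyone hitting the same issue, this is roughly how we applied the setting (shown per-session here; a cluster-level Spark config entry works the same way):

```python
# Workaround: apply before running the Delta write.
spark.conf.set("spark.databricks.delta.catalog.update.enabled", "true")
```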