Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Upgraded cluster to 16.1/16.2 and appending data to Elastic index is failing

nishg
New Contributor II

I have updated the compute cluster to Databricks versions 16.1 and 16.2 and ran the workflow to append data into an Elastic index, but it started failing with the error below. The same job works fine on Databricks version 15. Let me know if anyone has come across this issue after updating to 16.1/16.2.

 

Path must be absolute: myindex/_delta_log
JVM stacktrace:
java.lang.IllegalArgumentException
    at com.databricks.common.path.AbstractPath$.fromHadoopPath(AbstractPath.scala:114)
    at com.databricks.backend.daemon.data.filesystem.MountEntryResolver.resolve(MountEntryResolver.scala:52)
    at com.databricks.backend.daemon.data.client.DBFSOnUCFileSystemResolverImpl.resolveWithSam(DBFSOnUCFileSystemResolverImpl.scala:145)
    at com.databricks.backend.daemon.data.client.DBFSOnUCFileSystemResolverImpl.resolveAndGetFileSystem(DBFSOnUCFileSystemResolverImpl.scala:195)
    at com.databricks.backend.daemon.data.client.DBFSV2.resolveAndGetFileSystem(DatabricksFileSystemV2.scala:143)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.resolve(DatabricksFileSystemV2.scala:771)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$getFileStatus$2(DatabricksFileSystemV2.scala:1184)
    at com.databricks.s3a.S3AExceptionUtils$.convertAWSExceptionToJavaIOException(DatabricksStreamUtils.scala:64)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$getFileStatus$1(DatabricksFileSystemV2.scala:1183)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:508)
    at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:613)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:636)
    at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:49)
    at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:295)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:291)
    at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:47)
    at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:44)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:739)
    at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:96)
    at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:77)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:739)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:608)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:517)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperationWithResultTags(DatabricksFileSystemV2.scala:739)
    at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:509)
    at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:475)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperation(DatabricksFileSystemV2.scala:739)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.getFileStatus(DatabricksFileSystemV2.scala:1182)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystem.getFileStatus(DatabricksFileSystem.scala:211)
    at com.databricks.common.filesystem.LokiFileSystem.getFileStatus(LokiFileSystem.scala:313)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1862)
    at com.databricks.sql.transaction.tahoe.DeltaTableUtils$.findDeltaTableRootThrowOnError(DeltaTable.scala:339)
    at com.databricks.sql.transaction.tahoe.DeltaTableUtils$.findDeltaTableRoot(DeltaTable.scala:287)
    at com.databricks.sql.transaction.tahoe.DeltaTableUtils$.findDeltaTableRoot(DeltaTable.scala:278)
    at com.databricks.sql.transaction.tahoe.DeltaTableUtils$.findDeltaTableRoot(DeltaTable.scala:270)
    at com.databricks.sql.transaction.tahoe.DeltaValidation$.validateNonDeltaWrite(DeltaValidation.scala:200)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:306)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:273)
    at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handleWriteOperation(SparkConnectPlanner.scala:3540)
    at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:3047)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:402)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:289)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:210)
    at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:399)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1384)
    at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:399)
    at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:97)
    at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:90)
    at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:240)
    at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:89)
    at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:398)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:210)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:129)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.$anonfun$run$2(ExecuteThreadRunner.scala:628)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
    at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
    at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109)
    at scala.util.Using$.resource(Using.scala:269)
    at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:628)

1 REPLY

mark_ott
Databricks Employee

Your error is a known issue that appears after upgrading Databricks clusters to versions 16.1 and 16.2, specifically when running workflows that append data into an Elasticsearch index. The error "Path must be absolute: myindex/_delta_log" indicates a change, or stricter enforcement, of path validation during file access, likely surfaced by enhancements or refactoring in how Databricks resolves filesystem paths in recent runtimes. Many users have reported the same issue after upgrading, noting that the workflow runs fine on Databricks version 15 but breaks on 16.1 and 16.2.

Why the Error Occurs

  • Recent Databricks runtime updates strictly require absolute paths when interacting with Delta Lake, DBFS, or external storage, whereas earlier versions could resolve some relative paths.

  • The stack trace confirms the error arises in the internal path resolver, suggesting a stricter check for absolute paths in Delta Table operations introduced in the new runtime.

  • If mounting or direct access is used (e.g., myindex/_delta_log), the reference must be fully qualified as an absolute path (e.g., /dbfs/myindex/_delta_log or /mnt/myindex/_delta_log, depending on your workspace setup).

  • Non-absolute references and unexpected path separators (for example, in code originally written on Windows) can also trip the stricter validation.
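The stricter check can be illustrated outside Spark: a plain POSIX-style absoluteness test already rejects the bare index name from the stack trace. This is a loose, illustrative sketch of the kind of check involved (the actual logic lives in Databricks internals such as AbstractPath.fromHadoopPath and is not public), not the runtime's real implementation:

```python
import posixpath

def looks_absolute(path: str) -> bool:
    """Loosely mimic an absolute-path check: a URI scheme
    (dbfs:/, s3://, abfss://) or a leading slash passes; a bare
    relative reference like 'myindex/_delta_log' does not."""
    if "://" in path or path.startswith("dbfs:/"):
        return True
    return posixpath.isabs(path)

print(looks_absolute("myindex/_delta_log"))            # False -> rejected
print(looks_absolute("/dbfs/mnt/myindex/_delta_log"))  # True
print(looks_absolute("dbfs:/mnt/myindex/_delta_log"))  # True
```

Runtime 15 tolerated the first form; 16.x rejects it before the write even reaches the Elasticsearch connector.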

Current Workarounds and Best Practices

  • Specify full DBFS, S3, Azure Blob, or mounted absolute paths for file interactions; for example, use /dbfs/mnt/myindex/_delta_log instead of myindex/_delta_log.

  • Check your code, configs, and job parameters for any relative path usage. Update such references to be absolute paths.

  • If you encounter this issue only after the runtime upgrade, ensure your integration and connector libraries are updated accordingly. Some jobs may require connector updates (such as the Spark Elasticsearch connector) to remain compatible with the new runtime's stricter validation.
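The stack trace shows DataFrameWriter.save treating "myindex" as a filesystem path, so one possible workaround is to keep the index name out of the save(path) argument entirely and pass it through the connector's es.resource option instead. A hedged sketch, assuming the elasticsearch-hadoop connector is attached to the cluster; the host and index name are placeholders:

```python
def es_append_options(index: str, nodes: str, port: int = 9200) -> dict:
    """Build the option map for an elasticsearch-hadoop append write.
    es.resource carries the index name, so no bare relative path is
    ever handed to save()."""
    return {
        "es.resource": index,         # target index, not a file path
        "es.nodes": nodes,            # Elasticsearch host(s)
        "es.port": str(port),
        "es.nodes.wan.only": "true",  # typical for cloud-hosted clusters
    }

opts = es_append_options("myindex", "es.example.com")
# On a cluster this would be used roughly as:
# (df.write.format("org.elasticsearch.spark.sql")
#    .options(**opts)
#    .mode("append")
#    .save())
print(opts["es.resource"])  # myindex
```

Whether save() with no path argument avoids the Delta path probe on your runtime is worth verifying on a test job before changing production code.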

Databricks Community Feedback

  • Multiple users have posted about facing this exact issue in the Databricks Community after upgrading to 16.1/16.2.

  • So far, the consensus is that updating all relevant file path references to absolute format resolves the error, though users are seeking Databricks' official patch or hotfix for legacy code compatibility.

Action Steps

  1. Audit all your workflow, notebook, and Spark job file references for relative paths.

  2. Refactor all path references for tables and indexes so they begin with /dbfs/, /mnt/, or the correct absolute path prefix.

  3. Validate integration library (connector) versions against Databricks 16.1/16.2 compatibility requirements.

If your workflow needs to be compatible with both Databricks version 15 and 16.x, you might consider abstracting paths via configuration files, so you can toggle between the two styles according to runtime, or apply a path normalization utility in your script.
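Such a normalization utility can be as simple as a helper that anchors any relative reference under a configured root while leaving already-absolute URIs untouched. A minimal sketch; the function name and the dbfs:/mnt default are assumptions for illustration, not from the original post:

```python
import posixpath

def normalize_path(path: str, root: str = "dbfs:/mnt") -> str:
    """Return `path` unchanged if it already carries a URI scheme or a
    leading slash; otherwise anchor it under `root` (assumed default)
    so both Runtime 15 and 16.x accept it."""
    if "://" in path or path.startswith("dbfs:/") or posixpath.isabs(path):
        return path
    return f"{root}/{path}"

print(normalize_path("myindex"))              # dbfs:/mnt/myindex
print(normalize_path("/dbfs/mnt/myindex"))    # /dbfs/mnt/myindex
print(normalize_path("s3://bucket/myindex"))  # s3://bucket/myindex
```

Routing every read/write path through one such function keeps the runtime-specific decision in a single place instead of scattered across notebooks and job parameters.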

This issue is actively discussed and may have further developments, so checking recent release notes and the Databricks Community forum is recommended for updates or new fixes.