How to specify a path while creating tables using DLT
01-23-2025 01:30 AM
Hi All,
I am trying to create a table using DLT and would like to specify the path where all its files should reside. I am trying something like this:
```python
import dlt
from pyspark.sql import functions as F

dlt.create_streaming_table(
    name="test",
    schema="""product_id STRING NOT NULL PRIMARY KEY,
        sap_key_test STRING NOT NULL,
        org_name_xyz STRING NOT NULL,
        brand STRING NOT NULL,
        country STRING NOT NULL,
        created TIMESTAMP NOT NULL,
        last_updated TIMESTAMP NOT NULL
    """,
    cluster_by=["brand", "country"],
    table_properties={
        "quality": "silver",
        "delta.feature.timestampNtz": "supported",
    },
    comment="product_hierarchy table in silver layer",
    path="s3://bi-rootbucket-dev/analytics/silver/test/",
)

# ignore_null_updates set to True to prevent overwriting product_id and created columns
dlt.apply_changes(
    source="webshop_cdc",
    target="test",
    keys=["product_id", "sales_org_test"],
    sequence_by=F.col("synced"),
    except_column_list=[
        "synced",
        "record_deleted",
        "product_online",
    ],
    stored_as_scd_type=2,
    # track_history_column_list=["product_id", "sales_org_test"],
    # ignore_null_updates=True,
    apply_as_deletes=F.expr("record_deleted = true"),
)
```
But when I run the DLT pipeline, I get this error:
```
java.lang.IllegalArgumentException: Cannot specify an explicit path for a table when using Unity Catalog. Remove the explicit path: s3://bi-rootbucket-dev/analytics/silver/test/ set for table `test`.
at com.databricks.pipelines.execution.service.UpdateValidation$.$anonfun$validateExplicitPathNotSetOnUCTable$2(UpdateValidation.scala:557)
at com.databricks.pipelines.execution.service.UpdateValidation$.$anonfun$validateExplicitPathNotSetOnUCTable$2$adapted(UpdateValidation.scala:537)
at scala.Option.foreach(Option.scala:407)
at com.databricks.pipelines.execution.service.UpdateValidation$.$anonfun$validateExplicitPathNotSetOnUCTable$1(UpdateValidation.scala:537)
at com.databricks.pipelines.execution.service.UpdateValidation$.$anonfun$validateExplicitPathNotSetOnUCTable$1$adapted(UpdateValidation.scala:535)
at scala.collection.immutable.List.foreach(List.scala:431)
at com.databricks.pipelines.execution.service.UpdateValidation$.validateExplicitPathNotSetOnUCTable(UpdateValidation.scala:535)
at com.databricks.pipelines.execution.service.UpdateValidation$.validate(UpdateValidation.scala:76)
at com.databricks.pipelines.execution.service.DLTComputeUpdateContext.validate(DLTComputeUpdateContext.scala:114)
at com.databricks.pipelines.execution.core.UpdateExecution.initializationForUpdates(UpdateExecution.scala:1104)
at com.databricks.pipelines.execution.core.UpdateExecution.$anonfun$executeUpdate$2(UpdateExecution.scala:626)
at com.databricks.pipelines.execution.core.UpdateExecution.$anonfun$executeStage$1(UpdateExecution.scala:471)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$3(DeltaPipelinesUsageLogging.scala:123)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter.executeWithPeriodicReporting(OperationStatusReporter.scala:120)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter$.executeWithPeriodicReporting(OperationStatusReporter.scala:160)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$6(DeltaPipelinesUsageLogging.scala:143)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:528)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:633)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:656)
at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48)
at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:276)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:272)
at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46)
at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.withAttributionContext(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95)
at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.withAttributionTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:628)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:537)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.recordOperationWithResultTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:529)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:495)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.recordOperation(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.recordOperation0(DeltaPipelinesUsageLogging.scala:67)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$1(DeltaPipelinesUsageLogging.scala:135)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation(DeltaPipelinesUsageLogging.scala:113)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation$(DeltaPipelinesUsageLogging.scala:109)
at com.databricks.pipelines.execution.core.UpdateExecution.recordPipelinesOperation(UpdateExecution.scala:92)
at com.databricks.pipelines.execution.core.UpdateExecution.executeStage(UpdateExecution.scala:471)
at com.databricks.pipelines.execution.core.UpdateExecution.$anonfun$executeUpdate$1(UpdateExecution.scala:624)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
at com.databricks.pipelines.execution.core.UpdateExecution.executeUpdate(UpdateExecution.scala:624)
at com.databricks.pipelines.execution.core.UpdateExecution.$anonfun$start$3(UpdateExecution.scala:304)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48)
at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:276)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:272)
at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46)
at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.withAttributionContext(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95)
at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76)
at com.databricks.pipelines.execution.core.monitoring.PublicLogging.withAttributionTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging$$anon$1.runWithAttributionTags(DeltaPipelinesUsageLogging.scala:85)
at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.withDbAttributionTags(DeltaPipelinesUsageLogging.scala:92)
at com.databricks.pipelines.execution.core.monitoring.DeltaPipelinesUsageLogging.withDbAttributionTags$(DeltaPipelinesUsageLogging.scala:91)
at com.databricks.pipelines.execution.core.UpdateExecution.withDbAttributionTags(UpdateExecution.scala:92)
at com.databricks.pipelines.execution.core.UpdateExecution.$anonfun$start$1(UpdateExecution.scala:252)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.pipelines.execution.core.BaseUCContext.$anonfun$runWithNewUCS$1(BaseUCContext.scala:1462)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108)
at com.databricks.pipelines.execution.core.BaseUCContext.runWithNewUCS(BaseUCContext.scala:1459)
at com.databricks.pipelines.execution.core.UCContextCompanion$OptionUCContextHelper.runWithNewUCSIfAvailable(BaseUCContext.scala:3165)
at com.databricks.pipelines.execution.core.UpdateExecution.start(UpdateExecution.scala:239)
at com.databricks.pipelines.execution.service.ExecutionBackend$$anon$2.$anonfun$run$2(ExecutionBackend.scala:821)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.pipelines.execution.core.CommandContextUtils$.withCommandContext(CommandContextUtils.scala:99)
at com.databricks.pipelines.execution.service.ExecutionBackend$$anon$2.run(ExecutionBackend.scala:816)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:157)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.IdentityClaim$.withClaim(IdentityClaim.scala:48)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.$anonfun$runWithCaptured$4(SparkThreadLocalForwardingThreadPoolExecutor.scala:113)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:112)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:89)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:154)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:157)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
```
How can I specify a path while creating tables in DLT? Is there another approach I could take here?
01-25-2025 06:50 PM
I think you need to create an external volume pointing to the file location, and then use the volume path to access the files directly: https://docs.databricks.com/en/volumes/managed-vs-external.html
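For example, something along these lines (a sketch only; the catalog, schema, and volume names are placeholders, and the S3 path must be covered by an existing external location you have rights on):

```python
# Hypothetical names: replace main.silver.test_files with your own
# catalog.schema.volume. Requires an external location covering the S3 path
# and the CREATE EXTERNAL VOLUME privilege on the schema.
spark.sql("""
    CREATE EXTERNAL VOLUME IF NOT EXISTS main.silver.test_files
    LOCATION 's3://bi-rootbucket-dev/analytics/silver/test/'
""")

# Files under that S3 prefix are then addressable via the volume path:
files = dbutils.fs.ls("/Volumes/main/silver/test_files/")
```

Note that a volume gives you governed file access at that location; the DLT table itself is still stored wherever Unity Catalog decides.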
01-27-2025 11:31 PM
Hi @Rjdudley,
Thanks for the suggestion. Let me try this and see if it resolves my issue.
01-27-2025 10:55 PM
The DLT pipeline needs to be configured to write the result to a Delta table rather than to a specific path.
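For the table definition above, that simply means removing the `path` argument and letting Unity Catalog manage the storage location, roughly like this (a minimal sketch with the schema trimmed for brevity):

```python
import dlt

dlt.create_streaming_table(
    name="test",
    schema="""product_id STRING NOT NULL PRIMARY KEY,
        brand STRING NOT NULL,
        country STRING NOT NULL
    """,
    cluster_by=["brand", "country"],
    table_properties={"quality": "silver"},
    comment="product_hierarchy table in silver layer",
    # No `path` argument: with Unity Catalog, the storage location is
    # derived from the managed location of the target catalog/schema.
)
```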
01-27-2025 11:31 PM
Hi Lakshay,
Could you please provide an example, or a link where this is explained?
I tried creating a Delta table and using it in my DLT pipeline, but it didn't work; I got an error saying the table already exists.
02-04-2025 11:07 AM
I have the same issue. I don't like that tables are saved under a random name inside __unitystorage. java.lang.IllegalArgumentException: Cannot specify an explicit path for a table when using Unity Catalog. Remove the explicit path:
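One partial workaround I'm considering (a sketch, assuming the catalog name is a placeholder and an external location for the bucket already exists) is to give the target schema its own managed location, so the UUID-named directories at least land under my own bucket:

```python
# Sketch: set a managed storage root for the target schema. Unity Catalog
# still creates UUID-named table directories, but they land under this path.
spark.sql("""
    CREATE SCHEMA IF NOT EXISTS main.silver
    MANAGED LOCATION 's3://bi-rootbucket-dev/analytics/silver/'
""")
```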

