Data Engineering

DLT pipelines with UC

ws4100e
New Contributor II

I am trying to run a (very simple) DLT pipeline whose resulting materialized table is published to a UC schema that has a managed storage location defined (within an existing EXTERNAL LOCATION). According to the documentation:

  • Publishing to schemas that specify a managed storage location is supported only in the preview channel.

So, unsurprisingly, running this pipeline on the current channel ends with:

com.databricks.pipelines.common.CustomException: [DLT ERROR CODE: EXECUTION_SERVICE_STARTUP_FAILURE] 
Schemas with specified storage locations are not currently supported for UC enabled pipelines.

But when I switch to the preview channel, it still does not work; this time the error is different:

com.databricks.pipelines.common.CustomException: [DLT ERROR CODE: EXECUTION_SERVICE_STARTUP_FAILURE] 
Schema tstexternal has a specified storage location that does not match the pipeline's root location: None

Is there any way to make this work? Can I set up "the pipeline's root location" somehow?

Any help welcome.

Thx

Pawel

6 REPLIES

data-engineer-d
New Contributor III

@ws4100e Did you select the target catalog and schema in the pipeline settings?
To persist to UC managed schemas, you currently need to select the catalog and specify the schema there.
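For reference, this is roughly where those fields live in the pipeline settings JSON ("JSON" view in the pipeline UI). This is only a sketch; the pipeline name, catalog, schema, and notebook path below are placeholders, not values from this thread:

```json
{
  "name": "my_dlt_pipeline",
  "channel": "PREVIEW",
  "catalog": "my_catalog",
  "target": "tstexternal",
  "libraries": [
    { "notebook": { "path": "/Workspace/Users/someone@example.com/dlt_notebook" } }
  ]
}
```

"catalog" selects the UC catalog and "target" is the schema the materialized tables are published into.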

Yes, this is exactly what I did - I selected a catalog and a schema (with a storage location assigned) within it.

data-engineer-d
New Contributor III

I was receiving the same error; however, it was resolved after selecting the right schema and granting the right permissions while creating a fresh pipeline.

Can you please share the code writing the table?

ws4100e
New Contributor II

I tried this with the simplest possible pipeline - I use Auto Loader to read CSV files from Azure storage. I have sufficient privileges on the selected schema.

 

CREATE OR REFRESH STREAMING LIVE TABLE employees_raw_el
AS SELECT *
FROM cloud_files(
  "abfss://container1@sa12345678.dfs.core.windows.net/csvs/",
  "csv",
  map(
    "header", "true",    
    "delimiter", ";",
    "inferSchema", "true"
  )
);

CREATE OR REFRESH LIVE TABLE employees_bronze_el
COMMENT "Test external location"
TBLPROPERTIES ("table.usage" = "tests")
AS SELECT *
  FROM live.employees_raw_el;
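As a side note (an untested sketch, not a confirmed fix for the storage-location error): Auto Loader has its own CSV schema-inference option, cloudFiles.inferColumnTypes, which is the documented way to get typed columns with cloud_files, rather than the batch reader's inferSchema. The first table could then be written as:

```sql
-- Sketch: same raw table, but using Auto Loader's own
-- schema-inference option instead of inferSchema.
CREATE OR REFRESH STREAMING LIVE TABLE employees_raw_el
AS SELECT *
FROM cloud_files(
  "abfss://container1@sa12345678.dfs.core.windows.net/csvs/",
  "csv",
  map(
    "header", "true",
    "delimiter", ";",
    "cloudFiles.inferColumnTypes", "true"
  )
);
```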

 

DataGeek_JT
New Contributor II

Did this get resolved? I am getting the same issue.

ws4100e
New Contributor II

Unfortunately not :(. 
