Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Add Raw Tables to Unity Catalog?

Ali_Ahmad
New Contributor II

Hello everyone,

Hope someone here can help me out as I am a bit stuck 🙂

I am currently loading data from a number of sources to parquet using ADF. This lands in an ADLS Gen2 storage account as .parquet files. This is the staging / landing zone area.

From there I am using a Databricks notebook to move the data from staging to bronze, converting it to Delta along the way.
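Roughly, that notebook step looks like this (a minimal sketch; the staging container and folder names are placeholders):

# Minimal sketch of the staging-to-bronze conversion (paths are placeholders)
staging_path = "abfss://staging@dataplatformdevadls.dfs.core.windows.net/X/XX/"
bronze_path = "abfss://bronze@dataplatformdevadls.dfs.core.windows.net/X/XX/"

df = spark.read.parquet(staging_path)   # raw parquet files landed by ADF

(df.write
   .format("delta")      # convert to Delta on write
   .mode("append")       # or "overwrite", depending on the load pattern
   .save(bronze_path))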

What I am trying to do is add those tables using Auto Loader so that I can get them into the catalog and refer to them in dbt when working from bronze, to silver, to gold. I can browse the files fine, but the moment I try to preview a table there is an issue.

The error I get is:

Your request failed with status FAILED: [BAD_REQUEST] Reading from a Delta table is not supported with this syntax. If you would like to consume data from Delta, please refer to the docs: read a Delta table (https://docs.microsoft.com/azure/databricks/delta/tutorial#read), or read a Delta table as a stream source (https://docs.microsoft.com/azure/databricks/structured-streaming/delta-lake#table-streaming-reads-an...). The streaming source from Delta is already optimized for incremental consumption of data.

And after that I can't get any further because I can't preview the table to add it. 
 
Any input or thoughts would be great and thanks for any responses!
2 REPLIES

Kaniz
Community Manager

Hi @Ali_Ahmad

I understand your situation. It seems you're trying to use Databricks Auto Loader to read Delta Lake files, which is what causes the error.

The issue here is that Auto Loader and Delta Lake are not compatible in this direction: Auto Loader is designed to track which raw files it has already processed, while a Delta table is more than just a directory of files (it also carries a transaction log). Auto Loader supports source formats like JSON, CSV, PARQUET, AVRO, ORC, TEXT, and BINARYFILE, but not Delta.
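For illustration, if the goal is to ingest the raw parquet files from the landing zone (rather than the Delta output), pointing Auto Loader at that folder would look roughly like this; the paths are placeholders and the table name is taken from your setup:

# Hypothetical sketch: run Auto Loader against the parquet landing zone, not the Delta folder
staging_path = "abfss://staging@dataplatformdevadls.dfs.core.windows.net/X/XX/"            # placeholder
checkpoint = "abfss://bronze@dataplatformdevadls.dfs.core.windows.net/_checkpoints/x_x/"   # placeholder

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")           # one of the supported source formats
    .option("cloudFiles.schemaLocation", checkpoint)  # where Auto Loader stores the inferred schema
    .load(staging_path)
    .writeStream
    .option("checkpointLocation", checkpoint)         # tracks which files were already processed
    .trigger(availableNow=True)                       # ingest all new files, then stop
    .toTable("xxx_dataplatform_dev_dbx.bronze.x_x"))  # registers the table in Unity Catalog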

As the error message suggests, you should use format("delta") when reading from or writing to a Delta table. If you want to disable the format check, you can use the following command:

SET spark.databricks.delta.formatCheck.enabled=false
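For reference, reading an existing Delta table directly looks like this (a sketch; the path is a placeholder):

# Batch read of a Delta table, using format("delta") as the error message suggests
bronze_path = "abfss://bronze@dataplatformdevadls.dfs.core.windows.net/X/XX/"  # placeholder

df = spark.read.format("delta").load(bronze_path)

# Or as a streaming source, which is already optimized for incremental consumption
stream_df = spark.readStream.format("delta").load(bronze_path)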

For incremental consumption of data, you might want to consider using Delta's change data feed. This way, you do not have to read the whole Delta table but can ingest only the changes.
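A minimal illustrative sketch (the table name is a placeholder):

# Illustrative: enable the change data feed on a table, then read only the changes
# (my_catalog.bronze.my_table is a placeholder table name)
spark.sql("""
    ALTER TABLE my_catalog.bronze.my_table
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

changes = (spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 0)   # or startingTimestamp for a point in time
    .table("my_catalog.bronze.my_table"))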

I hope this helps! Let me know if you have any other questions. 😊

 

Ali_Ahmad
New Contributor II

Thank you for your response, @Kaniz.

I ended up adding the tables using this type of script:

%sql
-- Create table in Metastore for Bronze Tables
CREATE TABLE xxx_dataplatform_dev_dbx.bronze.x_x
(
    date STRING,
    orgid LONG,
    org STRING,
    advertiserid LONG,
    advertiser STRING,
    campaignid LONG,
    campaign STRING,
    adgroupid LONG,
    adgroup STRING,
    siteid LONG,
    site STRING,
    platformid LONG,
    platform STRING,
    imps LONG,
    clicks LONG,
    media_spend DOUBLE,
    tech_spend STRING,
    tech_fee DOUBLE,
    currency STRING,
    sourceSystem STRING,
    loadDate STRING
)
USING DELTA
LOCATION 'abfss://bronze@dataplatformdevadls.dfs.core.windows.net/X/XX/';
 
And that works well. I will look into what you mentioned here and see if it gives a better workflow experience; otherwise this approach was fine!
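In case it helps anyone else, a quick illustrative check that the table is registered and queryable from a notebook:

# Illustrative: confirm the table is visible in Unity Catalog and queryable
display(spark.table("xxx_dataplatform_dev_dbx.bronze.x_x").limit(10))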
 
Appreciate you responding to my question 🙂

 
