01-02-2023 05:12 AM
When using Delta Live Tables with SQL-syntax ignoreMissingFiles-option is not working and you are getting error anyway.(See picture below)
Link to feature:
https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/options#generic-options
Here is the SQL –code for Bronze -layer
CREATE OR REFRESH TEMPORARY STREAMING LIVE TABLE bronze_InventSite
AS
SELECT
_c0 as LastProcessedChange_DateTime
,_c1 as DataLakeModified_DateTime
,_c2 as Start_LSN
,_c3 as End_LSN
,_c4 as DML_Action
,_c5 as Seq_Val
,_c6 as Update_Mask
,_c7 as RECID
,_c8 as DefaultDimension
,_c9 as DefaultInventStatusId
,_c10 as Name
,_c11 as OrderEntryDeadlineGroupId
,_c12 as SiteId
,_c13 as TaxBranchRefRecId
,_c14 as Timezone
,_c15 as IsReceivingWarehouseOverrideAllowed
,_c16 as DataAreaId
,_c17 as PARTITION
,_c18 as RECVERSION
,_c19 as SYSROWVERSIONNUMBER
FROM cloud_files("/mnt/d365fscm/ChangeFeed/InventSite", "csv",map("delimiter", ",", "header", "false", "ignoreMissingFiles", "true"))
And here is SQL- code for SCD2 table:
CREATE OR REFRESH STREAMING LIVE TABLE silver_InventSite
;
APPLY CHANGES INTO
live.silver_InventSite
FROM
stream(live.bronze_InventSite)
KEYS
(RECID)
IGNORE NULL UPDATES
SEQUENCE BY
Start_LSN /*Start_LSN*/
STORED AS
SCD TYPE 2;
01-02-2023 07:28 AM
You could also use inferSchema. ignoreMissingFiles option is to handle files that were accidentally deleted before being fully processed, so it has nothing related to the schema.
01-02-2023 05:41 AM
It looks like AutoLoader needs to have schema even if ignoreMissingFiles is set to true (if there's no file then it can't read one).
Try to specify schema using schemaLocation.
01-02-2023 05:51 AM
Thank you, I get it work like this:
CREATE OR REFRESH TEMPORARY STREAMING LIVE TABLE bronze_InventSite
AS
SELECT
_c0 as LastProcessedChange_DateTime
,_c1 as DataLakeModified_DateTime
,_c2 as Start_LSN
,_c3 as End_LSN
,_c4 as DML_Action
,_c5 as Seq_Val
,_c6 as Update_Mask
,_c7 as RECID
,_c8 as DefaultDimension
,_c9 as DefaultInventStatusId
,_c10 as Name
,_c11 as OrderEntryDeadlineGroupId
,_c12 as SiteId
,_c13 as TaxBranchRefRecId
,_c14 as Timezone
,_c15 as IsReceivingWarehouseOverrideAllowed
,_c16 as DataAreaId
,_c17 as PARTITION
,_c18 as RECVERSION
,_c19 as SYSROWVERSIONNUMBER
FROM cloud_files("/mnt/d365fscm/ChangeFeed/InventSite", "csv",map("delimiter", ",", "header", "false","schema","_c0 STRING,_c1 STRING,_c2 STRING,_c3 STRING,_c4 STRING,_c5 STRING,_c6 STRING,_c7 STRING,_c8 STRING,_c9 STRING,_c10 STRING,_c11 STRING,_c12 STRING,_c13 STRING,_c14 STRING,_c15 STRING,_c16 STRING,_c17 STRING,_c18 STRING,_c19 STRING"))
01-02-2023 07:28 AM
You could also use inferSchema. ignoreMissingFiles option is to handle files that were accidentally deleted before being fully processed, so it has nothing related to the schema.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group