05-16-2022 10:07 AM
We are building a Delta Live Tables pipeline that ingests CSV files from AWS S3 using cloudFiles.
We need to access the modification timestamp of each file.
As documented here, we tried selecting the `_metadata` column in a task in the Delta Live Tables pipeline, without success. Are we doing something wrong?
The code snippet is below:
    @dlt.table(
        name = "bronze",
        comment = f"New {SCHEMA} data incrementally ingested from S3",
        table_properties = {
            "quality": "bronze"
        }
    )
    def bronze_job():
        return spark \
            .readStream \
            .format("cloudFiles") \
            .option("cloudFiles.useNotifications", "true") \
            .option("cloudFiles.format", "csv") \
            .option("cloudFiles.region", "eu-west-1") \
            .option("delimiter", ",") \
            .option("escape", "\"") \
            .option("header", "false") \
            .option("encoding", "UTF-8") \
            .schema(cdc_schema) \
            .load("/mnt/%s/cdc/%s" % (RAW_MOUNT_NAME, SCHEMA)) \
            .select("*", "_metadata")
Thanks.
Tejas
05-17-2022 02:35 AM
Are you using Databricks Runtime 10.5?
05-17-2022 05:42 AM
Yes, on a standalone cluster (that is, any cluster outside of the DLT pipeline) this feature works with DBR 10.5.
I found the issue: we cannot choose the runtime in DLT pipeline settings (we are unable to set `spark_version`). 😫
07-02-2022 11:42 AM
I'm having the same problem. Does this answer mean that there is no way to get file metadata using Delta Live Tables?
07-03-2022 11:13 AM
Currently, DLT is running on runtime 10.3. Once it is 10.5 or higher, it should be possible.
08-03-2022 05:54 AM
Update:
We were able to test the `_metadata` column feature in DLT "preview" mode (which runs DBR 11.0). Databricks doesn't recommend running production workloads in "preview" mode, but we are nevertheless glad to be able to use this feature in DLT.
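For anyone landing here for the file modification timestamp specifically, here is a minimal sketch of pulling individual fields out of `_metadata` rather than selecting the whole struct. It assumes a runtime where the `_metadata` column is available (DBR 10.5 or higher, per this thread). The helper name `with_file_metadata` and the output column names are hypothetical, chosen only for illustration.

```python
def with_file_metadata(df):
    """Append selected fields of the hidden `_metadata` column to a DataFrame.

    `df` is assumed to be a Spark DataFrame (batch or streaming) read from
    files on a runtime that exposes `_metadata`. The aliases below are
    illustrative names, not anything mandated by Databricks.
    """
    return df.selectExpr(
        "*",
        # Full path of the source file each row came from
        "_metadata.file_path AS source_file_path",
        # Last modification timestamp of that source file
        "_metadata.file_modification_time AS source_file_modified_at",
    )
```

In the `bronze_job` function from the original post, this would presumably replace the final `.select("*", "_metadata")`, by wrapping the DataFrame returned from `.load(...)` in `with_file_metadata(...)`.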