<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20460#M13801</link>
    <description>&lt;P&gt;Currently, DLT is running on runtime 10.3. Once it is 10.5 or higher, it should be possible.&lt;/P&gt;</description>
    <pubDate>Sun, 03 Jul 2022 18:13:52 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-07-03T18:13:52Z</dc:date>
    <item>
      <title>Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20455#M13796</link>
      <description>&lt;P&gt;We are building a Delta Live Tables pipeline that ingests CSV files from AWS S3 using cloudFiles, and we need access to each file's modification timestamp.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/data/file-metadata-column.html" alt="https://docs.databricks.com/data/file-metadata-column.html" target="_blank"&gt;As documented here&lt;/A&gt;, we tried selecting the `_metadata` column in a task in a Delta Live Tables pipeline, without success. Are we doing something wrong?&lt;/P&gt;&lt;P&gt;The code snippet is below:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;@dlt.table(
    name = "bronze",
    comment = f"New {SCHEMA} data incrementally ingested from S3",
    table_properties = {
        "quality": "bronze"
    }
)
def bronze_job():
    # Auto Loader (cloudFiles) stream over the raw CSV files in S3
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.useNotifications", "true")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.region", "eu-west-1")
        .option("delimiter", ",")
        .option("escape", "\"")
        .option("header", "false")
        .option("encoding", "UTF-8")
        .schema(cdc_schema)
        .load("/mnt/%s/cdc/%s" % (RAW_MOUNT_NAME, SCHEMA))
        # Select all data columns plus the hidden file-metadata column
        .select("*", "_metadata")
    )&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Tejas&lt;/P&gt;</description>
      <pubDate>Mon, 16 May 2022 17:07:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20455#M13796</guid>
      <dc:creator>tej1</dc:creator>
      <dc:date>2022-05-16T17:07:45Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20456#M13797</link>
      <description>&lt;P&gt;Are you using Databricks Runtime 10.5?&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 09:35:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20456#M13797</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-05-17T09:35:19Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20457#M13798</link>
      <description>&lt;P&gt;Yes. On a standalone cluster (any cluster outside of the DLT pipeline), this feature works with DBR 10.5.&lt;/P&gt;&lt;P&gt;I found the issue. &lt;A href="https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-api-guide.html#pipelinesnewcluster" alt="https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-api-guide.html#pipelinesnewcluster" target="_blank"&gt;We cannot choose the runtime (`spark_version` cannot be set) in DLT pipeline settings. &lt;span class="lia-unicode-emoji" title=":tired_face:"&gt;😫&lt;/span&gt; &lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 12:42:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20457#M13798</guid>
      <dc:creator>tej1</dc:creator>
      <dc:date>2022-05-17T12:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20459#M13800</link>
      <description>&lt;P&gt;I'm having the same problem. Does this answer mean that there is no way to get file metadata using Delta Live Tables?&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2022 18:42:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20459#M13800</guid>
      <dc:creator>colt</dc:creator>
      <dc:date>2022-07-02T18:42:22Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20460#M13801</link>
      <description>&lt;P&gt;Currently, DLT is running on runtime 10.3. Once it is 10.5 or higher, it should be possible.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Jul 2022 18:13:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20460#M13801</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-07-03T18:13:52Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20461#M13802</link>
      <description>&lt;P&gt;Update:&lt;/P&gt;&lt;P&gt;We were able to test the `_metadata` column feature in DLT "preview" mode (which runs DBR 11.0). Databricks doesn't recommend running production workloads in "preview" mode, but we are nevertheless glad to be able to use this feature in DLT.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Aug 2022 12:54:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/trouble-accessing-metadata-column-using-cloudfiles-in-delta-live/m-p/20461#M13802</guid>
      <dc:creator>tej1</dc:creator>
      <dc:date>2022-08-03T12:54:25Z</dc:date>
    </item>
  </channel>
</rss>

