<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Auto Loader with File Notification mode not picking up new files in Delta Live Tables pipeline in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/auto-loader-with-file-notification-mode-not-picking-up-new-files/m-p/97019#M39396</link>
    <description>&lt;P&gt;Dear,&lt;/P&gt;&lt;P&gt;I am developing a Delta Live Tables pipeline and using Auto Loader with File Notification mode to pick up files inside an Azure storage account (which is &lt;U&gt;not&lt;/U&gt; the storage used by the default catalog). When I run a full refresh of the target streaming table, all existing files are processed. However, &lt;STRONG&gt;when I refresh the pipeline later on, new files are not picked up.&lt;/STRONG&gt; I am using DLT with Unity Catalog and the default managed catalog.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Looking at the storage queue, I see the following streamStatus:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Unknown.&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Reason: Failed to check the last update time of checkpoint directory abfss://unity-catalog-storage@&amp;lt;managed_storage_account&amp;gt;.dfs.core.windows.net/..., exception:&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;Failure to initialize configuration for storage account &amp;lt;managed_storage_account&amp;gt;..core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:715)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2100)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.&amp;lt;init&amp;gt;(AzureBlobFileSystemStore.java:272)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:239)&lt;/EM&gt;&lt;BR 
/&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:168)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.FileSystemCache.getOrCompute(FileSystemCache.scala:43)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:164)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:258)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3611)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.FileSystem.get(FileSystem.java:554)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.fileNotification.autoIngest.ResourceManagementUtils$.getStreamStatus(ResourceManagementUtils.scala:62)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.aqs.autoIngest.CloudFilesAzureResourceManager.$anonfun$listNotificationServices$1(CloudFilesAzureResourceManager.scala:78)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;scala.collection.immutable.List.map(List.scala:293)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.aqs.autoIngest.CloudFilesAzureResourceManager.listNotificationServices(CloudFilesAzureResourceManager.scala:74)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;java.lang.reflect.Method.invoke(Method.java:498)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;/EM&gt;&lt;BR 
/&gt;&lt;EM&gt;py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.Gateway.invoke(Gateway.java:306)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.ClientServerConnection.run(ClientServerConnection.java:119)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;java.lang.Thread.run(Thread.java:750)&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;What I already did:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Transferred ownership of the DLT pipeline to the service principal (SP)&lt;/LI&gt;&lt;LI&gt;Granted the SP access to the default catalog's external location (where the checkpoint is located)&lt;/LI&gt;&lt;LI&gt;Double-checked that the SP has ownership of&amp;nbsp;&lt;/LI&gt;&lt;LI&gt;Double-checked that the SP has the Contributor, EventGrid EventSubscription Contributor and Storage Queue Data Contributor roles:&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="rvo19941_0-1730383733629.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/12441i5ABD0C6F7FA04D97/image-size/medium?v=v2&amp;amp;px=400" role="button" title="rvo19941_0-1730383733629.png" alt="rvo19941_0-1730383733629.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 31 Oct 2024 15:52:43 GMT</pubDate>
    <dc:creator>rvo19941</dc:creator>
    <dc:date>2024-10-31T15:52:43Z</dc:date>
    <item>
      <title>Auto Loader with File Notification mode not picking up new files in Delta Live Tables pipeline</title>
      <link>https://community.databricks.com/t5/data-engineering/auto-loader-with-file-notification-mode-not-picking-up-new-files/m-p/97019#M39396</link>
      <description>&lt;P&gt;Dear,&lt;/P&gt;&lt;P&gt;I am developing a Delta Live Tables pipeline and using Auto Loader with File Notification mode to pick up files inside an Azure storage account (which is &lt;U&gt;not&lt;/U&gt; the storage used by the default catalog). When I run a full refresh of the target streaming table, all existing files are processed. However, &lt;STRONG&gt;when I refresh the pipeline later on, new files are not picked up.&lt;/STRONG&gt; I am using DLT with Unity Catalog and the default managed catalog.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Looking at the storage queue, I see the following streamStatus:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Unknown.&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Reason: Failed to check the last update time of checkpoint directory abfss://unity-catalog-storage@&amp;lt;managed_storage_account&amp;gt;.dfs.core.windows.net/..., exception:&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;Failure to initialize configuration for storage account &amp;lt;managed_storage_account&amp;gt;..core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:715)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2100)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.&amp;lt;init&amp;gt;(AzureBlobFileSystemStore.java:272)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:239)&lt;/EM&gt;&lt;BR 
/&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:168)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.FileSystemCache.getOrCompute(FileSystemCache.scala:43)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:164)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:258)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3611)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.FileSystem.get(FileSystem.java:554)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.fileNotification.autoIngest.ResourceManagementUtils$.getStreamStatus(ResourceManagementUtils.scala:62)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.aqs.autoIngest.CloudFilesAzureResourceManager.$anonfun$listNotificationServices$1(CloudFilesAzureResourceManager.scala:78)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;scala.collection.immutable.List.map(List.scala:293)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;com.databricks.sql.aqs.autoIngest.CloudFilesAzureResourceManager.listNotificationServices(CloudFilesAzureResourceManager.scala:74)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;java.lang.reflect.Method.invoke(Method.java:498)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)&lt;/EM&gt;&lt;BR 
/&gt;&lt;EM&gt;py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.Gateway.invoke(Gateway.java:306)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.commands.CallCommand.execute(CallCommand.java:79)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;py4j.ClientServerConnection.run(ClientServerConnection.java:119)&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;java.lang.Thread.run(Thread.java:750)&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;What I already did:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Transferred ownership of the DLT pipeline to the service principal (SP)&lt;/LI&gt;&lt;LI&gt;Granted the SP access to the default catalog's external location (where the checkpoint is located)&lt;/LI&gt;&lt;LI&gt;Double-checked that the SP has ownership of&amp;nbsp;&lt;/LI&gt;&lt;LI&gt;Double-checked that the SP has the Contributor, EventGrid EventSubscription Contributor and Storage Queue Data Contributor roles:&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="rvo19941_0-1730383733629.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/12441i5ABD0C6F7FA04D97/image-size/medium?v=v2&amp;amp;px=400" role="button" title="rvo19941_0-1730383733629.png" alt="rvo19941_0-1730383733629.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2024 15:52:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/auto-loader-with-file-notification-mode-not-picking-up-new-files/m-p/97019#M39396</guid>
      <dc:creator>rvo19941</dc:creator>
      <dc:date>2024-10-31T15:52:43Z</dc:date>
    </item>
    <item>
      <title>Re: Auto Loader with File Notification mode not picking up new files in Delta Live Tables pipeline</title>
      <link>https://community.databricks.com/t5/data-engineering/auto-loader-with-file-notification-mode-not-picking-up-new-files/m-p/98614#M39759</link>
      <description>&lt;P&gt;Based on the error "&lt;EM&gt;Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key&lt;/EM&gt;", the pipeline was still trying to use account key authentication instead of service principal authentication. Can we see your Auto Loader code?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 13 Nov 2024 06:33:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/auto-loader-with-file-notification-mode-not-picking-up-new-files/m-p/98614#M39759</guid>
      <dc:creator>SparkJun</dc:creator>
      <dc:date>2024-11-13T06:33:44Z</dc:date>
    </item>
  </channel>
</rss>

