<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Autoloader directory listing not listing all files in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6881#M2883</link>
    <description>&lt;P&gt;Hi @Fabrice Deseyn, my understanding is that this could be because Autoloader returns a fixed number of results per API call, as explained here: &lt;A href="https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#how-does-directory-listing-mode-work" target="test_blank"&gt;https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#how-does-directory-listing-mode-work&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 29 Mar 2023 17:42:54 GMT</pubDate>
    <dc:creator>Lakshay</dc:creator>
    <dc:date>2023-03-29T17:42:54Z</dc:date>
    <item>
      <title>Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6877#M2879</link>
      <description>&lt;P&gt;Hi community&lt;/P&gt;&lt;P&gt;I have an Autoloader pipeline running with the following configuration (see the query definition below). Unfortunately, it does not detect all files.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/454iCBE0D4783FA0DED2/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/463i0A6A86D52B6D036E/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The folder that needs to be read has 38,246 files that all have the same schema and structure:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/464i6A17EC49FB405FBB/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If I look at `cloud_files_state`, I see only 4999 files. Am I doing something wrong? Is this an 'initial' count that will increase as these files are ingested?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/458iFC4253C1F64BBD59/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 12:14:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6877#M2879</guid>
      <dc:creator>FabriceDeseyn</dc:creator>
      <dc:date>2023-03-29T12:14:42Z</dc:date>
    </item>
    <item>
      <title>Re: Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6878#M2880</link>
      <description>&lt;P&gt;@Fabrice Deseyn&lt;/P&gt;&lt;P&gt;It looks like your storage is not set up for incremental listing. Use regular directory listing to get all of the files.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#date-partitioned-files" target="test_blank"&gt;https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#date-partitioned-files&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 12:47:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6878#M2880</guid>
      <dc:creator>daniel_sahal</dc:creator>
      <dc:date>2023-03-29T12:47:04Z</dc:date>
    </item>
    <item>
      <title>Re: Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6879#M2881</link>
      <description>&lt;P&gt;@Daniel Sahal&lt;/P&gt;&lt;P&gt;Thanks, apparently I overlooked the useIncrementalListing setting... big mistake on my side...&lt;/P&gt;&lt;P&gt;Thanks for the second pair of eyes!&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 13:19:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6879#M2881</guid>
      <dc:creator>FabriceDeseyn</dc:creator>
      <dc:date>2023-03-29T13:19:20Z</dc:date>
    </item>
    <item>
      <title>Re: Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6880#M2882</link>
      <description>&lt;P&gt;So, I changed the code (I also tried with False) and used a new checkpoint. However, this still gives me the same number of files.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image.png"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/462iC4BF57B15F611A62/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 13:45:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6880#M2882</guid>
      <dc:creator>FabriceDeseyn</dc:creator>
      <dc:date>2023-03-29T13:45:50Z</dc:date>
    </item>
    <item>
      <title>Re: Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6881#M2883</link>
      <description>&lt;P&gt;Hi @Fabrice Deseyn, my understanding is that this could be because Autoloader returns a fixed number of results per API call, as explained here: &lt;A href="https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#how-does-directory-listing-mode-work" target="test_blank"&gt;https://docs.databricks.com/ingestion/auto-loader/directory-listing-mode.html#how-does-directory-listing-mode-work&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 17:42:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6881#M2883</guid>
      <dc:creator>Lakshay</dc:creator>
      <dc:date>2023-03-29T17:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: Autoloader directory listing not listing all files</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6882#M2884</link>
      <description>&lt;P&gt;Hi @Fabrice Deseyn&lt;/P&gt;&lt;P&gt;Thank you for posting your question in our community! We are happy to assist you.&lt;/P&gt;&lt;P&gt;To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?&lt;/P&gt;&lt;P&gt;This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2023 07:39:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-directory-listing-not-listing-all-files/m-p/6882#M2884</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-03-30T07:39:12Z</dc:date>
    </item>
  </channel>
</rss>

