<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I read all the files in a folder on S3 into several pandas dataframes? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-can-i-read-all-the-files-in-a-folder-on-s3-into-several/m-p/27321#M19198</link>
    <description>&lt;P&gt;Hi @zhaoxuan210,&lt;/P&gt;&lt;P&gt;Please see the answer linked below:&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3" target="_blank"&gt;https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 27 Jan 2020 06:03:43 GMT</pubDate>
    <dc:creator>shyam_9</dc:creator>
    <dc:date>2020-01-27T06:03:43Z</dc:date>
    <item>
      <title>How can I read all the files in a folder on S3 into several pandas dataframes?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-read-all-the-files-in-a-folder-on-s3-into-several/m-p/27320#M19197</link>
      <description>
&lt;PRE&gt;import pandas as pd
import glob

path = "s3://somewhere/"  # use your path
all_files = glob.glob(path + "/*.csv")
print(all_files)

li = []
for filename in all_files:
    dfi = pd.read_csv(filename, names=['acct_id', 'SOR_ID'], dtype={'acct_id': str, 'SOR_ID': str}, header=None)
    li.append(dfi)&lt;/PRE&gt;
&lt;P&gt;I can read a file if I pass the path of a single one directly, but glob does not work here: all_files comes back as an empty list []. How can I get the list of filenames in the folder?&lt;/P&gt;
</description>
      <pubDate>Thu, 16 Jan 2020 17:12:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-read-all-the-files-in-a-folder-on-s3-into-several/m-p/27320#M19197</guid>
      <dc:creator>zhaoxuan210</dc:creator>
      <dc:date>2020-01-16T17:12:50Z</dc:date>
    </item>
    <item>
      <title>Re: How can I read all the files in a folder on S3 into several pandas dataframes?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-can-i-read-all-the-files-in-a-folder-on-s3-into-several/m-p/27321#M19198</link>
      <description>&lt;P&gt;Hi @zhaoxuan210,&lt;/P&gt;&lt;P&gt;Please see the answer linked below:&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3" target="_blank"&gt;https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 27 Jan 2020 06:03:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-can-i-read-all-the-files-in-a-folder-on-s3-into-several/m-p/27321#M19198</guid>
      <dc:creator>shyam_9</dc:creator>
      <dc:date>2020-01-27T06:03:43Z</dc:date>
    </item>
  </channel>
</rss>

