How can I read all the files in a folder on S3 into several pandas dataframes?
01-16-2020 09:12 AM
import pandas as pd
import glob

path = "s3://somewhere/"  # use your path
all_files = glob.glob(path + "/*.csv")
print(all_files)

li = []
for filename in all_files:
    dfi = pd.read_csv(filename, names=["acct_id", "SOR_ID"], dtype={"acct_id": str, "SOR_ID": str}, header=None)
    li.append(dfi)
I can read a single file if I pass its path directly, but glob is not working here: all_files always comes back as an empty list []. How do I get the list of filenames in the folder as an array?
Labels:
- Read from s3
1 REPLY
01-26-2020 10:03 PM
Hi @zhaoxuan210,
glob only searches the local filesystem, so it returns an empty list for s3:// paths. To list the files in an S3 bucket you need an S3 client such as boto3. Please go through the answer below: https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3
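Here is a minimal sketch of that approach. The bucket name "somewhere" is taken from the path in your question, the empty prefix is a placeholder, and it assumes AWS credentials are already configured for boto3:

import boto3
import pandas as pd

bucket = "somewhere"  # placeholder: your bucket name
prefix = ""           # placeholder: your folder path inside the bucket, e.g. "my/folder/"

s3 = boto3.client("s3")

# list_objects_v2 returns at most 1000 keys per call, so use a paginator
# to walk the full listing and keep only the .csv keys.
paginator = s3.get_paginator("list_objects_v2")
keys = [
    obj["Key"]
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix)
    for obj in page.get("Contents", [])
    if obj["Key"].endswith(".csv")
]
print(keys)

li = []
for key in keys:
    # get_object returns a streaming body, which read_csv accepts directly
    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
    dfi = pd.read_csv(body, names=["acct_id", "SOR_ID"], dtype={"acct_id": str, "SOR_ID": str}, header=None)
    li.append(dfi)

df = pd.concat(li, ignore_index=True)

Alternatively, if the s3fs package is installed, pd.read_csv can read s3:// URLs directly, so you would only need boto3 for listing the keys.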