a month ago
Hi all, I am trying to read CSV files from a single folder in an S3 bucket. For this particular use case, I do not intend to read from subfolders. I am using the code below; however, it reads all the CSVs in the subfolders as well. How can I avoid that?
I have tried many different versions of the code below with the help of ChatGPT, but none of them seem to work. Any help?
def source_config():
    src_path = BASE_S3_URI.rstrip("/")
    options = {
        "cloudFiles.format": "csv",
        "cloudFiles.schemaLocation": SCHEMA_LOCATION,
        "cloudFiles.inferColumnTypes": "true",
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        "cloudFiles.includeExistingFiles": "true",
        "cloudFiles.useNotifications": "false",
        "pathGlobFilter": "*.csv",
        "header": "true",
        "delimiter": ",",
        "quote": "\"",
        "multiLine": "false",
        # optional (can keep during debugging)
        "badRecordsPath": f"{SCHEMA_LOCATION}/bad_records",
        "columnNameOfCorruptRecord": "_corrupt_record",
        "cloudFiles.rescuedDataColumn": "_rescued_data",
    }
    return src_path, options
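
To make the symptom concrete, here is a minimal plain-Python sketch (it is not Auto Loader code). It assumes that `pathGlobFilter` is matched against the file name only, which is how Hadoop-style glob filters behave as far as I understand. Under that assumption `*.csv` cannot tell a file in the top folder from one in a subfolder. The keys and the helper names `name_only_match` and `direct_children` are made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def name_only_match(keys, pattern):
    """Filter keys by file name only (assumed pathGlobFilter-like behaviour)."""
    return [k for k in keys if fnmatchcase(PurePosixPath(k).name, pattern)]


def direct_children(keys, folder, pattern):
    """Keep keys whose parent is exactly `folder` and whose name matches."""
    base = PurePosixPath(folder)
    return [
        k
        for k in keys
        if PurePosixPath(k).parent == base
        and fnmatchcase(PurePosixPath(k).name, pattern)
    ]


# Hypothetical object keys under one bucket prefix.
keys = [
    "landing/a.csv",
    "landing/b.csv",
    "landing/archive/c.csv",
    "landing/notes.txt",
]

# The name-only filter still lets the subfolder file through.
print(name_only_match(keys, "*.csv"))
# Filtering on the parent folder as well keeps only the top-level files.
print(direct_children(keys, "landing/", "*.csv"))
```

This only illustrates the filtering logic I am after. Whether a glob placed in the load path itself stops at one directory level seems to depend on Auto Loader's globbing mode, so that part needs checking against the Databricks documentation.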