Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

yutaro_ono1_558
by New Contributor II
  • 9455 Views
  • 5 replies
  • 1 kudos

Resolved! How to read data from S3 Access Point by pyspark?

I want to read data from an S3 access point. I successfully accessed the data through the access point using boto3:
s3 = boto3.resource('s3')
ap = s3.Bucket('arn:aws:s3:[region]:[aws account id]:accesspoint/[S3 Access Point name]')
for obj in ap.object...
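The boto3 pattern described in the post can be sketched as below. This is only an illustration of the boto3 side, not an answer to the PySpark question. The helper names `access_point_arn` and `list_keys` are hypothetical, and the use of `objects.all()` is an assumption, since the loop in the post is truncated.

```python
def access_point_arn(region, account_id, name):
    """Build an S3 access point ARN from its parts (hypothetical helper)."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def list_keys(bucket):
    """Return the keys of all objects visible through a boto3 Bucket resource.

    Assumption: the truncated loop in the post iterates the bucket's objects;
    `objects.all()` is one way to do that.
    """
    return [obj.key for obj in bucket.objects.all()]


# Usage (needs boto3 and AWS credentials; the values are placeholders):
# import boto3
# s3 = boto3.resource("s3")
# ap = s3.Bucket(access_point_arn("[region]", "[aws account id]", "[S3 Access Point name]"))
# print(list_keys(ap))
```

boto3 accepts the access point ARN wherever a bucket name is expected, which is why the ARN is passed to `s3.Bucket(...)` here.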

Latest Reply
shrestha-rj
New Contributor II
  • 1 kudos

I'm reaching out to seek assistance as I navigate an issue. Currently, I'm trying to read JSON files from an S3 Multi-Region Access Point using a Databricks notebook. While reading directly from the S3 bucket presents no challenges, I encounter an "j...

4 More Replies
zhaoxuan210
by New Contributor
  • 20762 Views
  • 1 reply
  • 0 kudos

How can I read all the files in a folder on S3 into several pandas dataframes?

import pandas as pd
import glob

path = "s3://somewhere/"  # use your path
all_files = glob.glob(path + "/*.csv")
print(all_files)
li = []
for filename in all_files:
    dfi = pd.read_csv(filename, names=['acct_id', 'SOR_ID'], dtype={'acct_id': str,...

Latest Reply
shyam_9
Valued Contributor
  • 0 kudos

Hi @zhaoxuan210, please go through the answer below: https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3
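For context, `glob.glob` only matches paths on the local filesystem, so it will not list objects under an `s3://` URL. A minimal sketch of a boto3-based alternative follows. It is not necessarily what the linked answer does. The function name `read_csvs_from_s3` is hypothetical, and the bucket name `somewhere` is taken from the question's placeholder path.

```python
import io

import pandas as pd


def read_csvs_from_s3(client, bucket, prefix="", **read_csv_kwargs):
    """Return {key: DataFrame} for every .csv object under bucket/prefix.

    `client` is expected to behave like boto3.client("s3").
    Extra keyword arguments are forwarded to pandas.read_csv.
    """
    frames = {}
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no objects.
        for entry in page.get("Contents", []):
            key = entry["Key"]
            if not key.endswith(".csv"):
                continue
            body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
            frames[key] = pd.read_csv(io.BytesIO(body), **read_csv_kwargs)
    return frames


# Usage (needs boto3 and AWS credentials):
# import boto3
# frames = read_csvs_from_s3(
#     boto3.client("s3"),
#     "somewhere",
#     names=["acct_id", "SOR_ID"],
#     dtype={"acct_id": str},
# )
```

Keeping one DataFrame per key matches the "several pandas dataframes" wording of the question. If a single combined frame is wanted instead, `pd.concat(frames.values())` would be one option.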
