08-02-2018 12:09 AM
I have a PySpark DataFrame df containing 4 columns. How can I write this DataFrame to an S3 bucket?
I'm using PyCharm to execute the code. Also, what packages need to be installed?
08-04-2018 04:16 AM
You shouldn't need any extra packages. You can mount the S3 bucket to your Databricks cluster:
https://docs.databricks.com/spark/latest/data-sources/aws/amazon-s3.html#mount-aws-s3
or see this tutorial:
http://www.sparktutorials.net/Reading+and+Writing+S3+Data+with+Apache+Spark
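For reference, here's a minimal sketch of the mount-then-write approach from the first link. It assumes you're running in a Databricks notebook where `spark` and `dbutils` already exist; the bucket name, mount point, and output folder below are placeholders you'd replace with your own.

```python
# Hypothetical names -- substitute your own bucket and mount point.
aws_bucket_name = "my-bucket"
mount_name = "my-bucket-mount"

# Mount the S3 bucket once per workspace. Credentials come from the
# cluster's IAM instance profile (or access keys passed via extra_configs).
dbutils.fs.mount("s3a://%s" % aws_bucket_name, "/mnt/%s" % mount_name)

# Write the 4-column DataFrame to the mounted bucket as Parquet
# (df.write.csv(...) works the same way if you need CSV).
df.write.mode("overwrite").parquet("/mnt/%s/output/my_df" % mount_name)
```

If you're running plain PySpark from PyCharm rather than a Databricks notebook, mounting isn't available; in that case you'd set the `fs.s3a.access.key` / `fs.s3a.secret.key` Hadoop options on the SparkContext and write directly to an `s3a://` path, which is roughly what the second link walks through.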