08-02-2018 12:09 AM
I have a PySpark DataFrame df containing 4 columns. How can I write this DataFrame to an S3 bucket?
I'm using PyCharm to execute the code. Also, what packages need to be installed?
08-04-2018 04:16 AM
You shouldn't need any extra packages on Databricks. You can mount the S3 bucket to the Databricks cluster:
https://docs.databricks.com/spark/latest/data-sources/aws/amazon-s3.html#mount-aws-s3
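Roughly, the mount approach from that doc looks like this (a minimal sketch to run in a Databricks notebook; the keys, bucket name, and mount name are placeholders you'd replace with your own):

```python
# Placeholders -- substitute your own AWS credentials and names
ACCESS_KEY = "<your-aws-access-key>"
SECRET_KEY = "<your-aws-secret-key>"
ENCODED_SECRET_KEY = SECRET_KEY.replace("/", "%2F")  # escape slashes for the URL
AWS_BUCKET_NAME = "my-bucket"
MOUNT_NAME = "my-bucket"

# Mount the bucket under /mnt so it behaves like regular DBFS storage
dbutils.fs.mount(
    source="s3a://%s:%s@%s" % (ACCESS_KEY, ENCODED_SECRET_KEY, AWS_BUCKET_NAME),
    mount_point="/mnt/%s" % MOUNT_NAME,
)

# Write the DataFrame to the mounted path as Parquet
df.write.mode("overwrite").parquet("/mnt/%s/output/df" % MOUNT_NAME)
```

Once mounted, any path under /mnt/&lt;mount-name&gt; is just a normal path as far as df.write is concerned.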
Or see this tutorial on reading and writing S3 data with Spark directly:
http://www.sparktutorials.net/Reading+and+Writing+S3+Data+with+Apache+Spark
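If you're running Spark locally from PyCharm rather than on a Databricks cluster, the direct route covered in that tutorial is to configure the s3a connector and write straight to an s3a:// path. A minimal sketch, assuming the hadoop-aws jar (and its matching aws-java-sdk) is on your Spark classpath; the credentials and bucket path are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("write-to-s3").getOrCreate()

# Pass AWS credentials to the s3a filesystem connector
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", "<your-aws-access-key>")
hadoop_conf.set("fs.s3a.secret.key", "<your-aws-secret-key>")

# Write the 4-column DataFrame out as CSV files under the given prefix
df.write.mode("overwrite").csv("s3a://my-bucket/output/df", header=True)
```

Note that Spark writes a directory of part files under that prefix, not a single CSV file.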