09-10-2021 12:43 PM
Hello everyone.
I have a process on Databricks where I need to upload a CSV file manually every day.
I would like to know whether there is a way to import this data (as a pandas DataFrame in Python, for example) without having to upload the file manually through the UI every day.
Thanks a lot.
09-10-2021 02:39 PM
You can use our DBFS Put REST API endpoint. Note that this uploads the data, but you would still likely need a Databricks job to load it into a table.
You could also use the REST APIs or a Python client for the cloud storage provider you are using.
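A minimal sketch of calling the DBFS Put endpoint from a script, assuming a workspace URL and a personal access token (both placeholders here, not values from this thread). The single-call put accepts base64-encoded contents and is limited to small files (roughly 1 MB); larger uploads need the create/add-block/close streaming endpoints.

```python
import base64


def build_dbfs_put_payload(local_path: str, dbfs_path: str,
                           overwrite: bool = True) -> dict:
    """Build the JSON body for POST /api/2.0/dbfs/put.

    The API requires the file contents to be base64-encoded.
    """
    with open(local_path, "rb") as f:
        contents = base64.b64encode(f.read()).decode("ascii")
    return {"path": dbfs_path, "contents": contents, "overwrite": overwrite}


def upload_csv(host: str, token: str, local_path: str, dbfs_path: str) -> None:
    """Upload a small CSV to DBFS in one call.

    host and token are placeholders, e.g. "https://<workspace>.cloud.databricks.com"
    and a personal access token.
    """
    import requests  # third-party: pip install requests

    resp = requests.post(
        f"{host}/api/2.0/dbfs/put",
        headers={"Authorization": f"Bearer {token}"},
        json=build_dbfs_put_payload(local_path, dbfs_path),
    )
    resp.raise_for_status()
```

Scheduling this script (cron, a CI pipeline, or a Databricks job) replaces the manual daily upload through the UI.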
09-11-2021 01:50 AM
Hi Rodrigo,
If you're using Azure Databricks, you can also try Auto Loader; it will load your data into a Spark DataFrame, which you can then convert to a pandas DataFrame. Be aware that you'll need to set up your workspace with the necessary permissions.
Here is a link with more info: Ingest CSV data with Auto Loader - Azure Databricks - Workspace | Microsoft Docs
Happy learning!
09-13-2021 06:57 AM
Auto Loader is indeed a valid option,
or you could use some kind of ETL tool that fetches the file and puts it somewhere on your cloud provider, such as Azure Data Factory or AWS Glue.