01-30-2024 07:27 PM
Hi! We have live Google Sheets data that gets updated on an hourly/daily basis, and we want to bring it into Databricks as a live/scheduled connection for further analysis, together with the other tables and views present there. Do you have any suggestions or recommendations on how to approach this?
01-30-2024 08:44 PM
Yes, you can do it; you just need to follow the steps below:
1. Open the Google Sheet from which you want to fetch data.
2. Go to File -> Share -> Publish to web, select the sheet, and choose the .csv format.
3. Click Publish and copy the URL.
4. Use the code below to fetch the live data:
import pandas as pd

# Read the published sheet as CSV directly from the copied URL
pdDF = pd.read_csv("URL which you have copied")

# Convert the pandas DataFrame to a Spark DataFrame
sparkDF = spark.createDataFrame(pdDF)
01-30-2024 09:06 PM
Thanks @Ajay-Pandey! Appreciate your reply. I am new to Databricks, apologies, but I wonder if it's possible to put this live data into a table under a specific catalog and schema, such that the table reflects the live data in the Google Sheet?
02-02-2024 12:27 AM
Yes, you can load the live data into Delta tables, but the data will only be refreshed when you run the related jobs.
For this use case you can also use Databricks Workflows, which will help you schedule the table refresh.
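To illustrate, here is a minimal sketch that combines the two replies above: it reads the published CSV and overwrites a Delta table under a catalog and schema, so each scheduled run reflects the latest sheet contents. The URL, catalog, schema, and table names are placeholders to replace with your own, and it assumes the sheet has been published to the web as described earlier in the thread.

import pandas as pd

# Placeholder values -- replace with your published-sheet URL and your own catalog.schema.table
SHEET_CSV_URL = "URL which you have copied"
TARGET_TABLE = "my_catalog.my_schema.google_sheet_data"

# Read the published CSV and convert it to a Spark DataFrame
pdDF = pd.read_csv(SHEET_CSV_URL)
sparkDF = spark.createDataFrame(pdDF)

# Overwrite the Delta table so every run replaces it with the current sheet contents
(sparkDF.write
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .saveAsTable(TARGET_TABLE))

Scheduling this code as a notebook task in a Databricks Workflow (for example on an hourly trigger) keeps the table roughly in sync with the sheet.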
09-24-2024 01:07 PM
Hi @Ajay-Pandey, I have a similar question. I would like to import the Google Sheet content into Databricks but don't want to expose the content to the public. Is there a way to do it other than publishing the sheet? Thank you!
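One possible alternative (not confirmed in the replies above, so treat it as a sketch) is to avoid publishing entirely: share the sheet with a Google Cloud service account and read it through the Google Sheets API, for example with the gspread library. The credentials path and spreadsheet key below are placeholders, and this assumes gspread is installed on the cluster and the service-account JSON is stored securely (e.g., referenced via a Databricks secret).

import gspread
import pandas as pd

# Placeholder values -- replace with your own credentials location and spreadsheet key
SERVICE_ACCOUNT_JSON = "/dbfs/FileStore/creds/service_account.json"
SPREADSHEET_KEY = "your-spreadsheet-key"

# Authenticate as a service account that the sheet has been shared with (no public publishing needed)
gc = gspread.service_account(filename=SERVICE_ACCOUNT_JSON)
worksheet = gc.open_by_key(SPREADSHEET_KEY).sheet1

# Pull the rows into pandas, then convert to a Spark DataFrame as in the earlier example
pdDF = pd.DataFrame(worksheet.get_all_records())
sparkDF = spark.createDataFrame(pdDF)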