02-16-2023 01:54 AM
Hello Community,
I am currently working on populating gold layer tables. The source for these gold layer tables is the silver layer tables. A Spark SQL query runs on the silver layer tables and contains joins between multiple tables.
Example:
select columns
from table1
inner join table2
on join_condition
inner join table3 on join_condition
where clause.
Now my question is: how can I load the data incrementally from this query? I should be able to schedule the pipeline to run every 30 minutes.
Thanks for the help.
Thanks
Venkat
- Labels: GoldLayer
Accepted Solutions
02-16-2023 02:46 AM
Hi @venkat,
You can use a MERGE (upsert) operation in Databricks for the incremental load.
Yes, you can schedule the job to run every 30 minutes using Databricks Workflows.
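To make the suggestion above more concrete, here is a minimal sketch of a watermark-based upsert. It only builds the MERGE statement as a string, which could then be passed to `spark.sql(...)` in a notebook. Everything named here is an assumption for illustration and not from the original post: the table names (`gold.summary`, `silver.table1`, etc.), the key and value columns, the `updated_at` change-timestamp column on each silver table, and the idea that the previous run's watermark is stored somewhere and passed in. The `SCHEDULE` dict shows a Quartz cron expression for a 30-minute interval in the shape the Databricks Jobs schedule setting uses; please check it against your own workspace settings.

```python
from datetime import datetime


def build_merge_sql(target, source_query, key_cols, value_cols,
                    watermark_col, last_watermark):
    """Build a Delta MERGE statement that upserts only the rows of
    source_query whose watermark_col is newer than last_watermark."""
    ts = last_watermark.strftime("%Y-%m-%d %H:%M:%S")
    all_cols = list(key_cols) + list(value_cols)
    # Rows match when every key column is equal.
    on_clause = " AND ".join(f"tgt.{c} = src.{c}" for c in key_cols)
    # Matched rows get their non-key columns overwritten.
    set_clause = ", ".join(f"tgt.{c} = src.{c}" for c in value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"src.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} AS tgt\n"
        f"USING (\n"
        f"  SELECT * FROM ({source_query}) AS q\n"
        f"  WHERE q.{watermark_col} > TIMESTAMP '{ts}'\n"
        f") AS src\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical join over three silver tables. GREATEST picks the latest
# change timestamp, so a change in any joined table marks the row as new.
source_query = (
    "SELECT t1.id, t2.name, t3.amount, "
    "GREATEST(t1.updated_at, t2.updated_at, t3.updated_at) AS updated_at "
    "FROM silver.table1 t1 "
    "INNER JOIN silver.table2 t2 ON t1.id = t2.id "
    "INNER JOIN silver.table3 t3 ON t1.id = t3.id"
)

merge_sql = build_merge_sql(
    target="gold.summary",            # hypothetical gold table
    source_query=source_query,
    key_cols=["id"],                  # hypothetical business key
    value_cols=["name", "amount"],    # hypothetical non-key columns
    watermark_col="updated_at",
    last_watermark=datetime(2023, 2, 16, 1, 30, 0),  # saved by the previous run
)
print(merge_sql)
# In a Databricks notebook you would then run: spark.sql(merge_sql)

# Job schedule fragment: Quartz cron for minute 0 and 30 of every hour.
SCHEDULE = {
    "quartz_cron_expression": "0 0/30 * * * ?",
    "timezone_id": "UTC",
    "pause_status": "UNPAUSED",
}
```

After each successful run, you would save the new maximum `updated_at` value as the watermark for the next run. If the silver tables are Delta tables without a reliable change timestamp, this filter would need to be replaced by some other way of detecting changed rows.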
02-16-2023 03:23 AM
Hi @Ajay Pandey ,
Thanks for your reply,
I will try and let you know.
Thanks
Venkat
02-16-2023 03:41 AM
Sure

02-16-2023 09:03 PM
Hi @bodempudi venkat
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!

