Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Trigger Events in data pipeline

sanjay
Valued Contributor II

Hi,

I am running a data pipeline in Databricks using the medallion architecture. I am seeing inconsistent events from the silver to the gold layer whenever a row in a partition is deleted or updated. Let me explain with an example.

For example, my silver layer is partitioned on department ID and joining date. Suppose 3 employees joined department 1 with a joining date of 01 Oct 2023, so this data is available in the silver layer. If I now update one of these employee records, events are generated for all the data in that partition from silver to gold, i.e. I get all 3 records as changes even though only a single record was updated.

Here is my code

(spark.readStream.format("delta")
    .option("useNotification", "true")
    .option("includeExistingFiles", "true")
    .option("allowOverwrites", True)
    .option("ignoreMissingFiles", True)
    .option("ignoreChanges", "true")
    .option("maxFilesPerTrigger", 100)
    .load(silver_path)
    .writeStream
    .queryName("SilverGoldStream")
    .option("checkpointLocation", gold_checkpoint_path)
    .trigger(once=True)
    .foreachBatch(foreachBatchFunction)
    .start()
    .awaitTermination()
)

Appreciate any help here.

Regards,

Sanjay
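
A possible direction, offered as a sketch rather than a confirmed fix: with ignoreChanges, a Delta streaming source re-emits every row of any data file that was rewritten by an update, so unchanged rows stored in the same file arrive downstream again. Reading the Delta Change Data Feed instead returns row-level changes tagged with a _change_type column. The sketch assumes Change Data Feed is enabled on the silver table (table property delta.enableChangeDataFeed = true); the helper names read_silver_changes and split_changes are hypothetical and not part of the original pipeline.

```python
# Sketch only: assumes Change Data Feed is enabled on the silver table
# (table property delta.enableChangeDataFeed = true).

# Change types that carry the new state of a row. "update_preimage" holds the
# old values of an updated row and "delete" marks a removed row.
UPSERT_CHANGE_TYPES = ("insert", "update_postimage")


def read_silver_changes(spark, silver_path):
    """Return a streaming DataFrame of row-level changes from the silver table."""
    return (
        spark.readStream.format("delta")
        .option("readChangeFeed", "true")  # row-level changes, not rewritten files
        .load(silver_path)
    )


def split_changes(batch_df):
    """Split one micro-batch of change rows into (upserts, deletes)."""
    upserts = batch_df.filter(batch_df["_change_type"].isin(*UPSERT_CHANGE_TYPES))
    deletes = batch_df.filter(batch_df["_change_type"] == "delete")
    return upserts, deletes
```

In a pipeline like the one above, read_silver_changes(spark, silver_path) would stand in for the readStream part, and the foreachBatch function could call split_changes on each micro-batch before writing to gold.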

1 REPLY

sanjay
Valued Contributor II

Thank you Kaniz. 

Further queries on this.

1. If I have nested partitions, e.g. on department and date (finance->09, finance->10), and I update one record in finance->09, will that also trigger updates for partition finance->10?

2. Is it a good idea to have smaller partitions to reduce the impact of updates? What is the maximum number of partitions I can have?

Thanks,

Sanjay
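
On the first question, a toy model may help. It is plain Python, not Delta itself, and it assumes copy-on-write behavior: an update rewrites each data file that contains a matching row, and a stream reading with ignoreChanges sees every row of a rewritten file as new data. Each data file belongs to exactly one partition, so under this assumption files in other partitions are untouched. The Row type, the update_row function, and the sample data are hypothetical.

```python
# Toy model (plain Python, not Delta itself) of copy-on-write updates.
from collections import namedtuple

Row = namedtuple("Row", ["emp_id", "department", "month", "salary"])


def update_row(files, emp_id, new_salary):
    """Rewrite every file containing emp_id; return (new_files, re_emitted_rows).

    files maps a file name to (partition, [rows]).
    """
    new_files, re_emitted = {}, []
    for name, (partition, rows) in files.items():
        if any(r.emp_id == emp_id for r in rows):
            rewritten = [
                r._replace(salary=new_salary) if r.emp_id == emp_id else r
                for r in rows
            ]
            new_files[name + "-v2"] = (partition, rewritten)
            re_emitted.extend(rewritten)  # the whole rewritten file is re-emitted
        else:
            new_files[name] = (partition, rows)  # untouched file, nothing emitted
    return new_files, re_emitted


files = {
    "f1": (("finance", "09"), [Row(1, "finance", "09", 100), Row(2, "finance", "09", 110)]),
    "f2": (("finance", "10"), [Row(3, "finance", "10", 120)]),
}
_, emitted = update_row(files, emp_id=1, new_salary=105)
print([r.emp_id for r in emitted])  # prints [1, 2]
```

In this model, updating employee 1 re-emits employee 2 as well because both rows share a file, while the row in finance->10 is not emitted at all.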
