Hi,
I am facing an issue where one of my jobs has been taking much longer than it used to. Previously it needed less than 1 hour to run a batch job that loads JSON data and does a truncate-and-load into a Delta table, but since June 2nd it takes more than 2 hours (sometimes even 3) to finish.
I'm curious how this can happen, because I haven't changed anything in the code and the data is only growing by around 5% per day. One thing I suspect is that the number of columns (90+) in the data may not suit the columnar approach of a Delta table. Correct me if I'm wrong.
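To put the 5% figure in perspective, here is a quick back-of-the-envelope sketch. It assumes the growth compounds daily, which I have not verified against the actual data volumes, and the function names are just made up for illustration.

```python
import math


def days_to_multiply(daily_growth: float, factor: float) -> float:
    """Days needed for a quantity that grows by `daily_growth` per day
    (compounding) to grow by the given `factor`."""
    return math.log(factor) / math.log(1.0 + daily_growth)


def growth_after(daily_growth: float, days: int) -> float:
    """Total growth factor after `days` days of compounding growth."""
    return (1.0 + daily_growth) ** days


if __name__ == "__main__":
    # Assumption: 5% per day, compounding.
    print(round(days_to_multiply(0.05, 2.0), 1))  # days for the data to double
    print(round(growth_after(0.05, 30), 2))       # growth factor after 30 days
```

If the growth really does compound, the data would double roughly every two weeks, so volume alone could explain a gradual slowdown. It would not obviously explain a sudden jump on one specific date, though.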
I've attached the images: the first 2 show the typical runtime before June 2nd, and the last 2 show the typical runtime since June 2nd.
Please let me know if you have any ideas.
Thank you!