Hello,
I'm running into an issue with a Python notebook that runs as a daily job in Databricks. It worked fine for months, but now it fails consistently. Digging a little deeper, when I run the notebook the job is attached to, it throws an "IOStream.flush timed out" warning and then hangs indefinitely without performing the remaining operations in the script.

The script is simple: it calls the Databricks Query History API for the previous day's data, walks through each page of results, builds a data frame from them, and writes it to a table. This used to finish in 5-10 minutes even on heavy-volume days; now it runs for hours and only stops if we set a timeout limit on the job. I suspect it's memory/resource related, but I'm not sure how to diagnose or resolve it.
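For context, the core pagination loop looks roughly like this (a simplified sketch, not my exact code: `fetch_page` stands in for the authenticated `requests.get` call to `GET /api/2.0/sql/history/queries`, and the `max_pages` cap is something I've been considering adding so the loop can't spin forever):

```python
def fetch_all_pages(fetch_page, max_pages=1000):
    """Walk the paginated Query History response.

    `fetch_page(page_token)` is assumed to return the parsed JSON dict
    from one Query History API call; in the real script it is a
    requests.get(..., timeout=...) wrapper using a personal access token.
    """
    rows, token, pages = [], None, 0
    while pages < max_pages:
        payload = fetch_page(token)          # one API page
        rows.extend(payload.get("res", []))  # "res" holds the query records
        pages += 1
        if not payload.get("has_next_page"):
            break
        token = payload.get("next_page_token")
    # The real script then does pd.DataFrame(rows) and writes it to a table.
    return rows
```

The idea behind the explicit `timeout=` on the HTTP call and the page cap is that, if the API stalls or the pagination never terminates, the job fails fast instead of hanging for hours.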
Thank you!