Data Engineering

IOStream.flush Timed Out

dbengineer516
New Contributor III

Hello,

I'm encountering an issue with a Python script/notebook that I developed and run as a daily job in Databricks. It worked perfectly for months, but now it fails consistently. After digging a little deeper, I found that running the notebook the job is connected to produces an "IOStream.flush timed out" warning. The notebook then keeps running indefinitely without performing the remaining operations in the script.

All the script does is call the Databricks Query History API to obtain the previous day's data, walk through each page of results, build a DataFrame from the data, and store it in a table. Typically this completes in 5-10 minutes, even on heavier-volume days, but now it runs for hours and won't fail unless we set timeout limits. I assume it has something to do with memory/resources, but I'm not sure how to resolve it.
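For anyone reading along, here is a minimal sketch of the paging loop described above, written so that it fails fast instead of running for hours. It is not the original script. `fetch_page` is a hypothetical callable that performs one API request, and the response keys `res`, `has_next_page`, and `next_page_token` are assumptions about the response shape that should be checked against the actual API response.

```python
import time
from typing import Callable, Iterator, Optional


def iter_pages(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 1000,
    deadline_seconds: float = 900.0,
) -> Iterator[list]:
    """Yield one list of records per page until no further page is reported.

    fetch_page (hypothetical) takes a page token (None for the first page)
    and returns a dict. The keys "res", "has_next_page" and
    "next_page_token" are assumed, not confirmed by the original post.
    """
    started = time.monotonic()
    token = None
    seen_tokens = set()
    for _ in range(max_pages):
        # Overall guard: stop if the whole walk takes too long.
        if time.monotonic() - started > deadline_seconds:
            raise TimeoutError("pagination exceeded the overall deadline")
        page = fetch_page(token)
        yield page.get("res", [])
        if not page.get("has_next_page"):
            return
        token = page.get("next_page_token")
        # Guard against a loop that keeps returning the same page.
        if token is None or token in seen_tokens:
            raise RuntimeError("pagination did not advance")
        seen_tokens.add(token)
    raise RuntimeError("pagination exceeded max_pages")
```

Inside `fetch_page`, giving the HTTP call its own request timeout is also worth considering, so a single stalled request cannot hang the job. The deadline above is only checked between pages.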

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions

raphaelblg
Databricks Employee

Hello @dbengineer516 

From my research, this looks like an IPython cache error. Your Python REPL may be getting throttled because it is handling too many requests.

Please check: https://github.com/ipython/ipykernel/issues/334

This comment seems to be a possible solution: https://github.com/ipython/ipykernel/issues/334#issuecomment-1357140493
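If the notebook prints a line per page or per record, cutting that output down is a cheap thing to try, on the assumption that heavy stdout traffic contributes to the warning. The sketch below is only an illustration of that idea and is not taken from the linked comment. `ProgressReporter` is a hypothetical helper name.

```python
import sys


class ProgressReporter:
    """Write a progress line only every `every` records, not on every record."""

    def __init__(self, every=100, stream=None):
        self.every = every
        # Default to stdout; a different stream can be passed in for testing.
        self.stream = stream if stream is not None else sys.stdout
        self.count = 0

    def tick(self, n=1):
        """Record n processed items and report when a multiple of `every` is crossed."""
        before = self.count // self.every
        self.count += n
        if self.count // self.every > before:
            self.stream.write(f"processed {self.count} records\n")
```

Calling `reporter.tick(len(page))` once per page in the paging loop keeps the notebook output small while still showing that the job is making progress.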

I hope it helps 🙂

 

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks
