Hi everyone, I am trying to run a Databricks notebook in parallel using ThreadPoolExecutor. Can anyone suggest how to reduce the time taken, based on the findings below so far? Current performance: Time taken - 25 minutes ThreadPoolExecutor max_workers ...
How can I convert the rows of a Spark DataFrame to a list without using pandas? Input Spark DataFrame: Expected output: [['A','B','C'],['1','2','3'],['4','5','6'],['7','8','9']]
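A minimal sketch of one way to build that nested list without pandas. The helper name `rows_to_lists` and the sample data are hypothetical, and casting every value to a string is an assumption based on the expected output above. It relies only on the fact that each row can be iterated like a tuple, so it can be tried without a Spark session:

```python
def rows_to_lists(columns, rows):
    """Return [header, row1, row2, ...] as plain Python lists.

    `columns` is a sequence of column names and `rows` is an iterable of
    tuple-like rows. Values are cast to str (an assumption, chosen to
    match the expected output in the question).
    """
    result = [list(columns)]
    for row in rows:
        result.append([str(value) for value in row])
    return result


# Hypothetical stand-in data; the original input DataFrame is not shown.
sample_columns = ["A", "B", "C"]
sample_rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

table = rows_to_lists(sample_columns, sample_rows)
print(table)
```

With a real Spark DataFrame `df`, the call would presumably be `rows_to_lists(df.columns, df.collect())`, since `collect()` returns `Row` objects that behave like tuples. Note that `collect()` pulls every row onto the driver, so this only suits DataFrames small enough to fit in driver memory.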
Hi @Hubert Dudek, I have a similar requirement: I am trying to query a table in Databricks by passing a parameter from Power BI Report Builder. I have two queries, one of which works and the other does not. Can you help in ident...
Hi @Hubert Dudek, you mentioned that ThreadPoolExecutor will not help. I want to run the same Databricks notebook for 100 different input values, and running them in sequence takes too long to complete. How can I achieve this?
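For reference, the fan-out being discussed can be sketched as follows. This is only an illustration of the pattern, not a confirmed fix: `run_for_all`, `fake_run`, and the `max_workers` value are hypothetical names and settings, and `fake_run` stands in for whatever actually launches one notebook run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_for_all(run_one, input_values, max_workers=8):
    """Call run_one(value) for every input value using a thread pool.

    Returns a dict mapping each input value to its result, or to the
    exception it raised, so one failed run does not hide the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, value): value for value in input_values}
        for future in as_completed(futures):
            value = futures[future]
            try:
                results[value] = future.result()
            except Exception as exc:  # keep going if a single run fails
                results[value] = exc
    return results


def fake_run(value):
    # Hypothetical stand-in for launching one notebook run.
    return f"done-{value}"


outcome = run_for_all(fake_run, range(100))
print(len(outcome))
```

Inside a Databricks notebook, `run_one` would typically wrap `dbutils.notebook.run(path, timeout_seconds, arguments)`, with the notebook path and argument names supplied by you. The threads only wait on the child runs, so any speed-up depends on whether the cluster has spare capacity to execute several runs at once.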
Hi @Leszek, after going through the link that you shared and exploring further, I found that it is best suited for I/O operations. But mine are CPU-bound operations where a lot of computation takes place, and one more thing is that I need to run my note...
Hi Hubert, as you mentioned, it cannot be used for everything. It doesn't suit my case either, as I have a lot of variable declarations, and creating a function for each variable doesn't look good.