cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

sensanjoy
by Contributor
  • 6645 Views
  • 5 replies
  • 1 kudos

Resolved! Performance issue with pyspark udf function calling rest api

Hi All,I am facing some performance issue with one of pyspark udf function that post data to REST API(uses cosmos db backend to store the data).Please find the details below: # The spark dataframe(df) contains near about 30-40k data. # I am using pyt...

  • 6645 Views
  • 5 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sanjoy Sen​ Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedback w...

  • 1 kudos
4 More Replies
sanjay
by Valued Contributor II
  • 5145 Views
  • 4 replies
  • 4 kudos

Resolved! PySpark UDF is taking long to process

Hi,I have UDF which runs for each spark dataframe row, does some complex processing and return string output. But it takes very long if data is 15000 rows. I have configured cluster with autoscaling, but its not spinning more servers.Please suggest h...

  • 5145 Views
  • 4 replies
  • 4 kudos
Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @Sanjay Jain​ ​​, We haven't heard from you since the last response from @Lakshay Goel​, @rishabh and @Vigneshraja Palaniraj​​, and I was checking back to see if their suggestions helped you.Or else, If you have any solution, please share it with ...

  • 4 kudos
3 More Replies
Labels