Hi,

I am facing the same issue, and I chose the All Purpose cluster. In my case, I need some jobs to run every 10 minutes.
- With an All Purpose cluster, it takes 1 minute.
- With a Job cluster, it takes 5 minutes (4 minutes for setting up + 1 minute to perform the job).
Hi,

In the top right corner, there is a circle icon next to the bell. Click on it, then choose 'Profile'. On the Profile page, choose 'Edit' in the top right corner.
Hi,

I think you can follow these steps:

1. Use a window function to create a new column by shifting (lag); your df will then look like this:

id   value           lag
1    A-B-C-D-E-F     null
2    A-B-G-C-D-E-F   A-B-C-D-E-F
3    A-B-G-D-E-F     A-B-G-C-D-E-F
...
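A minimal sketch of that step, assuming the rows are ordered by a column named id (adjust the column names to your data):

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Order rows by id; lag() copies the previous row's value into a new column.
# Without partitionBy this pulls all rows into one partition, which is fine for small data.
w = Window.orderBy("id")
df = df.withColumn("lag", F.lag("value").over(w))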
Hi,

I agree with Werners, try to avoid loops with a PySpark DataFrame. If your dataframe is small, as you said only about 1000 rows, you may consider using Pandas (see the sketch below).

Thanks.
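For example, a minimal sketch, assuming your Spark DataFrame is named df and the column names are just placeholders:

# Collect the small Spark DataFrame to the driver as a Pandas DataFrame,
# then loop over it in plain Python (fine at ~1000 rows).
pdf = df.toPandas()
for _, row in pdf.iterrows():
    print(row["id"], row["value"])  # replace with your row-wise logic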