Tharun-Kumar
Databricks Employee

Hi @NithinTiruveedh, an alternative way to achieve this is to use the NTILE() window function.

For your use case, you can apply ntile(5) over an ordered window (it needs an ORDER BY). It assigns each row a group number from 1 to 5, which splits your dataset of 5M rows into 5 groups of 1M rows each.
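
Here is a minimal sketch of the idea on 10 rows instead of 5M. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere (this assumes a SQLite build of 3.25 or newer, which is when window functions were added). The table name `events` and column `id` are hypothetical, so substitute your own table and ordering column. The `NTILE(5) OVER (ORDER BY ...)` expression is standard SQL, so the same expression should carry over to Spark SQL.

```python
import sqlite3

# In-memory SQLite database as a small stand-in for the real table.
# "events" and "id" are hypothetical names used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO events (id) VALUES (?)", [(i,) for i in range(1, 11)])

# NTILE(5) orders the rows by id and assigns each row a group number from 1 to 5.
rows = conn.execute(
    "SELECT id, NTILE(5) OVER (ORDER BY id) AS bucket FROM events ORDER BY id"
).fetchall()

# Count how many rows landed in each group.
bucket_sizes = {}
for _id, bucket in rows:
    bucket_sizes[bucket] = bucket_sizes.get(bucket, 0) + 1

print(bucket_sizes)
```

You can then filter on the group number (for example `WHERE bucket = 1`) to process one group at a time. If the row count is not an exact multiple of 5, the groups differ in size by at most one row, with the extra rows going to the lower-numbered groups.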