We are trying to build an internal use case based on PySpark. The data we have requires a lot of pre-processing. Hence, to cater to that we have used custom Spark ML pipeline stages as some of the transformations that need to be done on our data aren...