Hi @Simha, this is expected behavior. Spark always creates an output directory when writing data, and it divides the result into multiple part files because multiple executor tasks write their partitions into that directory in parallel. There is no way to make Spark write a plain file without creating the output directory.
However, you can control the number of part files written to the output directory with the coalesce function. To get a single part file, use coalesce(1) before the write operation. Choose the coalesce partition count carefully, though: coalesce(1) funnels all the data through a single task, so if the data volume is large, the executor running that task can run out of memory (OOM).