Hi,
I attempted to parallelize my Spark read process by setting the default parallelism with spark.conf.set("spark.default.parallelism", "X"). However, despite setting this configuration, sc.defaultParallelism in my notebook still showed 64. Interestingly, each stage of the job also still ran with the default 200 tasks. How can I increase this parallelism, and where is the value 200 coming from? The source is a Python list of S3 paths pointing to JSON-formatted files.
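For context, here is a minimal sketch of what I'm doing. The bucket names, path list, and the aggregation step are illustrative placeholders, not my exact code:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Attempt to raise parallelism before reading ("X" stands for the value I tried)
spark.conf.set("spark.default.parallelism", "256")

# Still prints 64 in my notebook
print(spark.sparkContext.defaultParallelism)

# Source: a Python list of S3 paths to JSON files (names are hypothetical)
s3_paths = [
    "s3://my-bucket/data/part-0001.json",
    "s3://my-bucket/data/part-0002.json",
]

df = spark.read.json(s3_paths)

# A typical downstream step with a shuffle; in the Spark UI the shuffle
# stage still shows 200 tasks regardless of the setting above.
df.groupBy("some_key").count().write.mode("overwrite").parquet("s3://my-bucket/out/")
```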