Here is a code example of what I am doing:
# a list of file paths
list_files_path = ["/dbfs/mnt/...", ..., "/dbfs/mnt/..."]
# copy all the files above into this folder
dest_path = "/dbfs/mnt/..."
for file_path in list_files_path:
    # copy function
    copy_file(file_path, dest_path)
I am running this in Azure Databricks and it works fine, but I am wondering if I can take advantage of the cluster's parallelism.
I know I can run some kind of multi-threading on the driver (master) node, but I am wondering if I can use pandas_udf to take advantage of the worker nodes as well.
Thanks!