Databricks Community

de-qrosh · 11-03-2024

Hello, in the past I used rdd.mapPartitions(lambda ...) to call functions that access third-party APIs, such as Azure AI Translator (text translation), so that the API is called in batches and the batched data is returned. How would one do this now?
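For context, the batching pattern described above can be sketched roughly as follows. This is only an illustration: `fake_translate` is a hypothetical stand-in for the real batched API call, and the batch size of 25 is an arbitrary assumption, not a documented limit of any service.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def fake_translate(texts: List[str]) -> List[str]:
    # Hypothetical stand-in for one batched call to a third-party API.
    # A real implementation would send `texts` in a single request.
    return [t.upper() for t in texts]


def translate_partition(
    rows: Iterable[str],
    call_api: Callable[[List[str]], List[str]] = fake_translate,
    batch_size: int = 25,  # assumed value, tune to the API's limits
) -> Iterator[Tuple[str, str]]:
    """Consume one partition lazily and call the API once per batch."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        # Pair each input with its result, preserving order.
        yield from zip(batch, call_api(batch))
```

Where the RDD API is available, this would be passed as `rdd.mapPartitions(translate_partition)`. Where it is not, `DataFrame.mapInPandas` also hands the function an iterator per partition, so the same batching idea should carry over, although that wiring is not shown here.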

de-qrosh · 09-27-2024

Hi, as you have many files, I have a suggestion: do not use Spark to read them all in at once, as it will slow down greatly. Instead, use boto3 for the file listing, distribute the list across the cluster, and again use boto3 to fetch the files and compact...
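A minimal sketch of the listing and fetching steps, assuming the files live in S3 (the post only names boto3). The function names are hypothetical, and the compaction step is left out because the excerpt is cut off before it is described.

```python
from typing import Iterable, Iterator, List, Tuple


def list_keys(bucket: str, prefix: str = "") -> List[str]:
    """List object keys on the driver with boto3 instead of Spark."""
    import boto3  # imported lazily so the helpers below work without it

    client = boto3.client("s3")
    paginator = client.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def chunk(keys: List[str], n_chunks: int) -> List[List[str]]:
    """Split the key list round-robin into n_chunks work units."""
    n_chunks = max(1, n_chunks)
    return [keys[i::n_chunks] for i in range(n_chunks)]


def fetch_chunk(bucket: str, keys: Iterable[str]) -> Iterator[Tuple[str, bytes]]:
    """Runs on a worker: one client per chunk, one GET per key."""
    import boto3

    client = boto3.client("s3")
    for key in keys:
        body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
        yield key, body
```

Each chunk would then be handed to one task on the cluster, so the driver only ever holds the key list and not the file contents.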

Re: Cannot use RDD and cannot set "spark.databricks.pyspark.enablePy4JSecurity false" for

Re: Volume Limitations