Apologies in advance for the soft question, but I'm genuinely struggling with this.
We're a small data science unit just setting up in Databricks. While we do run some intensive ETL and analytics jobs, a non-trivial part of the team's BAU is exploratory, desktop-style analytics. For example, this might involve being sent spreadsheets by other organisations, or downloading odd bits of data from the web, to do small, ad hoc pieces of analysis in Python or R.
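To make that concrete, a typical workflow today looks roughly like the sketch below, reading and writing via the DBFS FUSE mount (the paths, file names, and column names are purely illustrative, not our real layout):

```python
import pandas as pd

# A spreadsheet sent to us by another organisation, uploaded to DBFS
# (path and file name are illustrative only)
df = pd.read_excel("/dbfs/FileStore/adhoc/partner_returns_2024.xlsx")

# Quick, throwaway analysis (columns are made up for the example)
summary = df.groupby("region")["value"].sum().reset_index()

# Persist the result somewhere the rest of the team can pick it up
summary.to_csv("/dbfs/FileStore/adhoc/partner_returns_2024_summary.csv", index=False)
```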
What is the recommended way of organising and persisting files for these workflows? Using the DBFS file system to read from and write to object storage seems like the obvious solution, but the Databricks documentation seems to give mixed messages on this. For example, the following two articles from the docs (article1, article2) both state pretty explicitly, right up front, that:
"Databricks recommends against using DBFS and mounted cloud object storage for most use cases in Unity Catalog-enabled Azure Databricks workspaces.
and
"Mounted data does not work with Unity Catalog, and Databricks recommends migrating away from using mounts and managing data governance with Unity Catalog".
So, what's best practice for such workflows?