Attending the Databricks Data + AI Summit to learn all things Databricks: Spark, migrating from Hadoop to Delta Lake, Delta Live Tables, and generative AI.
You can set this per notebook, per workspace, or even at the compute level:

spark.databricks.sql.initial.catalog.name mycatalog

If you add this to your cluster's Spark config, all tools that run on that cluster will default to that catalog.
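As a minimal sketch of the session-level option (assuming a Databricks notebook where spark is predefined, and with mycatalog as a placeholder for your catalog name):

# Sketch: set the default catalog for the current Spark session from a notebook.
# "mycatalog" is a placeholder; because this is the *initial* catalog setting,
# some runtimes only honor it when it is set in the cluster's Spark config,
# so treat the session-level call as an assumption and verify on your cluster.
spark.conf.set("spark.databricks.sql.initial.catalog.name", "mycatalog")

# Confirm what the session resolves to.
print(spark.conf.get("spark.databricks.sql.initial.catalog.name"))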
Might be easier to use the curl command. In a notebook you can run it as a shell command, or use Python, to first load the file into local driver temp storage:

%sh curl https://url.com/file.pdf --output /tmp/file.pdf

or in Python:

import urllib.request
urllib.request.urlretrieve("https://url.com/file.pdf", "/tmp/file.pdf")
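Since that lands the file on the driver's local disk, it disappears when the cluster terminates. One follow-up sketch, assuming you want to keep it and that the DBFS target path below is just a placeholder:

# Sketch: copy the downloaded file from driver-local /tmp into DBFS so it
# persists beyond the cluster. dbutils is available in Databricks notebooks;
# the destination path is a placeholder.
dbutils.fs.cp("file:/tmp/file.pdf", "dbfs:/FileStore/file.pdf")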
And this is the setting to turn on auto optimization. Optimized writes can be enabled at the table or session level using the following settings:

Table setting: delta.autoOptimize.optimizeWrite
SparkSession setting: spark.databricks.delta.optimizeWrite.enabled
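A minimal sketch of both levels (my_schema.my_table is a placeholder table name; check the property names against your Databricks Runtime version):

# Sketch: enable optimized writes for one Delta table (placeholder name).
spark.sql("""
  ALTER TABLE my_schema.my_table
  SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true)
""")

# Session-level equivalent: applies to all Delta writes in this SparkSession.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")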
You can also turn on auto optimized writes for your table or session. This is especially good if your table is partitioned. You should also turn on auto compaction to help control the size of your files. Auto compaction can be enabled at the table or session level.
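A sketch of auto compaction at both levels, mirroring the optimized-writes example above (the table name is a placeholder, and the property and conf names are the ones I recall from the Delta docs, so verify them on your runtime):

# Sketch: enable auto compaction for one Delta table (placeholder name) so
# small files are compacted after writes.
spark.sql("""
  ALTER TABLE my_schema.my_table
  SET TBLPROPERTIES (delta.autoOptimize.autoCompact = true)
""")

# Session-level equivalent for all Delta writes in this SparkSession.
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")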