I see that Delta Lake has an OPTIMIZE command and also table properties for Auto Optimize. What are the differences between these and when should I use one over the other?
I am running jobs on Databricks using the Run Submit API with Airflow. I have noticed that, on rare occasions, a particular run is executed more than once at the same time. Why?
Hi @lauraxyz, here is an example using the Databricks SDK in Python:
from databricks.sdk import WorkspaceClient
ws = WorkspaceClient()
image_path = '/Volumes/catalog/schema/volume/filename.jpg'
image_data = (
ws.files.download(image_path) # down...
Hello,
1. Yes, you can pause the job, delete the cluster, upgrade versions of the cluster, etc. With Auto Loader and Structured Streaming, the important thing is making sure that the checkpointLocation stays intact, so no deletions, modifications, or m...
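
To illustrate the point about checkpointLocation, here is a minimal sketch, assuming a PySpark session is passed in as `spark`. The helper name `start_autoloader_stream`, the JSON source format, and all paths and table names are hypothetical placeholders, not taken from the original job. The idea is only that the checkpoint path stays the same across pauses, restarts, and cluster upgrades, so the stream can pick up where it left off.

```python
def start_autoloader_stream(spark, source_path, checkpoint_path, target_table):
    """Start (or resume) an Auto Loader stream that reuses a fixed checkpoint path."""
    df = (
        spark.readStream.format("cloudFiles")  # Auto Loader source
        .option("cloudFiles.format", "json")  # assumed input format
        .option("cloudFiles.schemaLocation", checkpoint_path)  # assumed: schema tracked next to the checkpoint
        .load(source_path)
    )
    return (
        df.writeStream
        # Reusing the same checkpoint path is what lets a restarted job resume.
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)  # process available files, then stop
        .toTable(target_table)
    )
```

Calling this again with the same `checkpoint_path` after a cluster change would be expected to continue from the recorded progress; pointing it at a new or emptied checkpoint directory would start the stream from scratch.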