I see that Delta Lake has an OPTIMIZE command and also table properties for Auto Optimize. What are the differences between these and when should I use one over the other?
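For reference, here is a minimal sketch of the two features being compared, assuming a hypothetical Delta table named events on a Databricks runtime (the table name and property values are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Manual, on-demand file compaction of a Delta table.
spark.sql("OPTIMIZE events")

# Auto Optimize, enabled declaratively via table properties.
spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact'   = 'true'
    )
""")
```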
I am running jobs on Databricks using the Runs Submit API with Airflow. I have noticed that, on rare occasions, a single submitted run is executed more than once at the same time. Why does this happen?
For these scenarios, you can use schema evolution capabilities such as mergeSchema, or opt for the new VariantType to avoid requiring a schema at ingest time.
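As a rough sketch of the mergeSchema option (the path /tmp/events and the column names are hypothetical; this assumes a Delta-enabled Spark session):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incoming batch carries a "device" column the target table does not have yet.
incoming = spark.createDataFrame(
    [(1, "click", "mobile")], ["id", "event", "device"]
)

# mergeSchema lets the write evolve the table schema (adding the new
# column) instead of failing on the mismatch.
(incoming.write
    .format("delta")
    .option("mergeSchema", "true")
    .mode("append")
    .save("/tmp/events"))
```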
For this style of ETL, there are two methods.
The first method, strictly for partitioned tables, is Dynamic Partition Overwrites, which requires a Spark configuration to be set and detects which partitions are to be overwritten by scanning the input...
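A minimal sketch of this first method, assuming Delta Lake 2.0+ or a recent Databricks runtime; the path /tmp/sales and the date/amount columns are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# With "dynamic" mode, an overwrite replaces only the partitions present
# in the incoming data; all other partitions are left untouched.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

updates = spark.createDataFrame([("2024-01-02", 42)], ["date", "amount"])

(updates.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("date")
    .save("/tmp/sales"))
```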
At this time, Z-order columns must be specified in the asset definition; the property is pipelines.autoOptimize.zOrderCols. This may change in the future with Predictive Optimization.
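A minimal sketch of how this property attaches to an asset definition in a Delta Live Tables pipeline (the table name, the source view events_raw, and the Z-order columns are hypothetical):

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(
    table_properties={
        # Z-order columns are declared on the asset itself.
        "pipelines.autoOptimize.zOrderCols": "event_date,user_id"
    }
)
def events():
    # "events_raw" stands in for whatever upstream dataset you read from.
    return dlt.read("events_raw").withColumn("ingested_at", F.current_timestamp())
```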
Please try partition discovery for external tables. It should allow you to run the MSCK REPAIR TABLE command successfully and, more importantly, to query external Parquet tables more efficiently.
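Once partition discovery is enabled, the flow looks roughly like this (the table name sales_ext and the partition column sale_date are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Register partitions that exist on storage but are not yet in the metastore.
spark.sql("MSCK REPAIR TABLE sales_ext")

# Partition pruning now kicks in for filters on the partition column.
spark.sql("SELECT * FROM sales_ext WHERE sale_date = '2024-01-02'").show()
```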