Unity Catalog Volume as Spark checkpoint location
11-24-2023 06:05 AM
Hi,
I tried to set the spark checkpoint location in a notebook to a folder in a Unity Catalog Volume, with the following command:
sc.setCheckpointDir("/Volumes/catalog_name/schema_name/volume_name/folder_name")
Unfortunately I receive the following error: "Py4JJavaError: An error occurred while calling o356.setCheckpointDir. : java.io.IOException: Operation not permitted".
My user has all privileges granted on the volume.
Did anyone face the same issue? Is it possible to use Databricks volumes as storage location for checkpoints?
08-19-2024 04:22 PM
I am facing the same issue on DBR 14.3 and the beta of 15.4.
My cluster is using the "Unrestricted" policy and "Single user" access mode, set to a user who has permission to read and write to the volume. I verified the permissions by writing a small DataFrame directly to my desired checkpoint folder (with .write instead of .setCheckpointDir followed by .checkpoint) and did not get the error. The exception is only raised when setting the volume as Spark's checkpoint directory.
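The permission check described above can be sketched as a plain-filesystem probe (a hypothetical helper, not PySpark; on Databricks, /Volumes paths are exposed through a FUSE mount):

```python
import os
import tempfile

def probe_writable(path: str) -> bool:
    """Try to create and delete a temp file under `path`.

    Mirrors the DataFrame .write test described above: it confirms the
    mount allows ordinary writes, independent of Spark's checkpoint
    machinery. (Hypothetical helper; the path below is an assumption.)
    """
    try:
        fd, name = tempfile.mkstemp(dir=path)
        os.close(fd)
        os.remove(name)
        return True
    except OSError:
        return False

# On a cluster you would call, e.g.:
# probe_writable("/Volumes/catalog_name/schema_name/volume_name/folder_name")
```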
Here is a bit more of the stack trace when calling .setCheckpointDir on a Unity Catalog volume.
java.io.IOException: Operation not permitted
at java.io.UnixFileSystem.canonicalize0(Native Method)
at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:177)
at java.io.File.getCanonicalPath(File.java:626)
at java.io.File.getCanonicalFile(File.java:651)
at org.apache.spark.util.SparkFileUtils.resolveURI(SparkFileUtils.scala:49)
at org.apache.spark.util.SparkFileUtils.resolveURI$(SparkFileUtils.scala:33)
at org.apache.spark.util.Utils$.resolveURI(Utils.scala:105)
...
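Reading the trace, the failure happens before any volume I/O: resolveURI hands a scheme-less path to java.io.File, and it is the canonicalization through the local UnixFileSystem that the /Volumes FUSE mount appears to reject. A minimal stdlib illustration (not PySpark) of the scheme distinction:

```python
from urllib.parse import urlparse

def has_uri_scheme(path: str) -> bool:
    # Spark's SparkFileUtils.resolveURI falls back to java.io.File
    # (i.e. the local filesystem) when the string carries no URI scheme,
    # which is the branch visible in the stack trace above.
    return bool(urlparse(path).scheme)

print(has_uri_scheme("/Volumes/catalog/schema/vol/dir"))  # False: treated as a local file path
print(has_uri_scheme("s3://bucket/path"))                 # True: routed by scheme instead
```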
What storage solution is recommended for setting the cluster checkpoint directory in Databricks, if not Unity Catalog volumes?
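One workaround sometimes suggested (an assumption on my part, not confirmed in this thread) is to pass the volume path with an explicit scheme, so that resolveURI does not route it through the local filesystem. A hypothetical helper:

```python
def with_scheme(path: str, scheme: str = "dbfs") -> str:
    """Prefix a bare path with a URI scheme; leave schemed paths alone.

    Whether dbfs:/Volumes/... is accepted by setCheckpointDir depends on
    the DBR version -- treat this as a sketch, not a confirmed fix.
    """
    head = path.split("/", 1)[0]
    if ":" in head:  # already has a scheme, e.g. s3://... or dbfs:/...
        return path
    return f"{scheme}:{path}"

# On a cluster you would then try, e.g.:
# sc.setCheckpointDir(with_scheme("/Volumes/catalog_name/schema_name/volume_name/folder_name"))
```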
02-19-2025 07:35 PM
Further to this, it also seems impossible to set a checkpoint directory on an external location, even when the principal has write permission on it.
When we try:
spark.sparkContext.setCheckpointDir("s3://bucket/path")
we see:
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied;
(I know it's not a permissions issue, because I can read and write DataFrames to the same path on the same UC cluster.)
We've also tried setting the checkpoint directory through the spark configs like this:
spark.conf.set("spark.checkpoint.dir", "s3://bucket/path")
But we get:
[CANNOT_MODIFY_CONFIG] Cannot modify the value of the Spark config: "spark.checkpoint.dir".
See also 'https://spark.apache.org/docs/latest/sql-migration-guide.html#ddl-statements'. SQLSTATE: 46110
File <command-5849427671817506>, line 1
Both were attempted on DBR 15.4, dedicated cluster.
I am shocked. Is it really not possible to use checkpoints on UC? There must be something I am overlooking.
4 weeks ago - last edited 4 weeks ago
Did you find a solution for the above issue? I am trying the same on DBR 15.4 with a Standard cluster. I am able to set the following configs without error:
spark.conf.set("pyspark.sql.DataFrame.checkpoint", "/Volumes/path/")
spark.conf.set("spark.sql.checkpoint.dir","/Volumes/path/")
spark.conf.set("spark.sql.checkpointLocation","/Volumes/path/")
But it fails when checkpointing a DataFrame:
df.checkpoint(True)
-->Checkpoint directory has not been set in the SparkContext
File <command-8168308127814448>, line 1
----> 1 df.checkpoint(True)
File /databricks/spark/python/pyspark/sql/connect/client/core.py:2149, in SparkConnectClient._handle_rpc_error(self, rpc_error)
2134 raise Exception(
2135 "Python versions in the Spark Connect client and server are different. "
2136 "To execute user-defined functions, client and server should have the "
In this case, can we use localCheckpoint as an alternative? I know local checkpoints are not reliable, since they are stored on the executors using the caching subsystem.
Is it really not possible to use checkpoints on a UC-enabled cluster with DBR 15.4, or is there a new way to checkpoint a DataFrame?
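The error message itself hints at the mechanism: the checkpoint directory is state on the SparkContext (set via sc.setCheckpointDir), not a SQL conf, so spark.conf.set never reaches it; and the traceback above shows a Spark Connect client, where the SparkContext is typically not directly accessible. A toy model (plain Python, not PySpark) of that contract:

```python
class MiniDF:
    """Toy model of the contract behind the error above (not PySpark):
    .checkpoint() requires a directory registered on the context, while
    .local_checkpoint() does not -- at the cost of storing blocks on the
    executors, where they are lost if an executor dies."""

    def __init__(self, checkpoint_dir=None):
        # Models SparkContext state; setting a SQL conf never touches this.
        self.checkpoint_dir = checkpoint_dir

    def checkpoint(self):
        if self.checkpoint_dir is None:
            raise RuntimeError("Checkpoint directory has not been set in the SparkContext")
        return f"reliable checkpoint under {self.checkpoint_dir}"

    def local_checkpoint(self):
        return "executor-local checkpoint (no directory needed)"
```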

