mark_ott
Databricks Employee

Yes, you can use the Spark scheduler allocation file on Databricks without DBFS, but the options are limited and depend on your environment and access controls. Here are the ones I know of.

Alternatives to DBFS for Scheduler Files

1. Local Driver or Worker Node Files

  • You can place the fairscheduler.xml directly on each cluster node’s local filesystem (e.g., /databricks/driver/init/fairscheduler.xml).

  • Use an init script to distribute the file to these locations at cluster startup.

  • Set your Spark config as:

    spark.scheduler.allocation.file: file:/databricks/driver/init/fairscheduler.xml

    This method is only reliable if an init script places the file at the same path on every node at each cluster start.
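
For reference, the file being distributed is ordinary Spark fair-scheduler XML. Below is a minimal sketch of generating one, written in Python for illustration (a real init script would more likely be a shell script that writes the same content). The pool name `etl`, its settings, and the helper names are assumptions made up for this example, and a temporary directory stands in for /databricks/driver/init.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def build_fair_scheduler_xml(pools):
    """Return fair-scheduler XML text for a dict of pool name -> settings."""
    root = ET.Element("allocations")
    for name, settings in pools.items():
        pool = ET.SubElement(root, "pool", {"name": name})
        # schedulingMode, weight and minShare are the per-pool elements
        # that Spark's fair scheduler reads from the allocation file.
        for key in ("schedulingMode", "weight", "minShare"):
            if key in settings:
                ET.SubElement(pool, key).text = str(settings[key])
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode") + "\n"


def write_fair_scheduler_file(directory, pools, filename="fairscheduler.xml"):
    """Write the allocation file into `directory` and return its full path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_fair_scheduler_xml(pools))
    return path


# Hypothetical pool definition, for illustration only.
EXAMPLE_POOLS = {"etl": {"schedulingMode": "FAIR", "weight": 2, "minShare": 1}}

if __name__ == "__main__":
    # A temporary directory stands in for the real target directory.
    target = os.path.join(tempfile.gettempdir(), "fairscheduler-demo")
    print(write_fair_scheduler_file(target, EXAMPLE_POOLS))
```

Whatever directory you write to, the path in spark.scheduler.allocation.file has to match it exactly.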

2. Classpath Inclusion

  • Package fairscheduler.xml in your application’s JAR, and reference it via classpath:

    spark.scheduler.allocation.file: fairscheduler.xml

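    If the file is present on the classpath, Spark can pick it up. Note that open-source Spark falls back to a classpath resource named fairscheduler.xml when spark.scheduler.allocation.file is not set, so if the bare file name above is not resolved, try leaving the property unset. Either way, this does not work reliably in all Databricks environments, as cluster packaging and bind mounting policies may vary.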
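
A lookup by bare name only finds the file if it sits at the top level of the JAR, not inside a package directory. Here is a small sketch for checking that before you attach the JAR to the cluster; the function name and the JAR path in the usage line are made up for the example.

```python
import zipfile


def jar_has_root_resource(jar_path, resource="fairscheduler.xml"):
    """Return True if `resource` sits at the top level of the JAR."""
    # A JAR is a zip archive; top-level entries have no directory prefix.
    with zipfile.ZipFile(jar_path) as jar:
        return resource in jar.namelist()


if __name__ == "__main__":
    # Hypothetical path, replace with your own build artifact.
    print(jar_has_root_resource("target/my-app.jar"))
```

With Maven or sbt, files under src/main/resources end up at the JAR root, so that is the usual place to put it.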

3. Workspace Filesystem with Disabled WSFS

  • If you are using workspace files (e.g., under /Workspace/init/), disabling WSFS (WSFS_ENABLE=false) may allow fallback to classic filesystem access.

  • However, this often does not resolve the error on newer runtimes unless you also make sure the file is locally accessible to all cluster nodes (either via the classpath or a node path). Community users have reported that disabling WSFS does not always have the intended effect.

4. Unity Catalog Volumes

  • If you have access to Unity Catalog, you may be able to store the file in a Unity Catalog volume and reference it from the Spark config, but this requires setup and may have accessibility limitations for non-tabular files.

What Doesn't Work Reliably

  • Direct Workspace paths (e.g., file:/Workspace/init/fairscheduler.xml) tend to fail without proper WSFS credential setup and cluster config, which is the error you are seeing.

  • Disabling WSFS on some clusters does not force fallback to workspace files, especially on DBR 16.x and newer.
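
To narrow down which of these cases you are hitting, it can help to check from a notebook on the cluster whether the configured path is actually readable as a local file. The sketch below is only a diagnostic under assumptions: it handles plain paths and file: URIs, it runs on whichever node executes it (the driver, for a notebook cell), and `check_allocation_file` is a name invented for this example rather than a Spark or Databricks API.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def check_allocation_file(uri):
    """Return a list of problems found for a file: URI (empty if none)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("", "file"):
        return [f"unsupported scheme for a local check: {parsed.scheme!r}"]
    path = parsed.path
    if not os.path.isfile(path):
        return [f"not found on this node: {path}"]
    if not os.access(path, os.R_OK):
        return [f"not readable by the current user: {path}"]
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    # The scheduler builds its pools from <pool name="..."> elements.
    named_pools = [p for p in root.iter("pool") if p.get("name")]
    if not named_pools:
        return ["no <pool> element with a name attribute found"]
    return []


if __name__ == "__main__":
    # Path taken from the Spark config shown earlier in this answer.
    print(check_allocation_file("file:/databricks/driver/init/fairscheduler.xml"))
```

An empty list means the file is present, readable, and contains at least one named pool on that node; it says nothing about whether Spark itself was able to open it at startup.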
