Databricks Free Edition Help
Engage in discussions about the Databricks Free Edition within the Databricks Community. Share insights, tips, and best practices for getting started, troubleshooting issues, and maximizing the value of your trial experience to explore Databricks' capabilities effectively.

Unity Catalog Error: PERMISSION_DENIED: Can not move tables across arclight catalogs (Free Edition)

Schusmeicer
New Contributor II

Hi everyone,

I'm trying to set up a Spark Declarative Pipeline (SDP) using a streaming table on Databricks Free Edition, but I'm hitting a persistent initialization error during the pipeline setup.

The Error: UNITY_CATALOG_INITIALIZATION_FAILED: PERMISSION_DENIED: Can not move tables across arclight catalogs. SQL state: 56000

Context & Setup:

Environment: Databricks Free Edition (Community/Free tier).

Source Table: stream.stream_learning.source_data_stream (Streaming Table)

Target Table: stream.stream_learning.processed_data_ingest (Defined via SDP function decorator)

Cluster: 0225-015320-jpn6b927-v2n.

Both the source and the destination are within the same Catalog and Schema (stream.stream_learning).

It seems like the internal process that moves the data from the temporary/staging area to the final Unity Catalog destination is being flagged as a "cross-catalog move," even though everything is logically in the same namespace.

Has anyone encountered this "arclight catalogs" restriction on the Free Tier? Is there a specific configuration required for SDP when both source and sink are in Unity Catalog, or is this a known limitation of the Free Edition's UC implementation?

Any insights would be greatly appreciated!

Data Analyst | Python, PySpark & AWS | MBA in Data Science (USP/Esalq) | Databricks & Data Infrastructure
3 REPLIES

MoJaMa
Databricks Employee

Could you share the code you are using so I can try to reproduce?

Schusmeicer
New Contributor II
The code:

from pyspark import pipelines as dp
from pyspark.sql import functions
from pyspark.sql.functions import current_date
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, FloatType, BooleanType, ArrayType

@dp.table()
def ingest():
    # read the source streaming table and stamp each row with the processing date
    df = spark.read.table('stream.stream_learning.states_stream')
    df = df.withColumn('processado', current_date())
    return df
------------------------------------------------------------------------------------------------------------------------------
The error:

Category: Error
Message: Encountered an error with Unity Catalog while setting up the pipeline on cluster 0225-015320-jpn6b927-v2n.
Ensure that your Unity Catalog configuration is correct, and that required resources (e.g., catalog, schema) exist and are accessible.
Also verify that the cluster has appropriate permissions to access Unity Catalog.

Details: PERMISSION_DENIED: Can not move tables across arclight catalogs
Error class: UNITY_CATALOG_INITIALIZATION_FAILED
SQL state: 56000

---------------------------------------------------------------------------------------------------------
Data Analyst | Python, PySpark & AWS | MBA in Data Science (USP/Esalq) | Databricks & Data Infrastructure

SteveOstrowski
Databricks Employee

Hi @Schusmeicer,

The "Can not move tables across arclight catalogs" error you are seeing is specific to how Unity Catalog is managed on the Databricks Free Edition. "Arclight" is the internal infrastructure name for the Free Edition's managed Unity Catalog environment, and it enforces certain restrictions on how tables and pipeline artifacts can be created and moved within that environment.

When a Lakeflow Spark Declarative Pipeline (SDP) runs, it internally creates temporary staging tables and then moves them into the target catalog and schema. On the Free Edition, this internal move operation can be blocked because the managed catalog infrastructure treats the staging area and your target catalog as separate "arclight catalogs," even though from your perspective everything is in the same catalog and schema.

Here are a few things to check and try:


VERIFY YOUR PIPELINE CONFIGURATION

Make sure your pipeline's default catalog and schema are explicitly set to match where your source and target tables live:

1. Open your pipeline in the Databricks workspace UI.
2. Under the pipeline settings, confirm that the "Default catalog" is set to "stream" and the "Default schema" is set to "stream_learning".
3. Make sure the pipeline was created with Unity Catalog mode (not Hive metastore).

This ensures the pipeline's internal staging operations happen in the correct catalog context.
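
If you prefer to check this programmatically, here is a minimal sketch using the databricks-sdk Python package (assuming it is installed and can authenticate to your workspace; the pipeline ID is a placeholder, and the spec field names reflect my understanding of the current SDK rather than anything specific to the Free Edition):

# Sketch: inspect a pipeline's default catalog and schema with the Databricks SDK
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads credentials from the environment or ~/.databrickscfg
pipeline = w.pipelines.get(pipeline_id="<your-pipeline-id>")  # placeholder ID

# For a Unity Catalog pipeline, spec.catalog and spec.target hold the defaults
print("Default catalog:", pipeline.spec.catalog)
print("Default schema:", pipeline.spec.target)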


USE A FULLY QUALIFIED TABLE NAME IN THE DECORATOR

Instead of relying on the default catalog/schema resolution, try specifying the full name in your @dp.table() decorator:

from pyspark import pipelines as dp
from pyspark.sql.functions import current_date

@dp.table(name="stream.stream_learning.ingest")
def ingest():
    df = spark.read.table("stream.stream_learning.states_stream")
    df = df.withColumn("processado", current_date())
    return df

This can help the pipeline engine resolve the correct destination without ambiguity during the internal staging process.


CONFIRM YOU ARE NOT EXCEEDING THE ONE-PIPELINE LIMIT

The Free Edition allows only one active pipeline per pipeline type. If you have another SDP pipeline that is already active (even in a failed or initializing state), that could cause conflicts. Go to the Pipelines section of your workspace, stop or delete any other active pipelines, and then retry.
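
If it helps to audit this from code, here is a small sketch (same databricks-sdk assumption as above) that lists the pipelines in the workspace and their states, so you can spot a second active one:

# Sketch: list all pipelines and their current states
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
for p in w.pipelines.list_pipelines():
    print(p.pipeline_id, p.name, p.state)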


TRY RECREATING THE PIPELINE FROM SCRATCH

If the above steps do not resolve it, try deleting the current pipeline entirely and creating a new one:

1. Delete the existing pipeline from the Pipelines UI.
2. Create a new pipeline.
3. Set the default catalog to "stream" and default schema to "stream_learning".
4. Attach your notebook with the SDP code.
5. Run the pipeline.

This can help clear any stale internal state that may be contributing to the cross-catalog error.
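
As a rough illustration of steps 1 through 4 via the SDK (a sketch only; the pipeline name and notebook path are placeholders, and the catalog/target parameter names reflect my understanding of the pipelines API):

# Sketch: delete the old pipeline and create a fresh one targeting stream.stream_learning
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.pipelines import PipelineLibrary, NotebookLibrary

w = WorkspaceClient()
w.pipelines.delete(pipeline_id="<old-pipeline-id>")  # step 1: remove the existing pipeline

created = w.pipelines.create(  # steps 2-4: new pipeline with UC defaults and your notebook attached
    name="sdp_states_ingest",  # placeholder name
    catalog="stream",
    target="stream_learning",
    libraries=[PipelineLibrary(notebook=NotebookLibrary(path="/Workspace/Users/<you>/sdp_ingest"))],
    development=True,
)
print("New pipeline id:", created.pipeline_id)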


CHECK FOR EXISTING TABLE CONFLICTS

If a table named "ingest" (or whatever your @dp.table function name resolves to) already exists in the target schema and was not originally created by a pipeline, that can also trigger this type of error. You can check by running:

SHOW TABLES IN stream.stream_learning;

If a conflicting table exists, try either dropping it first or using a different name in your @dp.table() decorator.
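
You can also run this check from a notebook in Python before re-running the pipeline (spark.catalog.tableExists is standard PySpark, and the table name below is just the default resolved from your function name, so adjust it if you renamed the target):

# Sketch: check for a pre-existing table that could conflict with the pipeline's target
# (spark is the SparkSession provided by the Databricks notebook)
if spark.catalog.tableExists("stream.stream_learning.ingest"):
    # DESCRIBE EXTENDED shows the table type and properties, which indicate whether a pipeline owns it
    spark.sql("DESCRIBE TABLE EXTENDED stream.stream_learning.ingest").show(truncate=False)
    # If it turns out to be an unrelated table, either drop it or rename the @dp.table() target:
    # spark.sql("DROP TABLE stream.stream_learning.ingest")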


ADDITIONAL NOTES

The Free Edition documentation lists some constraints relevant to pipelines:
https://docs.databricks.com/en/getting-started/free-edition-limitations.html

For more on configuring Lakeflow Spark Declarative Pipelines with Unity Catalog:
https://docs.databricks.com/en/delta-live-tables/unity-catalog.html
https://docs.databricks.com/en/delta-live-tables/configure-pipeline.html

If none of these steps resolve the issue, it may be worth opening a support request through the Help Center, as the error could relate to an internal state issue with your Free Edition workspace that requires backend attention.

* This reply was drafted with an agent system I built, which researches and drafts responses from the documentation I have available and from previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update the reply when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.