11-20-2024 06:27 AM
Hello,
I recently encountered an issue and have not been able to find a solution yet.
I have a job on Databricks that creates a table using dbt (dbt-databricks>=1.0.0,<2.0.0). I set the location_root configuration so that this table is external rather than managed (further dbt config: +materialized: table).
Now I want to apply a row filter on this table. I created the filter function and applied the row filter, which succeeds, but the next dbt run, which recreates the table, then fails.
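The function and filter statement look roughly like this (all names are placeholders, not my actual objects):

-- hypothetical filter: admins see everything, everyone else only EU rows
create or replace function func_name(region string)
return if(is_account_group_member('admins'), true, region = 'EU');

-- attaching the filter to the existing external table succeeds
alter table table_name set row filter func_name on (region);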
11-21-2024 07:06 AM
To recreate the issue:
PS. Good to know: using dbt with the table materialization is equivalent to running "create or replace table table_name"
-- with an explicit location (external, path-based), this triggers the error:
create or replace table table_name using delta location 'path/to/table' with row filter func_name on (column)

-- without the location clause, the statement succeeds:
create or replace table table_name using delta with row filter func_name on (column)

But the table's storage location and table type (external) will remain the same....
This still doesn't make sense to me, but I found a workaround:
Since I am using dbt, I simply added a project variable row_filter, defaulting to false.
I set this variable to true in the Databricks Asset Bundle task: dbt run ..... --vars '{row_filter: true, ...}'. This way the variable can be set for each table separately.
Then I added the following Jinja code at the beginning of my dbt models/xxx.sql model:
{# only set the external location when no row filter will be applied #}
{% if var('row_filter', false) | string | lower == 'false' %}
  {{ config(
      location_root=set_location_root(parameter1, parameter2...)
  ) }}
{% endif %}
* set_location_root() is a self-developed macro
The workaround simply sets the location root when no row filtering is happening; otherwise it skips it.
This does not impact the tables, as they were already created in the right location before the row filter was applied.
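A minimal sketch of what such a model can look like (the post_hook that re-applies the filter, the path, and all names are illustrative placeholders, not my exact setup):

{# models/xxx.sql: skip the external location when a row filter will be used #}
{% if var('row_filter', false) | string | lower == 'false' %}
  {# placeholder path; the real job computes it with the set_location_root() macro #}
  {{ config(materialized='table', location_root='/mnt/lake/tables') }}
{% else %}
  {{ config(
      materialized='table',
      post_hook="alter table {{ this }} set row filter func_name on (column)"
  ) }}
{% endif %}

select * from {{ ref('upstream_model') }}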
I would still appreciate receiving feedback and a better solution to this issue 🙂
Thanks!
05-15-2025 02:04 AM
Hi, did you ever find a resolution for the row filter and external table error? We have a similar case:
the table is external and a row filter is applied. We have a statement that creates the table with the location if it does not exist; it throws an error when the table already exists, so the table is not (re)created.
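For concreteness, the pattern is roughly this (names and path are placeholders):

-- works on the first run; once the table exists with a row filter attached,
-- this errors instead of being a no-op
create table if not exists table_name using delta location 'path/to/table';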
3 weeks ago
This issue occurs because Databricks does not support applying row filters or column masks to external tables when path-based access is used. While you are able to set the row filter policy on your table with no immediate error, the limitation only becomes apparent when an operation such as recreating the table via DBT (which relies on path-based access due to the external table configuration) is attempted. This triggers the error:
INVALID_PARAMETER_VALUE.PATH_BASED_ACCESS_NOT_SUPPORTED_FOR_TABLES_WITH_ROW_COLUMN_ACCESS_POLICIES
Row filters/column masks introduce additional metadata and security management that Databricks expects to be fully controlled. Managed tables provide this capability because the system governs both storage and access, allowing Databricks to enforce row/column-level policies reliably.
External tables, in contrast, are defined with a location_root and rely on underlying storage path-based access. When you apply a row filter to such tables, Databricks cannot guarantee enforcement using only storage-level permissions, leading to a conflict.
The actual creation of the row filter succeeds because Databricks allows attaching access policies to any table initially, but enforcement (and validation) only occurs when the table is accessed in a manner that checks all associated policies, such as reading, recreating, or querying via DBT.
Setting the policy updates table metadata but does not test full compatibility with future access patterns.
Running the DBT job typically triggers a table recreation or re-write, invoking deeper compatibility checks. If path-based access is required (external tables), and a row filter is present, Databricks blocks the operation with the above error.
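In short, the sequence looks like this (names and path are placeholders):

-- step 1 succeeds: the policy is attached to the table metadata
alter table table_name set row filter func_name on (column);

-- step 2, which a DBT rebuild effectively runs, fails with
-- PATH_BASED_ACCESS_NOT_SUPPORTED_FOR_TABLES_WITH_ROW_COLUMN_ACCESS_POLICIES
create or replace table table_name using delta location 'path/to/table' as
select * from source_table;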
Here are your main options:
1. Switch from an external to a managed table so Databricks can safely enforce row filters and column masks.
   - Pros: Fully supported by Databricks; access policies work seamlessly.
   - Cons: You lose some storage flexibility provided by external tables.
2. Drop the row filter or column mask on the table if you must use path-based access.
   - Pros: Allows you to use external tables and DBT normally.
   - Cons: You lose fine-grained access control at the row/column level.
3. Create views on top of your external tables to implement row-level security logic as part of the view definition (see the sketch after this list). This avoids putting access policies directly on the table.
   - Pros: Keeps the external table unchanged and applies row filtering through views.
   - Cons: Slightly less elegant, and policies are not enforced at the table metadata level.
4. If you need more control over storage, consider using managed tables with external locations in Unity Catalog (if available). This blends the benefits of managed enforcement with external storage options.
5. Apply row-level security in your data pipelines or application logic rather than relying on Databricks metadata policies.
   - Pros: Maximal compatibility.
   - Cons: More operational overhead and possible security risks.
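A sketch of option 3 (the mapping table and all names are hypothetical):

-- row-level security in the view definition instead of a table policy
create or replace view table_name_secure as
select *
from table_name
where region in (
    select region
    from region_access_map        -- hypothetical table mapping users to regions
    where user_name = current_user()
);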
| Option | Row Filtering Supported | External Table Allowed | Security Metadata Enforced | Notes |
|---|---|---|---|---|
| Managed Table | Yes | No | Yes | Recommended for policies |
| External Table (no policy) | No | Yes | No | DBT works, no fine-grained policies |
| External Table + Views | Partial | Yes | No | Row filtering is handled in SQL, not metadata |
| Unity Catalog Managed | Yes | Yes (with locations) | Yes | Advanced, may require premium features |
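For the last row of the table, a minimal sketch of what that looks like (the storage URL is a placeholder):

-- managed tables created in this schema live in external cloud storage,
-- while Databricks still governs access, so row filters remain supported
create schema analytics_schema
managed location 'abfss://container@account.dfs.core.windows.net/analytics';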
The error is by design: Databricks prevents path-based access when row/column access policies are set to avoid security loopholes.
For DBT workflows requiring external tables and row/column policies, consider replacing row-access policies with SQL views or restructuring your workflow to use managed tables where possible.