Monday
Hi Community,
How to read the CDF logs in materialized views created by DLT Pipeline?
Thanks for you time,
Monday - last edited Monday
@jeremy98 just asking to get clarity on the use case and provide a more accurate assistance.
In UC I believe that being these internal table's with ids, you would have to grant yourself access on the __dlt_materialization_schema_<pipeline_id>, please give it a try and let me know.
So, assuming your pipeline_id is "1234_6f26_454c_8e0d_12342134" and your usename username@domain, then:
%sql
GRANT use schema ON catalog __databricks_internal TO `username@domain`;
GRANT use catalog ON catalog __databricks_internal TO `username@domain`;
GRANT use schema ON SCHEMA __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12342134 TO `username@domain`;
GRANT SELECT ON SCHEMA __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12341234 TO `username@domain`;โ
And then I hope you should be able to query the __event_log:
%sql
select * from __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12342134.`__event_log` where message like "%main%"
This access is typically granted temporarily and should be revoked after the debugging session to maintain security. So by default, users, including metastore admins, do not have access to these tables to maintain security and prevent accidental data corruption.
Please try and be very cautious in handling/granting access to it, and as much as possible preserve the access restricted.
Tuesday
@jeremy98 correct, If permissions management is complex, consider using standard Delta tables with CDF enabled and orchestrate changes through Databricks Workflows. This approach simplifies collaboration and avoids issues with restricted internal schema access.
Monday
Hi @jeremy98,
To read the Change Data Feed (CDF) logs in materialized views created by a Delta Live Tables (DLT) pipeline, you can follow these steps:
Enable Change Data Feed: Ensure that the change data feed is enabled on the base tables of the materialized views. You can enable it using the following SQL command:
ALTER TABLE table1 SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
View the Refresh History: You can view the status of refresh operations on a materialized view, including current and past refreshes, by querying the Delta Live Tables event log. Use the following SQL query to retrieve the relevant information:
SELECT *
FROM event_log(TABLE(<fully-qualified-table-name>))
WHERE event_type = 'update_progress'
ORDER BY timestamp DESC;
Replace <fully-qualified-table-name> with the fully qualified name of the materialized view, including the catalog and schema.
Access the Event Log: The event log for a DLT pipeline can be accessed programmatically. Here is an example query to check if a DLT has run a maintenance task:
SELECT
timestamp,
event_type,
details:maintenance_progress.state,
message,
id,
origin.pipeline_name,
origin.cluster_id,
origin.maintenance_id
FROM delta.`dbfs:/pipelines/<pipeline-id>/system/events`
WHERE event_type = 'maintenance_progress';
Replace <pipeline-id> with the actual pipeline ID.
Monday
Hi Alberto,
Thanks for your answer, I got this error:
PERMISSION_DENIED: User does not have permission to access event logs of `catalog`.`02_gold_layer`.`example_table`
how to be much flexible? Because I'm not the owner of the table created
Monday
Hi @jeremy98,
Looks like you don't have access can you navigate through the catalog to ensure you have the right permissions assigned to catalog schema and table?
You should have these permissions:
GRANT SELECT ON TABLE `catalog`.`02_gold_layer`.`example_table` TO `your_user`;
GRANT USAGE ON SCHEMA `catalog`.`02_gold_layer` TO `your_user`;
GRANT USAGE ON CATALOG `catalog` TO `your_user`;
Monday
Hello,
Still I cannot use the event_log table
Monday
What error does it give you and have you ensure the required permissions were given? are you able to see the table while looking at data in the browser explorer?
Monday
Hello Alberto,
Yes, I can see the data in the browser explorer not the event_log table, the error is the one I submitted to you before. Yes, I gave the useful permissions
Monday
@jeremy98 is your use case involving DLT + Unity Catalog? If so, then I believe you'll need to grant yourself access to the internal schemas instead.
Monday - last edited Monday
Hello VZLA,
Yes, my catalog is inside UC! Isn't a best practice? I'm the owner who created the catalog, Is needed to assign a grant privileges?
PERMISSION_DENIED: User does not have permission to access event logs of <table>
Monday - last edited Monday
@jeremy98 just asking to get clarity on the use case and provide a more accurate assistance.
In UC I believe that being these internal table's with ids, you would have to grant yourself access on the __dlt_materialization_schema_<pipeline_id>, please give it a try and let me know.
So, assuming your pipeline_id is "1234_6f26_454c_8e0d_12342134" and your usename username@domain, then:
%sql
GRANT use schema ON catalog __databricks_internal TO `username@domain`;
GRANT use catalog ON catalog __databricks_internal TO `username@domain`;
GRANT use schema ON SCHEMA __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12342134 TO `username@domain`;
GRANT SELECT ON SCHEMA __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12341234 TO `username@domain`;โ
And then I hope you should be able to query the __event_log:
%sql
select * from __databricks_internal.__dlt_materialization_schema_1234_6f26_454c_8e0d_12342134.`__event_log` where message like "%main%"
This access is typically granted temporarily and should be revoked after the debugging session to maintain security. So by default, users, including metastore admins, do not have access to these tables to maintain security and prevent accidental data corruption.
Please try and be very cautious in handling/granting access to it, and as much as possible preserve the access restricted.
Monday
Hello @VZLA ,
Thanks for your clarification, very helpful to understand that my goal here is not easily applicable, also because this dlt pipeline and the related event_log tables need to be handle for multiple person in our team. We need to migrate data from gold layer to postgres db. But, without CDF enabled or basically event_log (access table) is more difficult. Although, I tried to grant me the access but I got:
PERMISSION_DENIED: User does not have MANAGE and USE CATALOG on Catalog '__databricks_internal'.
In your opinion it is better to migrate our idea to use only Delta Tables with CDF enabled and use simply workflows to manage the changes?
Tuesday
@jeremy98 correct, If permissions management is complex, consider using standard Delta tables with CDF enabled and orchestrate changes through Databricks Workflows. This approach simplifies collaboration and avoids issues with restricted internal schema access.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group