Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Does Lakeflow Connect Have Any Change Tracking Diagnostics?

cvh
New Contributor II

We have set up Change Tracking on multiple SQL Servers for Lakeflow Connect successfully in the past, but lately we are having lots of problems with a couple of servers. The latest utility script has been run and both lakeflowSetupChangeTracking and lakeflowFixPermissions have been executed several times. The database trigger [lakeflowDdlAuditTrigger_1_5] is enabled, CT has been enabled on the database (checked sys.change_tracking_databases) and on the tables we want to ingest (checked sys.change_tracking_tables). We can see the ddl audit table (dbo.[lakeflowDdlAudit_1_5]) is also present, but when we try to run the ingestion pipeline we get ...

INGESTION_GATEWAY_DDL_OBJECTS_MISSING

DDL objects missing on table '<TABLE>'. Execute the DDL objects script and full refresh the table on the Ingestion Pipeline

Reason: - Catalog is not properly configured to capture DDL changes for the provided set of tables. How to fix: - To capture DDL changes for a table with LakeFlow Connect, appropriate set of support objects must be setup on the catalog. Different support objects are needed depending if tables that are replicated are using CT and/or CDC extraction mechanism. - Investigation shows that only CT tables are included in the replication, but corresponding DDL support object do not exist in the catalog.

We also found this in the ingestion gateway error logging, but it still doesn't identify specifically where the problem is:

ddlCaptureNotEnabledTables={<TABLE>= Reason: - Catalog is not properly configured to capture DDL changes for the provided set of tables. How to fix: - To capture DDL changes for a table with LakeFlow Connect, appropriate set of support objects must be setup on the catalog. Different support objects are needed depending if tables that are replicated are using CT and/or CDC extraction mechanism. - Investigation shows that only CT tables are included in the replication, but corresponding DDL support object do not exist in the catalog.

Is there anything that we can run on the SQL Server to give us more information about the specific problem? How does Lakeflow Connect determine that DDL has not been configured? When the documentation has been followed - and yet we still get this error - where to from here? 

4 REPLIES

Ashwin_DSA
Databricks Employee

Hi @cvh,

Having done some research, I have found that for SQL Server CT, Lakeflow expects the following (per database/catalog):

  • Utility objects at version 1.5: dbo.lakeflowUtilityVersion_1_5() = '1.5'.
  • CT DDL objects present in the same database as the CT tables:
    • Table: dbo.lakeflowDdlAudit_1_5
    • Database DDL trigger: lakeflowDdlAuditTrigger_1_5 (enabled)
  • The ingestion user can read from dbo.lakeflowDdlAudit_1_5.
  • The DDL trigger is actually writing rows for CT tables (i.e., DDL is being captured).

If any of these fail, the gateway reports INGESTION_GATEWAY_DDL_OBJECTS_MISSING and logs ddlCaptureNotEnabledTables.
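Most of these can be checked directly on the source database. A sketch like the following should cover the checklist above (run it in the replicated database; the object names are the ones from this thread, and '<ingestion_user>' is a placeholder for your gateway's database user):

```sql
-- Run in the source database that holds the CT tables.

-- 1. Utility objects at version 1.5 (should return '1.5')
SELECT dbo.lakeflowUtilityVersion_1_5() AS utility_version;

-- 2. CT DDL audit table present (NULL means missing)
SELECT OBJECT_ID('dbo.lakeflowDdlAudit_1_5') AS audit_table_object_id;

-- 3. Database DDL trigger present and enabled (is_disabled should be 0)
SELECT name, is_disabled
FROM sys.triggers
WHERE parent_class_desc = 'DATABASE'
  AND name = 'lakeflowDdlAuditTrigger_1_5';

-- 4. Ingestion user can read the audit table
--    (impersonation requires IMPERSONATE permission; run as an admin)
EXECUTE AS USER = '<ingestion_user>';
SELECT TOP (1) * FROM dbo.lakeflowDdlAudit_1_5;
REVERT;
```

If any of these four checks fails, that should narrow down which support object the gateway is complaining about.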

Does your setup meet all the above requirements? Do you know how to check these, or do you need help with that?

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

amirabedhiafi
New Contributor II

I think you need to check 2 more things that are easy to miss.

The ingestion user must have all the required metadata permissions, not just access to the source tables. That includes visibility into sys.change_tracking_tables, sys.change_tracking_databases, sys.objects, and sys.triggers, plus the VIEW DATABASE PERFORMANCE STATE permission.
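One way to see what the ingestion user can actually see is to impersonate it and query the metadata views directly. A sketch, with '<ingestion_user>' as a placeholder for the gateway's database user:

```sql
-- Impersonate the ingestion user and check its effective permissions.
-- (Requires IMPERSONATE permission on that user; run as an admin.)
EXECUTE AS USER = '<ingestion_user>';

-- Database-level permissions held by this user
SELECT permission_name
FROM fn_my_permissions(NULL, 'DATABASE')
WHERE permission_name IN ('VIEW DATABASE STATE', 'VIEW DATABASE PERFORMANCE STATE');

-- Metadata views are filtered by permission: counts of 0 here, when you
-- know CT tables and triggers exist, point to a metadata visibility gap.
SELECT COUNT(*) AS visible_ct_tables FROM sys.change_tracking_tables;
SELECT COUNT(*) AS visible_triggers  FROM sys.triggers;

REVERT;
```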

And the DDL audit trigger must actually be writing rows into dbo.lakeflowDdlAudit_1_5 for CT tables. Databricks validates DDL support, CT/CDC state, utility objects, and permissions separately, so object existence alone is not sufficient.
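To see whether the trigger is writing anything, you can inspect the audit table and, if you want, fire the trigger deliberately. The scratch-table name below is made up for illustration, and note the trigger may only record DDL on CT-enabled tables, so the smoke test may legitimately produce no rows:

```sql
-- Inspect the most recent DDL audit entries
-- (column names are version-specific, so just look at the latest rows).
SELECT TOP (20) * FROM dbo.lakeflowDdlAudit_1_5;

-- Optionally, fire the database DDL trigger with harmless DDL on a scratch table,
-- then re-run the SELECT above and look for new rows.
CREATE TABLE dbo.lakeflow_trigger_smoke_test (id int);
ALTER TABLE dbo.lakeflow_trigger_smoke_test ADD col2 int;
DROP TABLE dbo.lakeflow_trigger_smoke_test;
```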

If you upgraded the utility script, verify that the gateway was stopped first and that the setup procedure was rerun with the same parameters as the original configuration.

Once you fix the issue, don't forget to do a full refresh of the affected table.

If this answer resolves your question, could you please mark it as "Accept as Solution"? It will help other users quickly find the correct fix.

Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP

cvh
New Contributor II

Thanks @Ashwin_DSA , @amirabedhiafi for your swift responses.

I had high hopes when I saw lakeflowUtilityVersion_1_5() is queried, as I found the database user for the ingestion gateway connection (i.e. the @User parameter for both dbo.lakeflowSetupChangeTracking and dbo.lakeflowFixPermissions) DIDN'T have execute permission on that function (clearly an oversight in lakeflowFixPermissions which Databricks ought to address).
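For anyone hitting the same gap, the missing grant was along these lines (the user name is a placeholder for the @User passed to the setup procedures):

```sql
-- Grant the gateway's database user execute on the version function.
GRANT EXECUTE ON OBJECT::dbo.lakeflowUtilityVersion_1_5 TO [<ingestion_user>];
```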

But after granting execute permission we are still getting exactly the same error.

And our setup meets all the other requirements you have mentioned between you.

The one thing that you both mention that doesn't make sense to me is the need for the trigger to "actually be writing rows" to the audit table. I realise that for DDL changes to be picked up this needs to be operational, but surely this can't be necessary for the ingestion to work at all. Isn't it perfectly normal/acceptable for the audit table to be empty? We are attempting to ingest a stable source that hasn't experienced any ALTER TABLE operations yet so the trigger won't have fired. Or am I missing something?

amirabedhiafi
New Contributor II

Hi!

Now I understand better, and you are not missing anything there.

An empty dbo.lakeflowDdlAudit_1_5 table should be perfectly normal on a stable source that has not had any ALTER TABLE activity yet. For CT, the DDL support objects are the audit table plus the database DDL trigger, and the trigger's role is to capture ALTER TABLE events. That makes the audit table a history table for schema changes, not a table that must be prepopulated for ingestion to start. I couldn't find anything in the current docs saying the audit table must already contain rows before a pipeline can run.

So I would correct my earlier point. To summarize: what is required is that the CT DDL support objects exist and are correctly configured; the audit table already having entries is not inherently required.

As for lakeflowUtilityVersion_1_5() lacking execute permission: I think that explains part of the situation, but I would not expect it alone to produce this exact error, because Databricks has a separate error class for source-side permission problems. If the gateway were failing specifically because it could not execute an object, I would expect something closer to INGESTION_GATEWAY_SOURCE_INSUFFICIENT_PERMISSION_FAILURE, not INGESTION_GATEWAY_DDL_OBJECTS_MISSING.

What I think is more likely is one of these:

If this answer resolves your question, could you please mark it as "Accept as Solution"? It will help other users quickly find the correct fix.

Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP