09-06-2024 02:23 AM
I need a DLT pipeline that creates a materialized view for fetching event logs. All of the approaches below failed:
- Attaching a notebook with pure SQL inside: failed; magic cells like `%sql` don't work either.
- Attaching a notebook with `spark.sql` Python code: failed because I cannot use the `CREATE VIEW` statement there.
- Attaching a file in `.sql` format: failed because the SQL format is not supported yet.
I followed the instructions here to query the event log: https://docs.databricks.com/en/delta-live-tables/observability.html#monitor-delta-live-tables-pipeli...
How can I achieve this with a DLT pipeline?
Accepted Solutions
09-06-2024 08:05 AM
Hi @guangyi ,
As @szymon_dybczak mentioned, .py and .sql formats are supported. The first two options you listed are not: you cannot use SQL in .py files, but SQL statements are supported when a .sql file is used.
Check the docs:
https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/tutorial-pipelines#sql-example
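For illustration, here is a minimal sketch of what such a .sql source file could contain, assuming a Unity Catalog pipeline and the event_log() table-valued function; <pipeline-id> is a placeholder for the pipeline whose log you want to query, and the view names are made up:

-- Sketch of a DLT SQL source file (e.g. query_data_quality_event_log.sql)
-- Materialize the raw event log of another pipeline
CREATE OR REFRESH MATERIALIZED VIEW event_log_raw AS
SELECT * FROM event_log("<pipeline-id>");

-- Optionally keep only flow_progress events, which carry the data quality metrics
CREATE OR REFRESH MATERIALIZED VIEW data_quality_events AS
SELECT timestamp, origin.flow_name, details
FROM event_log("<pipeline-id>")
WHERE event_type = 'flow_progress';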
09-06-2024 02:39 AM
Hi @guangyi ,
It should be sufficient to attach a notebook with SQL set as the default language. What error did you get?
09-08-2024 06:34 PM
Hi @szymon_dybczak @filipniziol
Here is how I implemented the 3rd option and how it failed:
I created a JSON file with the DLT pipeline definition inside:
{
"name": "query_data_quality_event_log_pipeline",
"clusters": [
{
"label": "default",
"spark_conf": {
"spark.databricks.acl.needAdminPermissionToViewLogs": "false"
},
"policy_id": "xxxxxx",
"autoscale": {
"min_workers": 1,
"max_workers": 2,
"mode": "ENHANCED"
}
},
{
"label": "maintenance",
"policy_id": "xxxxxx"
}
],
"development": true,
"continuous": false,
"channel": "PREVIEW",
"edition": "CORE",
"catalog": "xxxxxx",
"target": "xxxxxx",
"libraries": [
{
"notebook": {
"path": "/Workspace/Users/xxx@xxx/query_data_quality_event_log.sql"
}
}
]
}
Then I created the pipeline via the Databricks CLI:
databricks pipelines create --json "$(cat single-dlt.json)" -p PID
The pipeline is created successfully. However, when I click the start button, it shows me this result:
BAD_REQUEST: Failed to load notebook '/Workspace/Users/xxx@xxx/query_data_quality_event_log.sql'. Only SQL and Python notebooks are supported currently.
09-08-2024 07:00 PM
After just finishing my last reply, I realized what's wrong with my code: I should use the "file" property instead of "notebook" in the libraries section.
It works now. Thank you guys, you are my rubber duck!
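For anyone hitting the same error, the corrected libraries section should look roughly like this (same placeholder path as above), with "file" in place of "notebook":

"libraries": [
    {
        "file": {
            "path": "/Workspace/Users/xxx@xxx/query_data_quality_event_log.sql"
        }
    }
]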

