04-05-2025 12:54 PM
Hi Guys,
We're working on the monetization product and are trying to understand how much of our costs come from jobs, DLT pipelines, and all-purpose interactive sessions. We're currently exploring the system.lakeflow.job_task_run_timeline table to get all the task-level details.
Is there a way, based on this table, to tell which records come from DLT pipelines and which come from jobs?
Thanks for your help.
04-05-2025 04:38 PM
Hi ankit001mittal,
How are you doing today? As per my understanding, you're on the right path by using the system.lakeflow.job_task_run_timeline table to track job- and task-level activity. While this table gives you rich info on task runs, it doesn't directly label whether a task came from DLT, a regular job, or an interactive session.
However, you can usually identify DLT runs by checking the job_name or task_name fields: DLT pipelines often include the word "DLT" or the pipeline name itself, and job_cluster_type may show DLT or PIPELINE in some cases. You can also cross-reference with the system.lakehouse.pipeline_events table (if available in your workspace), which contains DLT-specific activity.
Interactive sessions (like notebooks) typically won't show up as scheduled tasks unless they're launched through jobs, so their cost tracking may need to be done via system.billing.usage or workspace-level cost reports.
So yes, while there's no direct flag, you can filter and classify based on naming patterns and job types. Let me know if you'd like help writing a sample query to separate them out!
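For example, here's a minimal sketch of the naming-based classification described above, assuming your team puts "dlt" somewhere in the task names of DLT-related jobs. The '%dlt%' pattern is only a placeholder for whatever convention you actually use, and the column names should be verified against the job_task_run_timeline schema in your workspace:

    -- Tag each task run as DLT or regular based on a naming convention.
    -- The '%dlt%' pattern is a placeholder; adjust it to your team's naming.
    -- Verify column names against your workspace's schema version.
    SELECT
      job_id,
      run_id,
      task_key,
      CASE
        WHEN lower(task_key) LIKE '%dlt%' THEN 'DLT (by naming convention)'
        ELSE 'Regular job'
      END AS run_category,
      period_start_time,
      period_end_time
    FROM system.lakeflow.job_task_run_timeline
    WHERE period_start_time >= date_sub(current_date(), 30);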
Regards,
Brahma
04-06-2025 03:56 AM
Hi @Brahmareddy ,
Thanks for your explanation. Unfortunately, I can't see the pipeline_events table in my workspace. Is there a specific reason why it's not available? Is it region-specific?
Also, is it guaranteed that the word "DLT" is added to all DLT-specific jobs and tasks?
04-06-2025 06:09 AM
Hi Ankit, thanks for your follow-up. The pipeline_events table you're referring to is part of the system tables for Delta Live Tables (DLT), and if it's not showing up in your workspace, it's likely because system tables haven't been enabled yet in your environment. They're not region-restricted, but they do need to be explicitly turned on in each workspace through the admin settings or API. You can check with your workspace admin to enable system tables if you're not seeing them.
As for the naming: no, it's not guaranteed that all DLT jobs or tasks will have "DLT" in the name. That depends on how the pipeline was named during creation. However, many teams do include "DLT" or some identifier in the job or task names for clarity, so if your team follows that convention, it can help you filter them. If not, you might need to identify DLT jobs by their structure; for example, they often run on PIPELINE-type clusters and might show patterns in the job_cluster_type or job_type fields. Let me know if you'd like help writing a query to narrow them down based on those hints!
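If system tables are enabled, a more robust route than name matching is to start from system.billing.usage, which tags each usage record with a billing_origin_product value (for example JOBS or DLT) and carries job and pipeline IDs in the usage_metadata struct. A rough sketch, with column names to verify against your workspace's schema version:

    -- Aggregate DBUs by originating product over the last 30 days.
    -- billing_origin_product distinguishes JOBS, DLT, INTERACTIVE, etc.;
    -- confirm these columns exist in your workspace before relying on this.
    SELECT
      billing_origin_product,
      usage_metadata.job_id,
      usage_metadata.dlt_pipeline_id,
      SUM(usage_quantity) AS total_dbus
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY billing_origin_product, usage_metadata.job_id, usage_metadata.dlt_pipeline_id
    ORDER BY total_dbus DESC;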
Regards,
Brahma
04-06-2025 06:22 AM
Hi @Brahmareddy ,
Could you please provide a documentation link for the pipeline_events table? I couldn't find any Databricks page that explains this table and how to enable it.
@Brahmareddy wrote: Hi Ankit, thanks for your follow-up. The pipeline_events table you're referring to is part of the system tables for Delta Live Tables (DLT), and if it's not showing up in your workspace, it's likely because system tables haven't been enabled yet in your environment. They're not region-restricted, but they do need to be explicitly turned on in each workspace through the admin settings or API. You can check with your workspace admin to enable system tables if you're not seeing them.
04-06-2025 06:29 AM
Hi ankit001mittal,
Sure! The pipeline_events table is part of Databricks' system tables for Delta Live Tables (DLT), but unfortunately there isn't much official documentation about it yet. To use it, you first need to enable system tables in your DLT pipeline settings: go to your pipeline in the Databricks UI, click "Edit", and under the "Advanced" section toggle on "Enable system table access". Once that's turned on and your pipeline runs, you should be able to query tables like system.lakehouse.pipeline_events. If you don't see the option, it could be due to workspace settings, region limitations, or account permissions, so it might be worth checking with your Databricks admin. Let me know if you want a sample query or other ways to track pipeline activity in the meantime!
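In case it helps once the table is visible, here's a rough sample query. The table name follows the reply above, and the columns mirror the DLT event log schema (timestamp, level, event_type, origin.pipeline_name); both are assumptions to verify in your workspace before relying on this:

    -- List recent error-level events per pipeline from the DLT event log.
    -- Both the table name and the columns are assumptions; check them in
    -- your workspace before using this.
    SELECT
      origin.pipeline_name,
      event_type,
      level,
      message,
      timestamp
    FROM system.lakehouse.pipeline_events
    WHERE level = 'ERROR'
      AND timestamp >= date_sub(current_date(), 7)
    ORDER BY timestamp DESC;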