How to Apply row_num in DLT
02-10-2025 09:49 PM
Hi all,
How can I use row_num in DLT, or what is the alternative to the row_num function in DLT?
We are looking for the same functionality that row_num provides.
Thanks in advance.
02-11-2025 12:00 AM
In DLT, you can use the ROW_NUMBER() window function directly in your pipeline code, just as you would in regular SQL. Both PySpark and SQL syntax work.
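As a rough sketch of what that looks like in SQL (the table `orders` and the columns `customer_id` and `order_ts` are made-up names for illustration, not something from your pipeline):

```sql
-- Hypothetical table and columns, for illustration only.
-- Numbers each customer's orders from oldest to newest, starting at 1.
SELECT
  customer_id,
  order_ts,
  ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY order_ts
  ) AS row_num
FROM orders;
```

Dropping the PARTITION BY clause numbers the whole result set in one sequence instead of restarting at 1 per customer.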
Takuya Omi (尾美拓哉)
02-11-2025 07:10 AM
Hi TakuyaOmi, thanks for your response.
I tried that, but I am getting the error shown in the image below:
Please let me know your thoughts.
Thanks in advance!
02-15-2025 09:57 AM
I apologize for not explaining this properly.
The ROW_NUMBER function has to order the entire dataset (or each partition of it), which makes it a non-time-based window function. When it is applied to streaming data, it fails with the "NON_TIME_WINDOW_NOT_SUPPORTED_IN_STREAMING" error.
This happens specifically with DLT streaming tables because they process incoming data incrementally, so the full set of rows is never available for ordering. A materialized view, by contrast, is computed from a snapshot of the data at a given point in time, so the ordering works without triggering this error.
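To make the difference concrete, here is a minimal sketch in DLT SQL. All dataset and column names (`raw_orders`, `orders_numbered_stream`, `orders_numbered`, `customer_id`, `order_ts`) are hypothetical, and I am assuming `raw_orders` is a table your pipeline can read:

```sql
-- Hypothetical names throughout; adjust to your own pipeline.

-- Expected to fail: the source is read as a stream, so ROW_NUMBER()
-- is a non-time window over streaming data.
CREATE OR REFRESH STREAMING TABLE orders_numbered_stream AS
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_ts) AS row_num
FROM STREAM(raw_orders);

-- Expected to work: the materialized view reads a batch snapshot,
-- so the window can order all rows available at refresh time.
CREATE OR REFRESH MATERIALIZED VIEW orders_numbered AS
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_ts) AS row_num
FROM raw_orders;
```

Keep in mind that the numbers in the materialized view are computed from whatever data exists at each refresh, so a given row's row_num can change when new rows sort ahead of it.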
If you need to generate sequential numbers, consider either:
- Using a materialized view instead of a streaming table, or
- Defining an IDENTITY column in the table schema, which automatically assigns unique, increasing numbers when data is inserted. Note that identity values are not guaranteed to be contiguous, so gaps can appear.*
* Databricks Documentation – Identity Columns in Delta Lake
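For the identity column option, something along these lines should be close. This is a sketch I have not run against your pipeline, and the names (`orders_with_id`, `raw_orders`, `id`, `customer_id`, `order_ts`) are hypothetical:

```sql
-- Hypothetical names and columns, for illustration only.
-- The identity column is declared in the schema and left out of the SELECT,
-- so that its values are generated on insert.
CREATE OR REFRESH STREAMING TABLE orders_with_id (
  id BIGINT GENERATED ALWAYS AS IDENTITY,
  customer_id STRING,
  order_ts TIMESTAMP
)
AS SELECT
  customer_id,
  order_ts
FROM STREAM(raw_orders);
```

Unlike ROW_NUMBER, the generated `id` does not follow any ORDER BY and does not restart per group, so it fits best when you only need a unique, increasing key per row.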
Takuya Omi (尾美拓哉)

