Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to Apply row_num in DLT

dbhavesh
New Contributor II

Hi all,

How do I use row_number in DLT, or what is the alternative to the row_number function in DLT?

We are looking for the same functionality that row_number provides.

 

Thanks in advance.

3 REPLIES

Takuya-Omi
Valued Contributor II

@dbhavesh 

In DLT, you can get the same behavior by applying the ROW_NUMBER() window function inside your pipeline, using either PySpark or SQL syntax. For example, in SQL:

 

CREATE MATERIALIZED VIEW bronze_dlt AS
SELECT
  *,
  ROW_NUMBER() OVER (ORDER BY column1) AS row_number
FROM
  test_wk.default.source_table
--------------------------
Takuya Omi (尾美拓哉)

dbhavesh
New Contributor II

Hi Takuya-Omi, thanks for your response.

I did try that out, but I'm receiving the error shown in the image below:

[Screenshot of the error message]

Please let me know your thoughts.

 

Thanks in advance!

Takuya-Omi
Valued Contributor II

@dbhavesh 

I apologize for the lack of explanation.

The ROW_NUMBER function requires an ordering over the entire dataset, which makes it a non-time-based window function. When it is applied to streaming data, it raises the "NON_TIME_WINDOW_NOT_SUPPORTED_IN_STREAMING" error.

This issue occurs specifically in DLT streaming tables because they continuously process incoming data. However, in the case of materialized views, data is processed as a snapshot at a given point in time, allowing ordering without triggering this error.
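For illustration, here is a minimal sketch of the kind of definition that triggers the error; it assumes the source is read as a stream, and the table and column names are placeholders:

CREATE OR REFRESH STREAMING TABLE bronze_dlt AS
SELECT
  *,
  -- ROW_NUMBER() needs a global ordering, which a continuous stream cannot provide,
  -- so this raises NON_TIME_WINDOW_NOT_SUPPORTED_IN_STREAMING
  ROW_NUMBER() OVER (ORDER BY column1) AS row_number
FROM
  STREAM(test_wk.default.source_table)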

If you need to generate sequential numbers, consider either:

  1. Using a materialized view instead of a streaming table, or
  2. Defining an IDENTITY column in the table schema, which automatically assigns unique, increasing numbers as rows are inserted (the values may contain gaps); a sketch of this follows below.
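For option 2, a minimal sketch of an IDENTITY column on a DLT streaming table might look like the following. The table and column names are placeholders, and this assumes the table is not the target of APPLY CHANGES INTO, where identity columns are not supported:

CREATE OR REFRESH STREAMING TABLE bronze_dlt (
  -- assigned automatically on insert; values are unique and increasing,
  -- but not guaranteed to be consecutive
  id BIGINT GENERATED ALWAYS AS IDENTITY,
  column1 STRING
)
AS SELECT column1 FROM STREAM(test_wk.default.source_table)

Option 1 corresponds to the materialized view example shown earlier in this thread.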

Databricks Documentation – Identity Columns in Delta Lake

--------------------------
Takuya Omi (尾美拓哉)
