cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

RECURSION_ROW_LIMIT - how to increase more than 1M ?

saurabh_aher
New Contributor III

 

I have usecase where we requires rows more than 1M. buts recursion is limited to 1M. how to increase this limit in Recursive CTE ? 

saurabh_aher_0-1753944326907.png

saurabh_aher_1-1753944347987.png

 

 

8 REPLIES 8

szymon_dybczak
Esteemed Contributor III

Hi @saurabh_aher ,

According to documentation, this limit can be overridden if the SELECT statement driving the recursive CTE includes a LIMIT clause, which effectively controls the recursion. Did you try that?

saurabh_aher
New Contributor III

yes I tried that but its giving syntax error when its done.

saurabh_aher
New Contributor III

saurabh_aher_0-1753946256350.png

example of that. 

szymon_dybczak
Esteemed Contributor III

Hi @saurabh_aher ,

And when you try something like that?

with recursive cte as (
select 1 as n

union all 

select n+1 from cte where n<15

)
select *
from cte
limit 2000000

saurabh_aher
New Contributor III

Its not working then as well.

szymon_dybczak
Esteemed Contributor III

On my side below approach is working.

%sql
WITH RECURSIVE recursive_cte(n) MAX RECURSION LEVEL 200000 AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM recursive_cte WHERE n < 1050000
)
SELECT * FROM recursive_cte
LIMIT 1050000;

But still generating so many rows using recursive call is not the best idea (unless your use case is different and you paste your code just for the sake of example).

But if you want to generate huge number of rows you can take approach suggested by Itzik-Ben-Gan:

Number series generator challenge solutions โ€“ Part 1 - SQLPerformance.com

saurabh_aher
New Contributor III

 

I have a Databricks SQL table with the following columns:

id | name | managerId | rolename

This table contains hierarchical data for all employees in the organization, where each employee has an associated manager (except for the CEO, whose managerId is NULL).

I want to build a new table that flattens the hierarchy, so that for any given employee ID, I can retrieve all of their direct and indirect subordinates, along with each subordinate's level in the hierarchy

the code 

WITH RECURSIVE hierarchy AS (
-- Anchor: start with each employee as their own root (level 0)
SELECT
e.id AS root_id,
e.name AS root_name,
e.id AS emp_id,
e.name AS emp_name,
e.managerId,
0 AS level_id
FROM employees e

UNION ALL

-- Recursive part: find all subordinates
SELECT
h.root_id,
h.root_name,
e.id AS emp_id,
e.name AS emp_name,
e.managerId,
h.level_id + 1 AS level_id
FROM hierarchy h
JOIN employees e
ON e.managerId = h.emp_id
)
SELECT *
FROM hierarchy
ORDER BY root_id, level_id, emp_id;




 

szymon_dybczak
Esteemed Contributor III

Hi @saurabh_aher ,

Look at documentation example. You can't use order by with limit at the same time, because order by prevents early termination of recursion. Load the content of your cte with limit applied to delta table and then you can do whatever you want (I mean order the data according to your needs).

szymon_dybczak_0-1753960845563.png

WITH RECURSIVE hierarchy AS (
-- Anchor: start with each employee as their own root (level 0)
SELECT
e.id AS root_id,
e.name AS root_name,
e.id AS emp_id,
e.name AS emp_name,
e.managerId,
0 AS level_id
FROM employees e

UNION ALL

-- Recursive part: find all subordinates
SELECT
h.root_id,
h.root_name,
e.id AS emp_id,
e.name AS emp_name,
e.managerId,
h.level_id + 1 AS level_id
FROM hierarchy h
JOIN employees e
ON e.managerId = h.emp_id
)
SELECT *
FROM hierarchy
LIMIT 150000 /*Put here your number*/