Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Fetching the top 1L (100,000) rows in Databricks SQL

KVNARK
Honored Contributor II

How do I fetch the top 1L rows from a long SQL query? The query returns about 5 GB of data, and I want to check only the first 1L (100,000) rows.


12 REPLIES

Anonymous
Not applicable

By 1L do you mean 100,000? You could do a limit in your SQL query.

youssefmrini
Honored Contributor III

As Josephk stated, you can use LIMIT; it's the easiest solution.

SergeRielau
Valued Contributor
SELECT * FROM t LIMIT 10000;

See the LIMIT clause in the docs.

KVNARK
Honored Contributor II

I'm not asking about a single table. I have a long SQL query that will generate about 5 GB of output, out of which I need to limit the rows to only 1L.

select * from (my long sql query) A order by A.X limit 100000;

This is not working.
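For what it's worth, the query shape itself is valid SQL: a subquery wrapped as a derived table, ordered and limited from the outside. Here is a minimal sketch of that pattern; it uses SQLite rather than Databricks purely so it can run anywhere, and the table `t` and column `x` are made up for illustration.

```python
import sqlite3

# Hypothetical stand-in data; in the thread, the inner SELECT would be
# the "long sql query" producing ~5 GB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(500)])

# Same shape as the query in question: subquery aliased as A,
# ordered by A.x, limited to the first N rows (note the ';' -- the
# trailing ':' in the original post is a syntax error in most engines).
rows = conn.execute(
    "SELECT * FROM (SELECT x, y FROM t WHERE x % 2 = 0) A "
    "ORDER BY A.x LIMIT 100"
).fetchall()
print(len(rows), rows[0])
```

So the pattern returns exactly the first 100 rows by `x`; if the same shape returns nothing on Databricks, the problem is likely elsewhere (the inner query, or the display limit discussed below in the thread).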

SergeRielau
Valued Contributor

Can you clarify "not working", please?

Is there an error of some sort? Which one? Or is it simply "slow" and you expected it to be faster given the LIMIT clause?

KVNARK
Honored Contributor II

I'm not getting any result when I write it as above, but when I write it as below, I get the output.

select * from (my long sql query) A

SergeRielau
Valued Contributor

You get an empty result set? Maybe that should be a support ticket.

Clearly, adding an ORDER BY may make things slower, but it should not change the result set.

Can you try only ORDER BY without LIMIT? Just to see where the problem is.

Also try LIMIT without ORDER BY.
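The isolation test suggested here can be sketched as two query variants run against the same subquery, comparing row counts to see which clause changes the behavior. Again a runnable SQLite stand-in, with a made-up table `t` in place of the long query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(50)])

subquery = "SELECT x FROM t"  # stand-in for the long SQL query

# Variant 1: ORDER BY without LIMIT -- should return every row.
ordered = conn.execute(
    f"SELECT * FROM ({subquery}) A ORDER BY A.x").fetchall()

# Variant 2: LIMIT without ORDER BY -- should return exactly N rows.
limited = conn.execute(
    f"SELECT * FROM ({subquery}) A LIMIT 10").fetchall()

print(len(ordered), len(limited))
```

If one variant returns rows and the other does not, the failing clause is isolated; if both work in isolation, the combination (or the display layer) is the suspect.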

pvignesh92
Honored Contributor

Did you try to see if you get output with a smaller limit count?

Ex: select * from (my long sql query) A order by A.X limit 100;

This can help you understand whether the problem is with the underlying subquery or with the volume you are trying to display.

KVNARK
Honored Contributor II

@Serge Rielau I tried both ways, but it's of no use. It's giving 10,000 rows max, but not more than that.

pvignesh92
Honored Contributor

@KVNARK Hi, I see this question is already answered, but this is something I found. Please check.

By default, Databricks SQL notebooks limit the number of rows displayed to 1,000. However, you can increase this limit to display 1 Lakh (100,000) rows or more by changing the configuration setting spark.databricks.query.displayMaxRows.

To change the configuration setting, follow these steps:

  1. In the Databricks workspace, navigate to the cluster that you're using for your notebook.
  2. Click on the "Advanced Options" tab.
  3. In the "Spark" section, click on the "Edit" button next to "Spark Config".
  4. Add the following configuration setting: spark.databricks.query.displayMaxRows 100000 (or the desired maximum number of rows)
  5. Click "Confirm" to save the configuration change.
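For reference, the Spark Config box takes one space-separated key/value pair per line. A sketch of the entry from step 4, with the value set to match the 1L goal in this thread:

```
spark.databricks.query.displayMaxRows 100000
```

The cluster must be restarted for a Spark Config change to take effect.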

Thanks,

Vignesh

KVNARK
Honored Contributor II

@Vigneshraja Palaniraj - Perfect. Thanks a lot!

Sangram
New Contributor III

Use the below command:

df.show(100000)
