03-03-2023 03:03 AM
How do I fetch the top 1L (100,000) rows from a long SQL query? The query returns 5 GB of data, and I want to check only the first 1L rows.
Labels: Databricks SQL, SQL
Accepted Solutions
03-10-2023 12:54 AM
@KVNARK Hi, I can see this question has already been answered, but this is something I found; please check.
By default, Databricks SQL notebooks limit the number of rows displayed to 1,000. However, you can increase this limit to display more than 1 Lakh (100,000) rows by changing the configuration setting spark.databricks.query.displayMaxRows.
To change the configuration setting, follow these steps:
- In the Databricks workspace, navigate to the cluster that you're using for your notebook.
- Click on the "Advanced Options" tab.
- In the "Spark" section, click on the "Edit" button next to "Spark Config".
- Add the following configuration setting:
- spark.databricks.query.displayMaxRows 10000 (or the desired maximum number of rows)
- Click "Confirm" to save the configuration change.
Thanks,
Vignesh
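As an alternative to editing the cluster's Spark Config through the UI, the same setting could likely be applied at the session level from a SQL cell. This is a sketch; the property name is taken from the post above, and the session-level `SET` approach is an assumption that may not apply to every cluster type:

```sql
-- Sketch: apply the display limit for the current session only,
-- instead of editing the cluster's Spark Config
SET spark.databricks.query.displayMaxRows = 100000;
```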
03-03-2023 04:17 AM
By 1L do you mean 100,000? You could add a LIMIT clause to your SQL query.
03-03-2023 05:21 AM
As Josephk stated, you can use LIMIT; it's the easiest solution.
03-03-2023 07:12 AM
SELECT * FROM t LIMIT 10000;
03-05-2023 08:19 PM
I'm not asking about a single table. I have a long SQL query whose output is 5 GB of data, and I need to limit the result to only 1L rows.
select * from (my long sql query) A order by A.X limit 100000;
This is not working.
03-06-2023 06:51 AM
Can you clarify "not working", please?
Is there an error of some sort? Which one? Or is it simply slow, and you expected it to be faster given the LIMIT clause?
03-06-2023 09:29 PM
I get no result with the query above, but with the query below I do get output.
select * from (my long sql query) A
03-07-2023 11:32 AM
You get an empty result set? Maybe that should be a support ticket.
Clearly, adding an ORDER BY may make things slower, but it should not change the result set.
Can you try ORDER BY only, without LIMIT? Just to see where the problem is.
Also try LIMIT without ORDER BY.
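The two diagnostic variants suggested above might look like this, with the original long query left as a placeholder:

```sql
-- Variant 1: ORDER BY only, no LIMIT
SELECT * FROM (/* my long sql query */) A ORDER BY A.X;

-- Variant 2: LIMIT only, no ORDER BY
SELECT * FROM (/* my long sql query */) A LIMIT 100000;
```

If only one of the two fails, that narrows down whether the sort or the row cap is the culprit.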
03-09-2023 02:03 AM
Did you try checking whether you get output with a smaller limit count?
Ex: select * from (my long sql query) A order by A.X limit 100;
This can help determine whether the problem is the underlying subquery or the volume of data being displayed.
03-09-2023 09:05 PM
@Serge Rielau I tried both ways, but it's no use. It gives 10,000 rows at most, not more than that.
03-10-2023 02:52 AM
@Vigneshraja Palaniraj - Perfect. Thanks a lot!
03-11-2023 07:36 PM
Use the command below:
df.show(100000)

