Using Spark SQL, or specifically %sql in a Databricks notebook, is there a way to paginate results with an offset or skip?
06-23-2021 02:49 PM
Using Spark SQL, or specifically %sql in a Databricks notebook, is there a way to paginate results with an offset or skip?
Labels: Databricks notebook, Spark SQL
2 REPLIES
06-23-2021 04:09 PM
Can you clarify what you are looking for and what your use case is? Are you asking whether there's a preference for using Spark SQL or just direct SQL with %sql, or something else?
06-23-2021 08:54 PM
There is no OFFSET support yet. Here are a few possible workarounds:
- If your data is all in one partition (rarely the case 🙂), you could create a column with monotonically_increasing_id and filter on it. If there are multiple partitions, the values from monotonically_increasing_id won't be consecutive.
- Use except (in your case, the SQL equivalent of the code below). This, however, would be an expensive operation.
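// Scala; `df` is the DataFrame to page through
val df1 = df.limit(10)    // the rows to skip
val df2 = df.except(df1)  // the remaining rows (except also drops duplicates)
df2.limit(20)             // the next 20 rows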
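
Since the question is about %sql, here is a rough SQL sketch of both workarounds. The table name `events` and the ordering column `id` are made up for illustration, so swap in your own. For the first workaround, the sketch uses row_number() rather than monotonically_increasing_id, because row_number() stays consecutive even when the data spans several partitions. The trade-off is that a window with no PARTITION BY sorts all rows in a single partition, which can be slow on large tables.

```sql
-- Assumption: a table `events` with a sortable column `id` (both hypothetical).

-- Workaround 1: number the rows, then filter on the row number.
SELECT *
FROM (
  SELECT *, row_number() OVER (ORDER BY id) AS rn
  FROM events
) AS numbered
WHERE rn > 10 AND rn <= 30;   -- skip 10 rows, return the next 20

-- Workaround 2: SQL form of the except snippet above.
-- EXCEPT also drops duplicate rows, and it is expensive on large tables.
SELECT *
FROM (
  SELECT * FROM events
  EXCEPT
  SELECT * FROM (SELECT * FROM events ORDER BY id LIMIT 10) AS first_page
) AS remaining
ORDER BY id
LIMIT 20;
```

Both queries sort by `id` so that "the first 10 rows" means the same thing on every run. Without an ORDER BY, LIMIT can return a different set of rows each time, and the pages may overlap or skip rows.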

