Databricks

User16783853501 · ‎06-23-2021

Using Spark SQL, or more specifically %sql in a Databricks notebook, is there a way to paginate results, i.e. apply an offset or skip rows?

aladda · ‎06-23-2021

Can you clarify what you are looking for and what your use case is? Are you asking whether there's a preference for using Spark SQL or just direct SQL with %sql, or something else?

sajith_appukutt · ‎06-23-2021

There is no OFFSET support yet. Here are a few possible workarounds:

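If your data is all in one partition (rarely the case 🙂), you could create a column with monotonically_increasing_id and apply filter conditions on it. If there are multiple partitions, the values from monotonically_increasing_id are still unique and increasing, but they won't be consecutive, so filtering on a range of IDs won't map to a fixed number of rows.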
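To see why the IDs stop being consecutive, here is a rough sketch in plain Python (no Spark needed). It assumes the layout described in the Spark documentation for monotonically_increasing_id: the partition index in the upper 31 bits and the row position within the partition in the lower 33 bits. simulate_ids is a hypothetical helper written for this illustration, not a Spark function.

```python
# Sketch of how monotonically_increasing_id lays out its values, assuming the
# documented scheme: partition index in the upper 31 bits, row position
# within the partition in the lower 33 bits.
# simulate_ids is a hypothetical helper, not part of Spark.

def simulate_ids(partition_sizes):
    """Return the IDs that would be assigned for partitions of the given sizes."""
    ids = []
    for partition_index, size in enumerate(partition_sizes):
        base = partition_index << 33  # each partition starts at a multiple of 2**33
        ids.extend(base + row for row in range(size))
    return ids

one_partition = simulate_ids([5])      # all five rows in a single partition
two_partitions = simulate_ids([3, 2])  # the same five rows split 3 + 2

print(one_partition)   # [0, 1, 2, 3, 4]
print(two_partitions)  # [0, 1, 2, 8589934592, 8589934593]
```

With a single partition, a filter such as id >= 10 AND id < 30 behaves like an offset of 10 and a limit of 20. With two or more partitions, the jump to 8589934592 means the same filter would only ever see rows from the first partition.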

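Use except (in your case, the SQL equivalent of the code below). This would be an expensive operation, however.

// Scala: skip the first 10 rows, then take the next 20
val df1 = df.limit(10)    // the rows to skip
val df2 = df.except(df1)  // everything not in df1 (except also drops duplicate rows)
df2.limit(20)             // the next page of 20 rows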
