Using Spark SQL, or particularly %sql in a Databricks notebook, is there a way to use pagination, offset, or skip? (Data Engineering)

User16783853501 — Wed, 23 Jun 2021 21:49:16 GMT

Using Spark SQL, or particularly %sql in a Databricks notebook, is there a way to do pagination, offset, or skip?

aladda — Wed, 23 Jun 2021 23:09:52 GMT

Can you clarify what you are looking for and what your use case is? Are you asking whether there's a preference between Spark SQL and direct SQL with %sql, or something else?

sajith_appukutt — Thu, 24 Jun 2021 03:54:09 GMT

There is no OFFSET support yet. Here are a few possible workarounds:

If your data is all in one partition (rarely the case 🙂), you could create a column with monotonically_increasing_id and apply filter conditions on it. If there are multiple partitions, the values from monotonically_increasing_id won't be consecutive.
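To see why the IDs stop being consecutive once there is more than one partition, here is a rough plain-Python model of the layout the Spark documentation describes for the current implementation: the partition index goes in the upper 31 bits and the row position within the partition in the lower 33 bits. The function name `modeled_ids` is made up for this illustration, and nothing here calls Spark itself.

```python
# Plain-Python model (not Spark code) of the documented ID layout for
# monotonically_increasing_id(): partition index in the upper 31 bits,
# row position within the partition in the lower 33 bits.
# `modeled_ids` is a hypothetical helper used only for illustration.

def modeled_ids(partition_sizes):
    """Return the modeled IDs per partition, given the row count of each partition."""
    return [
        [(partition_index << 33) + row for row in range(size)]
        for partition_index, size in enumerate(partition_sizes)
    ]

single = modeled_ids([5])    # one partition: IDs run 0..4 with no gaps
multi = modeled_ids([3, 2])  # two partitions: the second one starts at 2**33

print(single)
print(multi)
```

With a single partition, a filter such as `id >= 10 AND id < 30` behaves like skip-and-take. With several partitions the same filter would silently miss rows, because each later partition starts at a multiple of 2**33.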

Use except (in your case, the SQL equivalent of the code below). This would, however, be an expensive operation.

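// Scala: skip the first 10 rows, then take the next 20.
val df1 = df.limit(10)      // the rows to skip
val df2 = df.except(df1)    // everything else (EXCEPT DISTINCT, so duplicate rows are dropped)
val page = df2.limit(20)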
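
For the %sql side, one possible SQL equivalent of the snippet above can be sketched as a small Python helper that builds the query text. The helper name `skip_then_take_sql` and the table name `events` are assumptions for illustration only, and the table name is interpolated as-is, so it should come from trusted input.

```python
# Sketch: build a Spark SQL query that mirrors limit / except / limit.
# `skip_then_take_sql` and the table name "events" are hypothetical.

def skip_then_take_sql(table, skip, take):
    """Return SQL text that skips `skip` rows of `table` and returns up to `take` rows."""
    if skip < 0 or take < 0:
        raise ValueError("skip and take must be non-negative")
    return (
        "SELECT * FROM (\n"
        f"  SELECT * FROM {table}\n"
        "  EXCEPT\n"
        f"  SELECT * FROM (SELECT * FROM {table} LIMIT {int(skip)}) AS skipped\n"
        ") AS remaining\n"
        f"LIMIT {int(take)}"
    )

query = skip_then_take_sql("events", 10, 20)
print(query)

# In a Databricks Python cell, where `spark` is the notebook's session:
# spark.sql(query).show()
```

The printed text can also be pasted into a %sql cell. Two caveats apply to this approach in general: EXCEPT removes duplicate rows, and without an ORDER BY there is no guarantee which rows LIMIT returns, so the "pages" are not stable between runs.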