Does anyone how to do pagination in spark sql query?
I need to use spark sql but don\'t know how to do pagination.
Tried:
select * from pers
There is no support for offset as of now in spark sql. One of the alternatives you can use for paging is through DataFrames
using except
method.
Example: You want to iterate with a paging limit of 10, you can do the following:
DataFrame df1;
long count = df.count();
int limit = 10;
while(count > 0){
df1 = df.limit(limit);
df1.show(); //will print 10, next 10, etc rows
df = df.except(df1);
count = count - limit;
}
If you want to do say, LIMIT 50, 100
in the first go, you can do the following:
df1 = df.limit(50);
df2 = df.except(df1);
df2.limit(100); //required result
Hope this helps!
karthik's answer will fail if there are duplicate rows in the dataframe. 'except' will remove all rows in df1 which are in df2 .
val filteredRdd = df.rdd.zipWithIndex().collect { case (r, i) if 10 >= start && i <=20 => r }
val newDf = sqlContext.createDataFrame(filteredRdd, df.schema)