How to implement a Spark SQL pagination query

小蘑菇 2021-01-17 17:38

Does anyone know how to do pagination in a Spark SQL query?

I need to use Spark SQL but don't know how to do pagination.

Tried:

select * from pers         


        
2 Answers
  • 2021-01-17 18:01

    Spark SQL does not support OFFSET as of now. One alternative for paging is to work with DataFrames and use the except method.

    Example: to iterate with a page size of 10, you can do the following:

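        // df is the DataFrame to page through (assumed to be defined earlier)
        DataFrame df1;
        long count = df.count();
        int limit = 10;
        while (count > 0) {
            df1 = df.limit(limit);  // take the next page
            df1.show();             // prints the first 10 rows, then the next 10, and so on
            df = df.except(df1);    // drop the rows just shown from the remainder
            count = count - limit;
        }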
    

    If you want the equivalent of LIMIT 50, 100 (skip 50 rows, then take 100) in one go, you can do the following:

            df1 = df.limit(50);                  // the first 50 rows, to be skipped
            DataFrame df2 = df.except(df1);      // everything except those 50 rows
            DataFrame page = df2.limit(100);     // required result
    

    Hope this helps!

  • 2021-01-17 18:10

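    karthik's answer will fail if there are duplicate rows in the DataFrame. 'except' removes every row of df that also appears in df1 and returns only distinct rows, so duplicates are lost. Pairing each row with its index and filtering on the index avoids this: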

    // Pair each row with its position, then keep rows 10 through 20 (inclusive, zero-based)
    val filteredRdd = df.rdd.zipWithIndex().collect { case (r, i) if i >= 10 && i <= 20 => r }
    val newDf = sqlContext.createDataFrame(filteredRdd, df.schema)
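
    The difference between the two approaches can be sketched without Spark. The block below is a rough model in plain Java collections, not Spark code: PagingModel, exceptDistinct, pagesByExcept and pageByIndex are hypothetical names made up for this illustration. It assumes that except behaves like SQL EXCEPT DISTINCT and that rows arrive in a fixed order, which a real DataFrame does not guarantee without an explicit sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Plain-Java model of the two paging strategies (hypothetical helper, no Spark involved).
class PagingModel {

    // Models except as EXCEPT DISTINCT: rows of `left` that do not occur in
    // `right`, with duplicates collapsed (assumption stated in the lead-in).
    static <T> List<T> exceptDistinct(List<T> left, List<T> right) {
        LinkedHashSet<T> result = new LinkedHashSet<>(left);
        result.removeAll(right);
        return new ArrayList<>(result);
    }

    // Mirrors the limit/except loop: take `limit` rows, then subtract them.
    static <T> List<List<T>> pagesByExcept(List<T> rows, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        List<List<T>> pages = new ArrayList<>();
        List<T> remaining = rows;
        while (!remaining.isEmpty()) {
            int size = Math.min(limit, remaining.size());
            List<T> page = new ArrayList<>(remaining.subList(0, size));
            pages.add(page);
            remaining = exceptDistinct(remaining, page);
        }
        return pages;
    }

    // Mirrors zipWithIndex plus a filter: keep rows whose zero-based
    // position lies in [start, end], both bounds inclusive.
    static <T> List<T> pageByIndex(List<T> rows, long start, long end) {
        List<T> page = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (i >= start && i <= end) {
                page.add(rows.get(i));
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "a", "b", "c", "a");

        // Only four of the five rows come back: the last "a" is lost.
        System.out.println(pagesByExcept(rows, 2)); // [[a, a], [b, c]]

        // Index-based paging returns all five rows across three pages.
        System.out.println(pageByIndex(rows, 0, 1)); // [a, a]
        System.out.println(pageByIndex(rows, 2, 3)); // [b, c]
        System.out.println(pageByIndex(rows, 4, 5)); // [a]
    }
}
```

    In this model the except-based loop silently drops the third "a", while filtering on the position keeps every row. That is the reason to prefer the index when the data may contain duplicates.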
    