Scala: How to get a range of rows in a dataframe

礼貌的吻别 2021-02-09 13:56

I have a DataFrame created by running sqlContext.read of a Parquet file.

The DataFrame consists of 300 M rows. I need to use these

1 answer
  • 2021-02-09 14:29

    You can simply use the limit and except APIs of Dataset/DataFrame, as follows:

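    // Java. Assumes df is an existing, non-final Dataset<Row>
    // (imports: org.apache.spark.sql.Dataset, org.apache.spark.sql.Row).
    long count = df.count();
    int limit = 50;
    while (count > 0) {
        Dataset<Row> df1 = df.limit(limit);  // next batch of at most 50 rows
        df1.show(limit);                     // show() alone prints only 20 rows
        // except() is EXCEPT DISTINCT: it also drops duplicate rows from df
        df = df.except(df1);
        count = count - limit;
    }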
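
Since the question asks for a range of rows in Scala, here is a minimal sketch of a different technique from the limit/except loop above: number the rows with RDD.zipWithIndex and filter on that index. The object RowRange, the helper rowRange, and the demo data built with spark.range are hypothetical names introduced only for illustration. It assumes Spark 2.x or later (SparkSession) and that the DataFrame's current row order is the order you want; a DataFrame read from Parquet has no guaranteed order unless you sort it first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RowRange {
  // Hypothetical helper: keeps rows whose 0-based position is in [start, end).
  def rowRange(df: DataFrame, start: Long, end: Long): DataFrame = {
    val slice = df.rdd
      .zipWithIndex()                                  // RDD[(Row, Long)]
      .filter { case (_, idx) => idx >= start && idx < end }
      .map { case (row, _) => row }
    // Rebuild a DataFrame with the original schema.
    df.sparkSession.createDataFrame(slice, df.schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")            // local mode, for the demo only
      .appName("row-range-sketch")
      .getOrCreate()

    // Stand-in data; in the question this would be the Parquet DataFrame.
    val df = spark.range(0, 1000).toDF("id")

    val page = rowRange(df, 100L, 150L)
    page.show(50)                    // print up to 50 rows of the slice

    spark.stop()
  }
}
```

Unlike the loop above, this does not shrink the DataFrame on every iteration and does not drop duplicate rows. The trade-off is that zipWithIndex has to run a Spark job to count the rows in each partition when the RDD has more than one partition, so on 300 M rows it may be worth caching the indexed RDD if several ranges are needed.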
    