Get a few lines of HDFS data

一整个雨季 2021-02-04 02:17

I have about 2 GB of data in HDFS.

Is it possible to get a few lines of that data at random, the way we would on the Unix command line:

cat iris2.cs         


        
9 Answers
  •  难免孤独
    2021-02-04 03:10

    My suggestion would be to load that data into a Hive table; then you can do something like this:

    SELECT column1, column2 FROM (
        SELECT iris2.column1, iris2.column2, rand() AS r
        FROM iris2
        ORDER BY r
    ) t
    LIMIT 50;
    

    EDIT: This is a simpler version of that query:

    SELECT iris2.column1, iris2.column2
    FROM iris2
    ORDER BY rand()
    LIMIT 50;
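
If loading the file into Hive is not convenient, one alternative (not part of the answer above, just a sketch) is to sample lines from the stream as it is read. The Python function below uses reservoir sampling, which keeps a fixed-size random sample of a stream of unknown length without holding the whole file in memory. The name `reservoir_sample` and its parameters are made up for this illustration.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], k: int,
                     seed: Optional[int] = None) -> List[str]:
    """Return up to k lines chosen uniformly at random from a stream.

    Hypothetical helper for illustration; `seed` is only there to make
    runs repeatable.
    """
    rng = random.Random(seed)
    sample: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            sample.append(line)
        else:
            # Pick a slot in [0, i]; the new line replaces an existing
            # one only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = line
    return sample
```

Assuming the file's contents are piped in with `hdfs dfs -cat`, a small script could call `reservoir_sample(sys.stdin, 50)` and print the result. This still reads the whole file once, but it avoids the full sort that `ORDER BY rand()` performs.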
    
