Hive query to quickly find table size (number of rows)

前端 未结 6 1460
遇见更好的自我
遇见更好的自我 2021-01-31 09:20

Is there a Hive query to quickly find table size (i.e. number of rows) without launching a time-consuming MapReduce job? (Which is why I want to avoid COUNT(*).)

6条回答
  •  春和景丽
    2021-01-31 09:45

    Here is the quick command

    ANALYZE TABLE tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] COMPUTE STATISTICS [noscan];
    

    For Example,If table is partitioned

     hive> ANALYZE TABLE ops_bc_log PARTITION(day) COMPUTE STATISTICS noscan;
    

    output is

    Partition logdata.ops_bc_log{day=20140523} stats: [numFiles=37, numRows=26095186, totalSize=654249957, rawDataSize=58080809507]

    Partition logdata.ops_bc_log{day=20140521} stats: [numFiles=30, numRows=21363807, totalSize=564014889, rawDataSize=47556570705]

    Partition logdata.ops_bc_log{day=20140524} stats: [numFiles=35, numRows=25210367, totalSize=631424507, rawDataSize=56083164109]

    Partition logdata.ops_bc_log{day=20140522} stats: [numFiles=37, numRows=26295075, totalSize=657113440, rawDataSize=58496087068]

提交回复
热议问题