How do we get the 1000 tables description using hive?

前端 未结 2 1946
甜味超标
甜味超标 2021-01-25 00:27

I have 1000 tables, need to check the describe

; for one by one. Instead of running one by one, can you please give me one command to fetch \"N\"
相关标签:
2条回答
  • 2021-01-25 00:57

    Query the metastore

    Demo

    Hive

    create database my_db_1;
    create database my_db_2;
    create database my_db_3;
    
    create table my_db_1.my_tbl_1 (i int);
    create table my_db_2.my_tbl_2 (c1 string,c2 date,c3 decimal(12,2));
    create table my_db_3.my_tbl_3 (x array<int>,y struct<i:int,j:int,k:int>);
    

    MySQL (Metastore)

    use metastore
    ;
    
    select      d.name              as db_name
               ,t.tbl_name      
               ,c.integer_idx + 1   as col_position
               ,c.column_name
               ,c.type_name
               
    
    from                DBS         as d
    
                join    TBLS        as t
                
                on      t.db_id =
                        d.db_id
                        
                join    SDS         as s
                
                on      s.sd_id =
                        t.sd_id        
                        
                join    COLUMNS_V2  as c
                
                on      c.cd_id =
                        s.cd_id
                    
    where       d.name like 'my\_db\_%'
    
    order by    d.name         
               ,t.tbl_name  
               ,c.integer_idx
    ;
    

    +---------+----------+--------------+-------------+---------------------------+
    | db_name | tbl_name | col_position | column_name |         type_name         |
    +---------+----------+--------------+-------------+---------------------------+
    | my_db_1 | my_tbl_1 |            1 | i           | int                       |
    | my_db_2 | my_tbl_2 |            1 | c1          | string                    |
    | my_db_2 | my_tbl_2 |            2 | c2          | date                      |
    | my_db_2 | my_tbl_2 |            3 | c3          | decimal(12,2)             |
    | my_db_3 | my_tbl_3 |            1 | x           | array<int>                |
    | my_db_3 | my_tbl_3 |            2 | y           | struct<i:int,j:int,k:int> |
    +---------+----------+--------------+-------------+---------------------------+
    
    0 讨论(0)
  • 2021-01-25 01:07

    You can make a shell script and call it with a parameter. For example following script receives schema, prepares list of tables in the schema, calls DESCRIBE EXTENDED command, extracts location, prints table location for first 1000 tables in the schema ordered by name. You can modify and use it as a single command:

    #!/bin/bash
    
    #Create table list for a schema (script parameter)
    HIVE_SCHEMA=$1
    echo Processing Hive schema $HIVE_SCHEMA...
    tablelist=tables_$HIVE_SCHEMA
    
     hive -e " set hive.cli.print.header=false; use $HIVE_SCHEMA; show tables;" 1>  $tablelist
    
    #number of tables
    tableNum_limit=1000
    
    #For each table do:
    for table in $(cat $tablelist|sort|head -n "$tableNum_limit") #add proper sorting
     do 
    
     echo Processing table $table ...
    
         #Call DESCRIBE
         out=$(hive client -S -e "use $HIVE_SCHEMA; DESCRIBE EXTENDED $table")
    
         #Get location for example
         table_location=$(echo "${out}" | egrep -o 'location:[^,]+' | sed 's/location://')
         echo Table location: $table_location
         #Do something else here
    
    done 
    
    0 讨论(0)
提交回复
热议问题