Adding a comma separated table to Hive

傲寒 2021-01-26 08:38

I have a very basic question: how can I add a very simple table to Hive? My table is saved in a text file (.txt) that is stored in HDFS. I have tried to create an exter

4 Answers
  • 2021-01-26 09:12

    I hope the points below answer the question asked by @mshabeen.

    There are several ways to load data into a Hive table that is created as an external table. When creating the table, you can use the LOCATION option to specify the HDFS, S3 (on AWS), or file location that the data should be read from. Alternatively, after creating the table, you can use the LOAD DATA INPATH option to load data from HDFS, S3, or a file.

    You can also use the ALTER TABLE command to load data into Hive partitions.

    Some details:

    1. Using LOCATION - Used while creating the Hive table. In this case the data is already in place, so it is available in the Hive table as soon as the table is created.

    2. **LOAD DATA INPATH** option - This Hive command loads data from a specified location. The point to remember is that the data gets MOVED from the input path into the table's storage path (the Hive warehouse path, unless the table was created with its own LOCATION). Example - LOAD DATA INPATH 'hdfs://cluster-ip/path/to/data/location/' INTO TABLE table_name

    3. Using the ALTER TABLE command - Mostly used to add data from other locations into Hive partitions. This requires that the partition columns are already defined and that the partition values are known in advance. With dynamic partitions this command is not required. Example - ALTER TABLE table_name ADD PARTITION (date_col='2018-02-21') LOCATION 'hdfs/path/to/location/' This maps the partition to the specified data location (HDFS in this case). However, the data is NOT moved to Hive's internal warehouse location.
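
    The three options above can be sketched together as follows. This is only an illustration: the table name `events`, its columns, and every path are hypothetical and not taken from the question.

    ```sql
    -- Hypothetical table and paths, for illustration only.

    -- 1. LOCATION: the table reads whatever files already sit in the directory.
    CREATE EXTERNAL TABLE events (
        id INT,
        name STRING
    )
    PARTITIONED BY (date_col STRING)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/events';

    -- 2. LOAD DATA INPATH: MOVES the files from the source path
    --    into the directory of the target partition.
    LOAD DATA INPATH '/staging/events/2018-02-21'
    INTO TABLE events PARTITION (date_col = '2018-02-21');

    -- 3. ALTER TABLE ... ADD PARTITION: registers an existing directory
    --    as a partition; the files stay where they are.
    ALTER TABLE events ADD PARTITION (date_col = '2018-02-22')
    LOCATION '/data/other/2018-02-22';
    ```

    After step 2 the source path no longer holds the files, whereas after step 3 the data remains at the path given in LOCATION.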

    Additional details are available here

  • 2021-01-26 09:17
    CREATE EXTERNAL TABLE Data (
        dummy INT,
        account_number INT,
        balance INT,
        firstname STRING,
        lastname STRING,
        age INT,
        gender CHAR(1),
        address STRING,
        employer STRING,
        email STRING,
        city STRING,
        state CHAR(2)
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/Data';
    

    Then load the file into the table:

    LOAD DATA INPATH '/KibTEst/Data.txt' INTO TABLE Data;
    

    Then query it:

    select * from Data;
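
    A quick sanity check after the load might look like the sketch below. It assumes one record per line in Data.txt, and it relies on Hive returning NULL for a text field that cannot be parsed as the declared column type, so a header row or a wrong delimiter tends to show up as NULLs.

    ```sql
    -- Should match the number of lines in the loaded file
    -- (assuming one record per line).
    SELECT COUNT(*) FROM Data;

    -- Rows whose numeric field did not parse; a header line or a
    -- mismatched delimiter would typically appear here.
    SELECT * FROM Data WHERE account_number IS NULL LIMIT 10;
    ```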
    
  • 2021-01-26 09:22
    1. You just need to create an external table that points to the HDFS directory containing your file, with the delimiter properties set as below:

      CREATE EXTERNAL TABLE Data (
          dummy INT,
          account_number INT,
          balance INT,
          firstname STRING,
          lastname STRING,
          age INT,
          gender CHAR(1),
          address STRING,
          employer STRING,
          email STRING,
          city STRING,
          state CHAR(2)
      )
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      LINES TERMINATED BY '\n'
      -- LOCATION must be the directory that holds Data.txt, not the file itself;
      -- every file in that directory is read as table data.
      LOCATION 'hdfs:///KibTEst/';

    2. After that you only need to run a select query, because the file is already in HDFS and an external table reads its data directly from the location given in the create statement. You can test it with the select statement below:

    SELECT * FROM Data;
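
    To confirm which HDFS directory the table is actually reading from, the table metadata can be inspected. A minimal sketch, using the `Data` table defined in this answer:

    ```sql
    -- The output includes a "Location:" row with the table's HDFS directory
    -- and a "Table Type:" row, which reads EXTERNAL_TABLE for external tables.
    DESCRIBE FORMATTED Data;
    ```

    If the query returns no rows, comparing the reported location with the directory that really holds the file is a reasonable first check.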

  • 2021-01-26 09:23
    CREATE EXTERNAL TABLE Data (
        dummy INT,
        account_number INT,
        balance INT,
        firstname STRING,
        lastname STRING,
        age INT,
        gender CHAR(1),
        address STRING,
        employer STRING,
        email STRING,
        city STRING,
        state CHAR(2)
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION 'Your hdfs location for external table';
    

    If the data is already in HDFS, then use:

    LOAD DATA INPATH 'hdfs_file_or_directory_path' INTO TABLE table_name;
    

    Then query the table with SELECT * FROM table_name;
