Hadoop/Hive : Loading data from .csv on a local machine

不知归路 2020-12-24 05:20

As this is coming from a newbie...

I had Hadoop and Hive set up for me, so I can run Hive queries from my computer against data on an AWS cluster. Can I run Hive querie

6 answers

有刺的猬 (original poster)

2020-12-24 06:11

You can try this tool (https://sourceforge.net/projects/csvtohive/?source=directory). The following are a few examples of the files it generates.

Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/

The tool generates a Hadoop script covering all the CSV files. The following is a sample of a generated Hadoop script that uploads the CSV files into Hadoop and runs the matching Hive scripts:

#!/bin/bash -v

hadoop fs -put ./AllstarFull.csv /user/bigdataproject/AllstarFull.csv
hive -f ./AllstarFull.hive


hadoop fs -put ./Appearances.csv /user/bigdataproject/Appearances.csv
hive -f ./Appearances.hive


hadoop fs -put ./AwardsManagers.csv /user/bigdataproject/AwardsManagers.csv
hive -f ./AwardsManagers.hive
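If you would rather not depend on the tool, the script above follows a simple pattern that can be produced by hand. Here is a minimal sketch, assuming each CSV file has a matching .hive file of the same base name in the current directory. The function name emit_load_commands and the HDFS_ROOT variable are made up for illustration and are not part of the tool.

```bash
#!/bin/bash
# Hypothetical helper (not part of the csvtohive tool): print the
# upload-and-load commands for every CSV file passed as an argument.
# HDFS_ROOT is an assumed variable name; override it to match your cluster.
HDFS_ROOT="${HDFS_ROOT:-/user/bigdataproject}"

emit_load_commands() {
  local csv base
  for csv in "$@"; do
    # Strip any directory part and the .csv suffix.
    base="$(basename "$csv" .csv)"
    printf 'hadoop fs -put ./%s.csv %s/%s.csv\n' "$base" "$HDFS_ROOT" "$base"
    printf 'hive -f ./%s.hive\n' "$base"
  done
}

# Example call (only prints the commands, it does not run them):
#   emit_load_commands ./*.csv
```

Because the function only prints text, you can read the output first and then redirect it into a script file once it looks right.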

Sample of a generated Hive script:

CREATE DATABASE IF NOT EXISTS lahman;

USE lahman;

CREATE TABLE AllstarFull (playerID string,yearID string,gameNum string,gameID string,teamID string,lgID string,GP string,startingPos string) row format delimited fields terminated by ',' stored as textfile;

LOAD DATA INPATH '/user/bigdataproject/AllstarFull.csv' OVERWRITE INTO TABLE AllstarFull;

SELECT * FROM AllstarFull;
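The Hive script above can also be derived from the header row of the CSV file. The following is only a rough sketch of that idea, not how the tool actually works. It assumes the first line holds plain column names with no quoted commas or spaces, and it declares every column as string, as in the sample. The function name emit_hive_script and its three arguments are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: build a Hive script from the header row of a CSV file.
# Arguments: CSV path, database name, HDFS root directory.
emit_hive_script() {
  local csv="$1" db="$2" hdfs_root="$3"
  local table header columns
  # The table is named after the file, e.g. AllstarFull.csv -> AllstarFull.
  table="$(basename "$csv" .csv)"
  # First line of the file, with carriage returns stripped.
  header="$(head -n 1 "$csv" | tr -d '\r')"
  # "a,b,c" becomes "a string,b string,c string".
  columns="$(printf '%s' "$header" | sed 's/,/ string,/g') string"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS $db;
USE $db;
CREATE TABLE $table ($columns) row format delimited fields terminated by ',' stored as textfile;
LOAD DATA INPATH '$hdfs_root/$table.csv' OVERWRITE INTO TABLE $table;
EOF
}

# Example call, writing the script next to the CSV file:
#   emit_hive_script ./AllstarFull.csv lahman /user/bigdataproject > ./AllstarFull.hive
```

Note that if the uploaded CSV still contains its header row, that row is loaded as an ordinary data row. Hive's skip.header.line.count table property is one way to skip it, if your Hive version supports it.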

Thanks, Vijay
