Writing a table to Hive from R using RJDBC

逝去的感伤 2021-01-18 15:52

I have successfully connected my local R 3.1.2 (Win7 64-bit, RStudio) to a remote Hive server using RJDBC,

library(RJDBC)
.jinit()
dir = "E:/xxx/jars


        
2 Answers
  • 2021-01-18 16:18

    I have a partial answer. Your arguments to dbWriteTable are reversed: the pattern is dbWriteTable(connection, tableName, data), and the docs give the signature as dbWriteTable(conn, name, value, ...). That said, the 'correct' form does not work for me either; it yields the following error message:

    Error in .local(conn, statement, ...) : 
      execute JDBC update query failed in dbSendUpdate ([Simba][HiveJDBCDriver](500051) ERROR processing query/statement. Error Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:42000, errorCode:40000, errorMessage:Error while compiling statement: FAILED: ParseException line 1:41 mismatched input 'PRECISION' expecting ) near 'DOUBLE' in create table statement), Query: CREATE TABLE iris (`Sepal.Length` DOUBLE PRECISION,`Sepal.Width` DOUBLE PRECISION,`Petal.Length` DOUBLE PRECISION,`Petal.Width` DOUBLE PRECISION,Species VARCHAR(255)).)
    

    (at least when using Amazon's JDBC driver for Hive). The error is fairly self-explanatory: the CREATE TABLE statement generated to hold the inserted data uses the type DOUBLE PRECISION, which does not parse as HiveQL. Other than creating the table manually, I'm not sure what the fix is.
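
    One way to do it manually, as a minimal sketch: build the CREATE TABLE statement yourself with Hive type names and send it before inserting any rows. The helper names hive_type and hive_create_table_sql are made up for this example, the type mapping is deliberately incomplete, and replacing dots in column names with underscores is my own choice rather than something the driver requires. The final dbSendUpdate call is commented out because it needs a live connection (conn is assumed to exist).

```r
# Map an R column to a Hive column type (small, incomplete mapping).
hive_type <- function(x) {
  if (is.factor(x) || is.character(x)) {
    "STRING"
  } else if (is.logical(x)) {
    "BOOLEAN"
  } else if (is.integer(x)) {
    "INT"
  } else if (is.numeric(x)) {
    "DOUBLE"
  } else {
    "STRING"  # fallback for anything not handled above
  }
}

# Build a CREATE TABLE statement using Hive type names.
hive_create_table_sql <- function(df, table) {
  # Replace anything other than letters, digits and "_" in column names
  cols <- gsub("[^A-Za-z0-9_]", "_", names(df))
  types <- vapply(df, hive_type, character(1))
  paste0("CREATE TABLE ", table, " (",
         paste0("`", cols, "` ", types, collapse = ", "), ")")
}

ddl <- hive_create_table_sql(iris, "iris")
print(ddl)

# With a live RJDBC connection `conn` (not created here), the statement
# could then be sent with:
# dbSendUpdate(conn, ddl)
```

    For iris this produces DOUBLE columns and a STRING column for Species, so the DOUBLE PRECISION that tripped the parser never appears.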

  • 2021-01-18 16:24

    After all these years I still have not found a complete solution, but here is another partial one. It only works for writing small data frames, and how small varies with the platform (32-bit vs. 64-bit, Mac vs. Windows, ...).

    First, convert the data frame into a single character string of value tuples:

    data2hodoop <- paste0( apply(dataframe, 1, function(x) paste0("('", paste0(x, collapse = "', '"), "')")), collapse = ", ")

    Then use INSERT to write the rows into Hadoop:

    dbSendQuery(conn, paste("INSERT INTO ", tbname," VALUES ",data2hodoop, ";" ))

    On my PC (Win7 64-bit, 16 GB RAM), if the string 'data2hodoop' is larger than 50M, R fails with the error "C stack usage xxx is too close to the limit".

    On my Mac the limit is even lower, and I cannot find a way to raise it.
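
    One way to stay under that limit, as a rough sketch: build one INSERT statement per batch of rows rather than one string for the whole data frame. The names quote_value and build_insert_batches and the batch size are hypothetical, values are quoted as strings exactly as in the snippet above, and escaping single quotes with a backslash is an assumption about how the Hive server parses string literals. The loop that sends the statements is commented out because it needs a live connection (conn, plus dbSendUpdate from RJDBC).

```r
# Quote one column as SQL string literals; NA becomes NULL.
quote_value <- function(col) {
  txt <- as.character(col)
  # Assumption: the server accepts backslash-escaped single quotes
  quoted <- paste0("'", gsub("'", "\\'", txt, fixed = TRUE), "'")
  ifelse(is.na(txt), "NULL", quoted)
}

# Build one INSERT statement per batch of at most `batch_size` rows.
build_insert_batches <- function(df, table, batch_size = 1000) {
  if (nrow(df) == 0) return(character(0))
  quoted <- unname(lapply(df, quote_value))
  # One "(v1, v2, ...)" tuple per row
  tuples <- paste0("(", do.call(paste, c(quoted, sep = ", ")), ")")
  # Assign each row to a batch, then collapse each batch into one statement
  batch_id <- ceiling(seq_along(tuples) / batch_size)
  vapply(split(tuples, batch_id), function(rows) {
    paste0("INSERT INTO ", table, " VALUES ", paste(rows, collapse = ", "))
  }, character(1), USE.NAMES = FALSE)
}

statements <- build_insert_batches(head(iris, 5), "iris", batch_size = 2)
print(statements)

# With a live RJDBC connection `conn` (not created here):
# for (sql in statements) dbSendUpdate(conn, sql)
```

    Each statement stays small no matter how many rows the data frame has, at the cost of one round trip per batch. I have not measured which batch size works best.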
