How to pass multiple statements into Spark SQL HiveContext

深忆病人 · 2021-01-18 10:17

For example, I have a few Hive HQL statements that I want to pass into Spark SQL:

set parquet.compression=SNAPPY;
create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE;
select * from MY_TABLE limit 5;

2 Answers
  • 2021-01-18 11:07

    I worked on a scenario where I needed to read a SQL file and run all of the semicolon-separated queries present in that file.

    One simple way to do it is like this:

    val hsc = new org.apache.spark.sql.hive.HiveContext(sc)
    val sqlFile = "/hdfs/path/to/file.sql"
    // wholeTextFiles returns (path, contents) pairs; take the contents of the first (only) file
    val queries = sc.wholeTextFiles(sqlFile).take(1)(0)._2
    // split on ';' and run each non-empty statement in order
    queries.split(';').map(_.trim).filter(_.nonEmpty).foreach(query => hsc.sql(query))
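
    If the statements already live in a single string rather than in an HDFS file, the same split-and-run idea applies. A minimal sketch, reusing the hsc context from above (the statements are placeholders mirroring the question):

    // placeholder multi-statement HQL in one ';'-separated string
    val statements =
      """set spark.sql.parquet.compression.codec=SNAPPY;
        |create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE;
        |select * from MY_TABLE limit 5""".stripMargin

    // run each non-empty statement in order and keep the result of the last one
    val results = statements.split(';').map(_.trim).filter(_.nonEmpty).map(stmt => hsc.sql(stmt))
    val lastResult = results.last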
    
  • 2021-01-18 11:16

    Thank you to @SamsonScharfrichter for the answer.

    This will work:

    hiveContext.sql("set spark.sql.parquet.compression.codec=SNAPPY")
    hiveContext.sql("create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE")
    val rs = hiveContext.sql("select * from MY_TABLE limit 5")
    

    Note that in this particular case, instead of the parquet.compression key, you need to use spark.sql.parquet.compression.codec.
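
    To double-check which codec actually took effect, the property can be read back from the context. A quick sketch (the exact value reported depends on your Spark version and settings):

    // getConf returns the current value of a Spark SQL property
    val codec = hiveContext.getConf("spark.sql.parquet.compression.codec")
    println(s"Parquet compression codec in effect: $codec")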
