Reading and writing Hive tables with Spark after aggregation

故里飘歌 2021-02-09 06:29

We have a Hive warehouse and want to use Spark for various tasks (mainly classification), at times writing the results back as a Hive table. For example, we wrote the following

3 Answers
  •  孤城傲影
    2021-02-09 06:53

    What version of Spark are you using?

    This answer is based on Spark 1.6, using DataFrames.

    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    import sqlContext.implicits._
    import org.apache.spark.sql.functions._

    // Build a toy DataFrame of client transactions
    val client = Seq((1, "A", 10), (2, "A", 5), (3, "B", 56)).toDF("ID", "Categ", "Amnt")

    // Aggregate per category: total amount and row count
    client.groupBy("Categ").agg(sum("Amnt").as("Sum"), count("ID").as("count")).show()
    
    
    +-----+---+-----+
    |Categ|Sum|count|
    +-----+---+-----+
    |    A| 15|    2|
    |    B| 56|    1|
    +-----+---+-----+
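
    Since the question also asks about writing the result back to Hive, here is a minimal sketch of how that can be done in Spark 1.6. It assumes Spark was built with Hive support and can reach the metastore; the table name `client_agg` and app name are hypothetical, chosen for illustration.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.sql.functions._

    val sc = new SparkContext(new SparkConf().setAppName("hive-write"))
    // A HiveContext (rather than a plain SQLContext) is needed to read/write Hive tables
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    val client = Seq((1, "A", 10), (2, "A", 5), (3, "B", 56)).toDF("ID", "Categ", "Amnt")
    val agg = client.groupBy("Categ").agg(sum("Amnt").as("Sum"), count("ID").as("count"))

    // Persist the aggregated result as a Hive table ("client_agg" is a made-up name)
    agg.write.mode("overwrite").saveAsTable("client_agg")

    // The table can then be queried back with plain SQL
    hiveContext.sql("SELECT * FROM client_agg").show()
    ```

    `saveAsTable` registers the table in the Hive metastore, so other Hive clients can see it; `mode("overwrite")` replaces the table if it already exists.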
    

    Hope this helps!
