Analytic in Spark Dataframe

灰色年华 2021-01-17 02:45

In this problem we have two managers, M1 and M2. Manager M1's team has two employees, e1 and e2, and M2's team has two employees, e4 and e5. Following is the Manager

1 answer
  • 2021-01-17 02:58

    From what I understood of your question, here is what I suggest.

    First, create a dataframe for each manager that lists the employees under them:

    manager1

    +---+------+
    |sn |emp_id|
    +---+------+
    |a  |e1    |
    |b  |e2    |
    +---+------+
    

    manager2

    +---+------+
    |sn |emp_id|
    +---+------+
    |a  |e4    |
    |b  |e5    |
    +---+------+
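
    One way to build the two manager dataframes shown above is from local sequences. This is only a sketch: the object name ManagerFrames, the build helper, and the local SparkSession settings in the test are assumptions for illustration and do not come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ManagerFrames {
  // Hypothetical helper: returns (manager1, manager2) with the columns sn and emp_id
  def build(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._ // enables toDF on local Seqs

    val manager1 = Seq(("a", "e1"), ("b", "e2")).toDF("sn", "emp_id")
    val manager2 = Seq(("a", "e4"), ("b", "e5")).toDF("sn", "emp_id")
    (manager1, manager2)
  }
}
```

    In spark-shell a session named spark already exists, so ManagerFrames.build(spark) can be called directly.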
    

    Then write a function that returns the list of employees under a manager:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions._

    // Collect the emp_id column of a manager dataframe into a local list
    def getEmployees(df: DataFrame): List[String] = {
      df.select(collect_list("emp_id")).first().getSeq[String](0).toList
    }
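
    As a possible alternative to collect_list, the column can be read as a typed Dataset and collected directly. This is a sketch, not the method used in this answer: EmployeeIds and getEmployeesTyped are hypothetical names. One difference to keep in mind is that collect_list skips null values, while this version would return them.

```scala
import org.apache.spark.sql.{DataFrame, Encoders}

object EmployeeIds {
  // Hypothetical alternative to getEmployees: read emp_id as Dataset[String]
  // and bring the values to the driver as a List
  def getEmployeesTyped(df: DataFrame): List[String] =
    df.select("emp_id").as[String](Encoders.STRING).collect().toList
}
```

    Both versions pull the ids onto the driver, so they suit small manager tables like the ones above.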
    

    The final step is a function that keeps only the rows of the employees passed in:

    def getEmployeeDetails(df: DataFrame, list: List[String]): DataFrame = {
      df.filter(df("emp_id").isin(list: _*))
    }
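
    If the list of employees is large, collecting it to the driver and passing it to isin may not scale well. A left semi join is a different technique that keeps the filtering inside Spark. The sketch below is an assumption-laden alternative, not part of the original answer: EmployeeJoin and getEmployeeDetailsByJoin are hypothetical names.

```scala
import org.apache.spark.sql.DataFrame

object EmployeeJoin {
  // Hypothetical helper: keeps the rows of `details` whose emp_id appears in
  // `manager`; a left semi join returns only columns from the left side
  def getEmployeeDetailsByJoin(details: DataFrame, manager: DataFrame): DataFrame =
    details.join(manager.select("emp_id"), Seq("emp_id"), "left_semi")
}
```

    With this helper the manager dataframe is passed directly, without calling getEmployees first.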
    

    Now, to see the employees under manager1 (here m1 is the manager1 dataframe and df is the dataframe holding the employee details):

    getEmployeeDetails(df, getEmployees(m1)).show(false)
    

    This prints:

    +------+--------+------+---------+
    |emp_id|month_id|salary|work_days|
    +------+--------+------+---------+
    |e1    |1       |66000 |22       |
    |e1    |2       |48000 |16       |
    |e1    |3       |87000 |29       |
    |e2    |1       |75000 |25       |
    |e2    |4       |69000 |23       |
    |e2    |5       |66000 |22       |
    +------+--------+------+---------+
    

    You can do the same for the other managers.

    The same function also works for individual employees:

    getEmployeeDetails(df, List("e1")).show(false)
    

    This prints the rows of employee e1:

    +------+--------+------+---------+
    |emp_id|month_id|salary|work_days|
    +------+--------+------+---------+
    |e1    |1       |66000 |22       |
    |e1    |2       |48000 |16       |
    |e1    |3       |87000 |29       |
    +------+--------+------+---------+
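
    To try the filter end to end, the rows shown in the tables above can be loaded into a small dataframe. This is a minimal sketch under stated assumptions: the object name EmployeeFilterSketch and the sampleDetails helper are hypothetical, the numeric columns are assumed to be integers, and only the e1 and e2 rows visible in this answer are used.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object EmployeeFilterSketch {
  // Rows copied from the output tables in this answer (e1 and e2 only);
  // integer column types are an assumption
  def sampleDetails(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("e1", 1, 66000, 22),
      ("e1", 2, 48000, 16),
      ("e1", 3, 87000, 29),
      ("e2", 1, 75000, 25),
      ("e2", 4, 69000, 23),
      ("e2", 5, 66000, 22)
    ).toDF("emp_id", "month_id", "salary", "work_days")
  }

  // Same filter as in the answer: keep rows whose emp_id is in the list
  def getEmployeeDetails(df: DataFrame, list: List[String]): DataFrame =
    df.filter(df("emp_id").isin(list: _*))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only
    val spark = SparkSession.builder().master("local[*]").appName("employee-filter").getOrCreate()
    getEmployeeDetails(sampleDetails(spark), List("e1")).show(false)
    spark.stop()
  }
}
```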
    

    I hope the answer is helpful.
