How to pass Encoder as parameter to dataframe's as method

只谈情不闲聊 submitted on 2019-12-24 09:48:12

Question


I want to convert a DataFrame to a Dataset using different case classes. Currently my code looks like this:

case class Views(views: Double)
case class Clicks(clicks: Double)

// Requires spark.implicits._ in scope to supply the encoders.
def convertViewsDFtoDS(df: DataFrame): Dataset[Views] =
    df.as[Views]

def convertClicksDFtoDS(df: DataFrame): Dataset[Clicks] =
    df.as[Clicks]

So my question is: is there any way to use one general function for this, by passing the case class as an extra parameter to that function?


Answer 1:


This seems a bit redundant (the as method already does exactly what you want), but you can write

import org.apache.spark.sql.{Encoder, Dataset, DataFrame}

def convertTo[T : Encoder](df: DataFrame): Dataset[T] = df.as[T]

or

def convertTo[T](df: DataFrame)(implicit enc: Encoder[T]): Dataset[T] = df.as[T]

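Both definitions are equivalent: the context bound T : Encoder is syntactic sugar for the implicit parameter list, and each one simply requires that an implicit Encoder for the type T is in scope at the call site.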
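As a minimal sketch of how the generic version might be called, the example below assumes a local SparkSession and some made-up sample data (the object name, app name, and values are hypothetical, not from the question). Importing spark.implicits._ is what puts an Encoder for each case class in scope.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder, SparkSession}

// Case classes are defined at top level so Spark can derive encoders for them.
case class Views(views: Double)
case class Clicks(clicks: Double)

object ConvertToExample {
  // Context bound: an implicit Encoder[T] must be available at the call site.
  def convertTo[T: Encoder](df: DataFrame): Dataset[T] = df.as[T]

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("convertTo-sketch")
      .getOrCreate()
    import spark.implicits._ // supplies Encoder instances for case classes

    // Hypothetical sample data; column names must match the case class fields.
    val viewsDF: DataFrame = Seq(1.0, 2.0).toDF("views")
    val clicksDF: DataFrame = Seq(3.0).toDF("clicks")

    // The target type is chosen by the type argument, not by a separate function.
    val views: Dataset[Views] = convertTo[Views](viewsDF)
    val clicks: Dataset[Clicks] = convertTo[Clicks](clicksDF)

    views.show()
    clicks.show()
    spark.stop()
  }
}
```

The type argument plays the role of the "extra parameter" asked about in the question; the encoder itself is resolved by the compiler.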

If you want to avoid the implicit parameter, you can pass an explicit Encoder all the way down:

import org.apache.spark.sql.catalyst.encoders.encoderFor

def convertTo[T](df: DataFrame, enc: Encoder[T]): Dataset[T] = df.as[T](enc)

convertTo(df, encoderFor[Clicks])
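One caveat worth noting: as far as I can tell, encoderFor lives in the internal catalyst package and itself resolves an implicit Encoder, so the call above still depends on implicits being in scope. A sketch that avoids implicits entirely could build the encoder with the public Encoders.product factory instead. This is a swapped-in alternative rather than what the answer uses, and the object name and sample data are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder, Encoders, SparkSession}

case class Clicks(clicks: Double)

object ExplicitEncoderExample {
  // Same signature as in the answer: the encoder is an ordinary parameter.
  def convertTo[T](df: DataFrame, enc: Encoder[T]): Dataset[T] = df.as[T](enc)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explicit-encoder-sketch")
      .getOrCreate()

    // Hypothetical sample data built without spark.implicits._
    val df: DataFrame = spark.range(3).selectExpr("cast(id as double) as clicks")

    // Encoders.product derives an encoder for a case class without any implicit lookup.
    val clicksEncoder: Encoder[Clicks] = Encoders.product[Clicks]

    val clicks: Dataset[Clicks] = convertTo(df, clicksEncoder)
    clicks.show()
    spark.stop()
  }
}
```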


Source: https://stackoverflow.com/questions/40692691/how-to-pass-encoder-as-parameter-to-dataframes-as-method
