How to map over a DataFrame in Spark to extract RowData and make predictions using an H2O MOJO model

五迷三道 posted on 2019-12-06 05:37:38

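Use this function to build the RowData object that H2O needs from a Spark Row:

import org.apache.spark.sql.{DataFrame, Row}
import hex.genmodel.easy.RowData

// Convert one Spark Row into an H2O RowData, keyed by the DataFrame's column names.
def rowToRowData(df: DataFrame, row: Row): RowData = {
  val rowAsMap = row.getValuesMap[Any](df.schema.fieldNames)
  // Leave null columns out of the RowData; pass every other value as a String.
  rowAsMap.foldLeft(new RowData()) { case (rd, (k, v)) =>
    if (v != null) { rd.put(k, v.toString) }
    rd
  }
}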

I have a complete answer here: https://stackoverflow.com/a/47898040/9120484

Note that you can call map on the DataFrame directly instead of on its underlying RDD.
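
To show how the conversion fits into scoring, here is a minimal sketch, not the code from the linked answer. It assumes a regression MOJO, a hypothetical `mojoPath` that is readable on every executor, and illustrative names (`MojoScoring`, `toRowData`, `score`). It uses mapPartitions on the DataFrame rather than map so that the model is loaded once per partition instead of once per row, and it reads the field names on the driver so that the closure does not capture the DataFrame itself.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}
import hex.genmodel.MojoModel
import hex.genmodel.easy.{EasyPredictModelWrapper, RowData}

object MojoScoring {

  // Same conversion as rowToRowData, but taking the column names directly.
  def toRowData(fieldNames: Array[String], row: Row): RowData = {
    val rowData = new RowData()
    row.getValuesMap[Any](fieldNames).foreach { case (k, v) =>
      if (v != null) rowData.put(k, v.toString)
    }
    rowData
  }

  // Assumption: mojoPath points to a regression MOJO available on each executor.
  def score(spark: SparkSession, df: DataFrame, mojoPath: String): Dataset[Double] = {
    import spark.implicits._ // provides the Encoder[Double] for the result

    val fieldNames = df.schema.fieldNames // evaluated on the driver

    df.mapPartitions { (rows: Iterator[Row]) =>
      // Load the MOJO once per partition.
      val model = new EasyPredictModelWrapper(MojoModel.load(mojoPath))
      rows.map(row => model.predictRegression(toRowData(fieldNames, row)).value)
    }
  }
}
```

For a binomial or multinomial model, the corresponding wrapper methods (`predictBinomial`, `predictMultinomial`) would replace `predictRegression`, and the result type would change accordingly.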
