Replicate Spark Row N-times

南旧 2020-12-09 07:02

I want to duplicate a Row in a DataFrame. How can I do that?

For example, I have a DataFrame consisting of 1 Row, and I want to make a DataFrame with 100 identical Rows.

3 Answers
  •  有刺的猬
    2020-12-09 07:07

    You could take the single row, build a list of a hundred copies of it, and convert that list back into a DataFrame.

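    import org.apache.spark.sql.DataFrame
    // Assumes a spark-shell session where `sc` and `sqlContext` are predefined.
    import sqlContext.implicits._  // needed for .toDF on an RDD of tuples
    
    val testDf = sc.parallelize(Seq(
        (1,2,3), (4,5,6)
    )).toDF("one", "two", "three")
    
    // Builds a DataFrame holding n copies of the first row of df.
    def replicateDf(n: Int, df: DataFrame): DataFrame = {
        // Fetch the row once: List.fill takes its element by name,
        // so calling df.take(1) inside it would run one Spark job per copy.
        val row = df.head()
        sqlContext.createDataFrame(sc.parallelize(List.fill(n)(row)), df.schema)
    }
    
    val replicatedDf = replicateDf(100, testDf)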
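
    The approach above brings the row back to the driver before copying it. A possible alternative, sketched here on the assumption of Spark 2.1 or later (where `SparkSession` and `Dataset.crossJoin` are available), keeps the work inside Spark by cross-joining the single row with a range of n values. The names `ReplicateRow`, `replicateFirstRow`, and the helper column `__copy_idx` are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReplicateRow {
  // Hypothetical helper: returns n copies of one row of `df`.
  // limit(1) picks an arbitrary row unless `df` is ordered or has exactly one row.
  def replicateFirstRow(spark: SparkSession, df: DataFrame, n: Long): DataFrame = {
    // spark.range(n) yields n rows; rename its column to avoid clashing with df's columns.
    val copies = spark.range(n).toDF("__copy_idx")
    // Each of the n range rows is paired with the single row, then the helper column is dropped.
    df.limit(1).crossJoin(copies).drop("__copy_idx")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("replicate-row")
      .getOrCreate()
    import spark.implicits._

    val testDf = Seq((1, 2, 3), (4, 5, 6)).toDF("one", "two", "three")
    val replicatedDf = replicateFirstRow(spark, testDf, 100)
    replicatedDf.show(5)

    spark.stop()
  }
}
```

    For a hundred rows either version should be fine. The cross join mainly matters when n is large enough that building the list on the driver becomes costly.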
    
