How to split CSV lines into tuples with Spark Scala

一个人的身影 2021-01-29 10:07

Here is some data I want to read with Scala. It looks like this:

    userId,movieId
    1,1172
    1,1405
    1,2193
    1,2968
    2,52
    2,144
    2,248

First, I want to skip the first line.

1 Answer
  • 2021-01-29 11:03

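    Ah, so your question is not about the header, but about how to split each line into a (userid, movieid) tuple? flatMap flattens every split array, so you end up with one collection of individual strings instead of pairs. Instead of .flatMap(line => line.split(",")), try this: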

    .map(line => line.split(",") match { case Array(userid, movieid) => (userid, movieid) })
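The one-liner above throws a MatchError on any line that does not split into exactly two fields, and it keeps the header row as a ("userId", "movieId") tuple. A minimal sketch of a more defensive variant follows. The object name SplitCsvLines and the helper parseLine are hypothetical, and a plain Seq stands in for the RDD of lines so the example runs without Spark.

```scala
import scala.util.Try

object SplitCsvLines {
  // Hypothetical helper: turn "1,1172" into Some((1, 1172)); return None for
  // the header or any line that does not hold exactly two integers.
  def parseLine(line: String): Option[(Int, Int)] =
    line.split(",").map(_.trim) match {
      case Array(userId, movieId) =>
        Try((userId.toInt, movieId.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // Sample rows from the question; a Seq stands in for the RDD of lines.
    val lines = Seq("userId,movieId", "1,1172", "1,1405", "2,52")

    // flatMap over an Option keeps the Some values and drops the None ones,
    // so the header row disappears without a separate filter step.
    val pairs = lines.flatMap(parseLine)

    pairs.foreach(println)
  }
}
```

With Spark, the same function should be usable as sc.textFile(path).flatMap(SplitCsvLines.parseLine), giving an RDD[(Int, Int)]; here path is a placeholder for your input file. Note that this silently drops malformed rows, which may or may not be what you want.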
    