I am trying to group some data by key where the value would be a list:
Sample data:
A 1
A 2
B 1
B 2
Expected result:
(A
When you write an anonymous inline function of the form
ARGS => OPERATION
the entire part before the arrow (=>) is taken as the argument list. So, in the case of
(k, v) => ...
the compiler takes that to mean a function that takes two arguments. In your case, however, you have a single argument which happens to be a tuple (here, a Tuple2, or a Pair; more fully, you appear to have a list of Pair[Any,List[Any]]). There are a couple of ways to get around this. Wrapping the pair in an extra set of parentheses, to show that it is the single expected argument for the function, might look like the natural fix:
((x, y)) => ...
but Scala 2 rejects this as "not a legal formal parameter", because a tuple cannot be destructured directly in a function's parameter list. Instead, you can write the anonymous function as a pattern-matching function (written like a partial function) that matches on tuples. Note that this form needs braces rather than parentheses:
groupedData.map { case (k, v) => (k, v(0)) }
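To make the pattern-matching form concrete, here is a minimal sketch built from the sample data in the question. The object name GroupByKeyDemo, the element types (String keys, Int values) and the way groupedData is constructed are my assumptions, since the question does not show them.

```scala
object GroupByKeyDemo {
  // Sample data from the question, as (key, value) pairs (types assumed).
  val data: List[(String, Int)] = List(("A", 1), ("A", 2), ("B", 1), ("B", 2))

  // groupBy keeps the whole pairs as values, so an inner map strips the keys.
  // This construction is an assumption about how groupedData was produced.
  val groupedData: Map[String, List[Int]] =
    data.groupBy(_._1).map { case (k, pairs) => (k, pairs.map(_._2)) }

  // Each element handed to map is ONE Tuple2; `case` destructures it
  // into the key k and the value list v.
  val firstValues: Map[String, Int] =
    groupedData.map { case (k, v) => (k, v(0)) }

  def main(args: Array[String]): Unit = {
    println(groupedData)
    println(firstValues)
  }
}
```

The braces matter: `case` clauses are only legal inside a block, so the parenthesised spelling does not parse.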
Finally, you can simply go with a single specified argument, as per your last attempt, but - realising it is a tuple - reference the specific field(s) within the tuple that you need:
groupedData.map(s => (s._2(0),s._2(1))) // The key is s._1, and the value list is s._2
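As a rough illustration of the difference between "one tuple argument" and "two arguments", the sketch below writes both function types out explicitly. The names TupleArgumentDemo, viaAccessors and twoArgs are hypothetical, the data is assumed to be a List of (String, List[Int]) pairs, and the use of Function2's tupled method is an addition of mine rather than something from the question.

```scala
object TupleArgumentDemo {
  // Hypothetical grouped data; the element types are assumed.
  val groupedData: List[(String, List[Int])] =
    List(("A", List(1, 2)), ("B", List(1, 2)))

  // ONE parameter that happens to be a Tuple2: the shape map expects here.
  // The key is s._1 and the value list is s._2.
  val viaAccessors: ((String, List[Int])) => (String, Int) =
    s => (s._1, s._2(0))

  // TWO separate parameters: a Function2, which Scala 2 will not
  // accept directly as the argument to map on a list of pairs.
  val twoArgs: (String, List[Int]) => (String, Int) =
    (k, v) => (k, v(0))

  // The accessor form can be passed to map as it is.
  val fromAccessors: List[(String, Int)] = groupedData.map(viaAccessors)

  // Function2#tupled adapts the two-argument function to take one Tuple2.
  val fromTupled: List[(String, Int)] = groupedData.map(twoArgs.tupled)
}
```

Both results hold the first value of each group paired with its key, so which spelling to use is mostly a matter of readability.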