问题
When I migrated to Scala 2.9.0 from 2.8.1, all of the code was functional except for the Hadoop mappers. Because I had some wrapper objects in the way, I distilled down to the following example:
import org.apache.hadoop.mapreduce.{Mapper, Job} object MyJob { def main(args:Array[String]) { val job = new Job(new Configuration()) job.setMapperClass(classOf[MyMapper]) } } class MyMapper extends Mapper[LongWritable,Text,Text,Text] { override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) { } }
When I run this in 2.8.1, it runs quite well (and I have plenty of production code in 2.8.1. In 2.9.0 I get the following compilation error:
error: type mismatch;
found : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])
The failing call is when I call setMapperClass on the Job object. Here's the definition of that method:
public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }
The definition of the Mapper class itself is this:
public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
Does anyone have a sense of what I'm doing wrong? It looks to me like the type is fundamentally correct: MyMapper does extend Mapper, and the method wants something that extends Mapper. And it works great in 2.8.1...
回答1:
Silly as it seems, you can work around the problem by defining the Mapper before the Job. The following compiles:
import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._
class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
}
}
object MyJob {
def main(args:Array[String]) {
val job = new Job(new Configuration())
job.setMapperClass(classOf[MyMapper])
}
}
来源:https://stackoverflow.com/questions/6028221/how-does-one-implement-a-hadoop-mapper-in-scala-2-9-0