I'm programming with Spark Streaming but am having some trouble with Scala. I'm trying to use the function StreamingContext.fileStream.
The definition of this function i
If you want to use fileStream, you have to supply all 3 type params when calling it. You need to know what your Key, Value and InputFormat types are before calling it. If your types were LongWritable, Text and TextInputFormat, you would call fileStream like so:
val lines = ssc.fileStream[LongWritable, Text, TextInputFormat]("/home/sequenceFile")
If those 3 types do happen to be your types, then you might want to use textFileStream instead, as it does not require any type params and delegates to fileStream using the 3 types I mentioned. Using that would look like this:
val lines = ssc.textFileStream("/home/sequenceFile")
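For completeness, here is a minimal sketch of both calls together with the imports they need. The object name, app name, master setting and batch interval are placeholders I picked for illustration, not something from the question. Note that TextInputFormat is taken from the new Hadoop API package (org.apache.hadoop.mapreduce.lib.input), since fileStream expects a new-API InputFormat.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

object FileStreamSketch {
  // Explicit form: key, value and input format types are all spelled out.
  def viaFileStream(ssc: StreamingContext, dir: String): DStream[String] =
    ssc.fileStream[LongWritable, Text, TextInputFormat](dir)
      .map { case (_, text) => text.toString } // drop the offset key, keep the line

  // Shortcut form: no type parameters needed.
  def viaTextFileStream(ssc: StreamingContext, dir: String): DStream[String] =
    ssc.textFileStream(dir)

  def main(args: Array[String]): Unit = {
    // Assumed settings for a local run; adjust for a real deployment.
    val conf = new SparkConf().setAppName("FileStreamSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    viaFileStream(ssc, "/home/sequenceFile").print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Both helpers should yield a stream of lines as String; the explicit form is only worth the extra typing when your key, value or input format differ from the text defaults.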
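import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Accept a file only if the number after the last "_" in its name is a timestamp in the past.
val filterF: Path => Boolean = path =>
  path.getName.split("_").last.toLong < System.currentTimeMillis

// newFilesOnly = false also considers files that are already in the directory.
val streamed_rdd = ssc
  .fileStream[LongWritable, Text, TextInputFormat]("/user/hdpprod/temp/spark_streaming_input", filterF, newFilesOnly = false)
  .map(_._2.toString)   // keep the line text, drop the offset key
  .map(_.split('\t'))   // split each line into tab-separated fields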
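One thing to watch with the filter above: toLong throws a NumberFormatException when the part after the last "_" is not a number. Below is a hedged variant of the same check, written against plain file names so it can be tried without a cluster. The names TimestampFilter and isDue, and the explicit now parameter, are my own additions for illustration.

```scala
import scala.util.Try

object TimestampFilter {
  // True when the text after the last '_' in the file name parses as a
  // millisecond timestamp that is strictly earlier than `now`.
  // Names without a numeric suffix are rejected instead of throwing.
  def isDue(fileName: String, now: Long): Boolean =
    Try(fileName.split("_").last.toLong).toOption.exists(_ < now)
}
```

To plug it into fileStream you could wrap it as (p: Path) => TimestampFilter.isDue(p.getName, System.currentTimeMillis). Passing the current time in as a parameter is just a design choice that makes the check easy to test.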