spark streaming fileStream

礼貌的吻别 2021-01-02 07:42

I'm programming with Spark Streaming but am having some trouble with Scala. I'm trying to use the function StreamingContext.fileStream.

The definition of this function i

2 Answers
  • 2021-01-02 07:54

    If you want to use fileStream, you're going to have to supply all 3 type params to it when calling it. You need to know what your Key, Value and InputFormat types are before calling it. If your types were LongWritable, Text and TextInputFormat, you would call fileStream like so:

    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat]("/home/sequenceFile")
    

    If those 3 types do happen to be your types, then you might want to use textFileStream instead as it does not require any type params and delegates to fileStream using those 3 types I mentioned. Using that would look like this:

    val lines = ssc.textFileStream("/home/sequenceFile")
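
    For completeness, here is a minimal sketch of the first call with the imports it needs. The app name, the local[2] master and the 10-second batch interval are placeholder assumptions, not something from the question. Note that TextInputFormat has to be the new-API class from org.apache.hadoop.mapreduce.lib.input (not the older org.apache.hadoop.mapred one), and that the resulting stream holds (key, value) pairs, so you usually map to the value:

    ```scala
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object FileStreamSketch {
      def main(args: Array[String]): Unit = {
        // Placeholder configuration (assumed values, adjust for your cluster).
        val conf = new SparkConf().setAppName("FileStreamSketch").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(10))

        // Key, value and input format types are all given explicitly.
        val records = ssc.fileStream[LongWritable, Text, TextInputFormat]("/home/sequenceFile")

        // Each element is a (LongWritable, Text) pair; keep only the line text.
        val lines = records.map { case (_, text) => text.toString }
        lines.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }
    ```

    Running this needs the Spark Streaming and Hadoop client libraries on the classpath.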
    
  • 2021-01-02 07:57
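    import org.apache.hadoop.fs.Path

    // Accept only files whose name ends in "_<epoch millis>" where that timestamp is already in the past.
    // Note: toLong throws NumberFormatException for names without a numeric suffix.
    val filterF: Path => Boolean = (x: Path) =>
      x.getName.split("_").last.toLong < System.currentTimeMillis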
    
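    val streamed_rdd = ssc
      .fileStream[LongWritable, Text, TextInputFormat](
        "/user/hdpprod/temp/spark_streaming_input",
        filterF,
        false) // newFilesOnly = false: files already in the directory are not skipped
      .map(_._2.toString)       // keep the line text, drop the offset key
      .map(u => u.split('\t'))  // split each line into tab-separated fields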
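
    The filter passed to fileStream above is just a Path => Boolean that looks at the file name. As a rough sketch of that logic without any Spark or Hadoop dependency, the check can be written on plain strings. The isDue helper and the sample file names are made up for illustration, and unlike the original filter this version returns false instead of throwing when the suffix is not a number:

    ```scala
    import scala.util.Try

    object TimestampFilterSketch {
      // Hypothetical helper: takes the text after the last "_" in the file name
      // (or the whole name if there is no "_") and returns true only when it
      // parses as a Long that is smaller than nowMillis.
      def isDue(path: String, nowMillis: Long): Boolean = {
        val fileName = path.split("/").last
        val suffix = fileName.split("_").last
        Try(suffix.toLong).toOption.exists(_ < nowMillis)
      }

      def main(args: Array[String]): Unit = {
        val now = 1609574220000L // fixed "current time" so the output is reproducible
        val dir = "/user/hdpprod/temp/spark_streaming_input"
        println(isDue(s"$dir/part_1609570000000", now)) // true: timestamp is in the past
        println(isDue(s"$dir/part_1700000000000", now)) // false: timestamp is in the future
        println(isDue(s"$dir/_SUCCESS", now))           // false: suffix is not a number
      }
    }
    ```

    To use it with fileStream you would wrap it as (p: Path) => isDue(p.toString, System.currentTimeMillis).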
    