Does Spark Streaming work with both "cp" and "mv"?

你的背包 2021-01-27 10:09

I am using Spark Streaming.

My program continuously reads streams from a Hadoop folder. The problem is that if I copy to my Hadoop folder (hadoop fs -copyFromLocal) the spark

1 answer
  • 2021-01-27 10:23

    Got it. It works in Spark 1.5, but it only picks up files whose modification timestamp matches the current time, that is, files that look new when the batch runs.

    For example:

    Temp folder: /temp, containing f.txt (timestamp t1: when the file was created)

    Spark input folder: /input

    When you do a mv (hadoop fs -mv /temp/f.txt /input), Spark will not pick the file up, because the moved file still carries the old timestamp t1.

    But if you change the timestamp of the moved file after the move, Spark will pick it up.
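To make the timestamp behaviour concrete, here is a minimal sketch that reproduces it on a local filesystem rather than HDFS (that substitution is my assumption, as are the function name `move_then_touch` and the one-hour offset). It shows that a rename keeps the old modification time and that touching the file afterwards refreshes it. On HDFS the counterpart would be something like the Hadoop `FileSystem.setTimes` call, but that mapping is not from the answer above.

```python
import os
import tempfile
import time


def move_then_touch(old_mtime_offset_s=3600.0):
    """Return (mtime_before_move, mtime_after_move, mtime_after_touch).

    Local-filesystem analogue of `hadoop fs -mv` followed by a timestamp
    update. All names and paths here are illustrative only.
    """
    with tempfile.TemporaryDirectory() as root:
        temp_dir = os.path.join(root, "temp")
        input_dir = os.path.join(root, "input")
        os.mkdir(temp_dir)
        os.mkdir(input_dir)

        src = os.path.join(temp_dir, "f.txt")
        with open(src, "w") as fh:
            fh.write("payload\n")

        # Pretend the file was created a while ago (timestamp t1).
        t1 = time.time() - old_mtime_offset_s
        os.utime(src, (t1, t1))
        before = os.stat(src).st_mtime

        # A rename moves the file but leaves its modification time alone.
        dst = os.path.join(input_dir, "f.txt")
        os.rename(src, dst)
        after_move = os.stat(dst).st_mtime

        # Passing None sets access and modification time to "now".
        os.utime(dst, None)
        after_touch = os.stat(dst).st_mtime

        return before, after_move, after_touch
```

Calling `move_then_touch()` should return the same value for the first two entries and a much later one for the third, which is the state in which the answer says Spark starts seeing the file.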

    https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala

    I had to check the Spark source code to figure this out.
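As a rough illustration of the kind of check involved, the sketch below models a "is this file new enough?" test: a file is selected only if its modification time falls inside a window that ends at the current batch time. This is a simplified model under my own assumptions, not a copy of the Spark code. The names `is_selected` and `select_files` and the 60-second window are hypothetical, and the real implementation does more, such as tracking files it has already selected.

```python
def is_selected(mod_time_ms, batch_time_ms, remember_window_ms=60_000):
    """Simplified model: accept a file only if its modification time lies
    in the window (batch_time - remember_window, batch_time].

    The window length is an illustrative value, not a documented default.
    """
    ignore_threshold_ms = batch_time_ms - remember_window_ms
    return ignore_threshold_ms < mod_time_ms <= batch_time_ms


def select_files(files, batch_time_ms, remember_window_ms=60_000):
    """Return the sorted names of files that pass the window check.

    `files` maps a file name to its modification time in milliseconds.
    """
    return sorted(
        name
        for name, mod_time_ms in files.items()
        if is_selected(mod_time_ms, batch_time_ms, remember_window_ms)
    )
```

Under this model, a file moved in with an hour-old timestamp falls outside the window and is skipped, while the same file with a refreshed timestamp falls inside it and is selected.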
