Does Spark Streaming work with both "cp" and "mv"?

I am using Spark Streaming.

My program continuously reads streams from a Hadoop folder. The problem is that if I copy to my Hadoop folder (hadoop fs -copyFromLocal) the spark

1 answer

星月不相逢

2021-01-27 10:23

Got it. It works in Spark 1.5, but it picks up only those files whose modification timestamp matches the current time, that is, files that look newly written when the batch runs.

For example:

Temp folder: file f.txt (timestamp t1: when the file was created)

Spark input folder: /input

When you do a mv (hadoop fs -mv /temp/f.txt /input), Spark will not pick up the file, because the moved file still carries its original timestamp t1.

But if you change the timestamp of the moved file after moving it, Spark will pick it up.
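The mv behaviour above can be reproduced on a local filesystem, which is used here only as a stand-in for HDFS. This is a minimal sketch, assuming a rename keeps the modification time in the same way hadoop fs -mv does; the directory layout and variable names are made up for illustration.

```python
import os
import tempfile
import time

# Hypothetical layout mirroring the example: a temp folder and an input folder.
root = tempfile.mkdtemp()
temp_dir = os.path.join(root, "temp")
input_dir = os.path.join(root, "input")
os.makedirs(temp_dir)
os.makedirs(input_dir)

# Create f.txt and backdate it by an hour to play the role of timestamp t1.
src = os.path.join(temp_dir, "f.txt")
with open(src, "w") as fh:
    fh.write("hello\n")
t1 = time.time() - 3600
os.utime(src, (t1, t1))

# "mv": a rename keeps the old modification time.
dst = os.path.join(input_dir, "f.txt")
os.rename(src, dst)
mtime_after_mv = os.path.getmtime(dst)

# "touch": reset the modification time to now, so the file looks new again.
os.utime(dst, None)
mtime_after_touch = os.path.getmtime(dst)

print("age after mv (s):", round(time.time() - mtime_after_mv))
print("age after touch (s):", round(time.time() - mtime_after_touch))
```

On HDFS itself, the timestamp can be changed through the Hadoop FileSystem API (setTimes); which tool is convenient depends on your Hadoop version.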

I had to check the Spark source code to find this out:

https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
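The timestamp check can be modeled roughly like this. It is a simplified sketch of the idea, not the actual Spark code: the function name, the parameter names and the 60-second window are assumptions chosen for illustration.

```python
def is_picked(mod_time: float, batch_time: float, remember_window: float = 60.0) -> bool:
    """Simplified model: accept a file only if its modification time falls
    inside (batch_time - remember_window, batch_time]."""
    ignore_threshold = batch_time - remember_window
    return ignore_threshold < mod_time <= batch_time


batch_time = 1_000_000.0          # hypothetical batch time, in seconds
fresh_file = batch_time - 5.0     # file written just before the batch
old_moved = batch_time - 3600.0   # moved file that still carries timestamp t1

print(is_picked(fresh_file, batch_time))  # True
print(is_picked(old_moved, batch_time))   # False
```

Under this model a moved file is skipped simply because its timestamp is too old, which is why updating the timestamp after the move makes it visible.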
