Unable to read the streaming data from the single file in Spark streaming

旧时难觅i 2021-01-24 04:44

I am trying to read streaming data from a text file that is appended to continuously, using the Spark Streaming API "textFileStream". But unable to read the continuous data w

1 answer
  • 2021-01-24 05:26

    This is expected behavior. For file-based sources (like fileStream):

    • The files must be created in the dataDirectory by atomically moving or renaming them into the data directory.
    • Once moved, the files must not be changed. So if the files are being continuously appended, the new data will not be read.
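
    One way to satisfy both rules is to finish writing each batch somewhere else and only then rename it into the monitored directory. The following is a minimal sketch, not part of Spark: `publish_batch`, `staging_dir`, and the file names are hypothetical, and it assumes a local file system where the staging and data directories are on the same volume, so that `os.replace` is an atomic rename. On HDFS or an object store the equivalent move would have to be done with that system's own tooling.

```python
import os
import tempfile


def publish_batch(lines, data_dir, staging_dir, name):
    """Write a batch in staging_dir, then rename it into data_dir in one step.

    Hypothetical helper: the finished file appears in data_dir all at once
    and is never modified afterwards.
    """
    os.makedirs(data_dir, exist_ok=True)
    os.makedirs(staging_dir, exist_ok=True)

    # Write the complete file outside the monitored directory first.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")

    # Atomic only when both directories are on the same file system (assumption).
    final_path = os.path.join(data_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```

    Each new chunk of data then becomes a new, immutable file (for example one file per batch interval) instead of an append to an existing one.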

    If you want to read continuously appended data, you'll have to create your own source, or use a separate process that monitors the file for changes and pushes new records to, for example, Kafka (though it is rare to combine Spark with file systems that support appending).
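
    The "separate process" option could look roughly like the sketch below. It is an illustration under assumptions, not a production tailer: `read_new_lines`, `tail_and_forward`, and the `send` callback are hypothetical names, the file is assumed to be UTF-8 with newline-terminated records, and truncation or rotation of the file is not handled. In a real setup `send` would wrap whatever producer you use (for example a Kafka client); here it is just any callable taking one line.

```python
import os
import time


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended since offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()

    # Consume only up to the last newline, so a half-written line is
    # picked up on a later poll instead of being forwarded truncated.
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = chunk[: end + 1]
    lines = complete.decode("utf-8").split("\n")[:-1]
    return lines, offset + len(complete)


def tail_and_forward(path, send, poll_seconds=1.0, max_polls=None):
    """Poll the file and pass each new line to send(); returns the last offset."""
    offset = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        if os.path.exists(path):
            lines, offset = read_new_lines(path, offset)
            for line in lines:
                send(line)  # hypothetical sink, e.g. a wrapper around a producer
        polls += 1
        time.sleep(poll_seconds)
    return offset
```

    Spark would then consume from the message queue rather than from the file, which avoids the restriction on modified files altogether.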
