Question
I am new to Python. I want to read a file from HDFS (which I have managed to do), perform some string operations on its contents, and write the modified contents to an output file.
I read the file using subprocess (which took a lot of time), since open didn't work for me:
from subprocess import Popen, PIPE
cat = Popen(["hadoop", "fs", "-cat", "/user/hdfs/test-python/input/test_replace"], stdout=PIPE)
Now the question is how to write the modified contents to the output file.
Your help is highly appreciated.
Answer 1:
You can use a library for reading and writing to HDFS, like https://github.com/mtth/hdfs
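As a rough sketch of what that could look like with the library linked above (installable as `hdfs` from PyPI), the helper below reads a file, applies a string replacement, and writes the result back. It assumes WebHDFS is enabled on the cluster. The function names `rewrite_hdfs_file` and `make_client`, the namenode URL, and the user name are placeholders made up for illustration, not something from the question.

```python
def rewrite_hdfs_file(client, src_path, dst_path, old, new):
    """Read src_path, replace `old` with `new`, and write the result to dst_path."""
    # client.read() is a context manager; passing an encoding yields text, not bytes.
    with client.read(src_path, encoding="utf-8") as reader:
        content = reader.read()

    # Any string operation could go here; replace() is only an example.
    modified = content.replace(old, new)

    # overwrite=True replaces dst_path if it already exists.
    client.write(dst_path, data=modified, encoding="utf-8", overwrite=True)
    return modified


def make_client(url="http://namenode:50070", user="hdfs"):
    """Build a WebHDFS client. The URL and user are assumptions; adjust for your cluster."""
    from hdfs import InsecureClient  # imported lazily so the helper above has no hard dependency
    return InsecureClient(url, user=user)
```

On a real cluster this might be called as `rewrite_hdfs_file(make_client(), "/user/hdfs/test-python/input/test_replace", "/user/hdfs/test-python/output/test_replace", "old", "new")`, where the output path and the two strings are invented for the example. If you would rather stay with subprocess, `hadoop fs -put - <path>` reads the data to upload from stdin, so the modified text can be piped to it.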
Source: https://stackoverflow.com/questions/37261624/read-write-files-on-hdfs-using-python