Hadoop FileSplit reading
Question

Assume a client application that uses a FileSplit object to read the actual bytes from the corresponding file. To do so, an InputStream has to be created from the FileSplit, via code like:

    FileSplit split = ... // The FileSplit reference
    FileSystem fs = ... // The HDFS reference
    FSDataInputStream fsin = fs.open(split.getPath());
    long start = split.getStart()-1; // Byte just before the first byte of the split
    if (start >= 0) {
        fsin.seek(start);
    }

The adjustment of the stream by -1 is present in some