Java : Read last n lines of a HUGE file

温柔的废话 2020-11-27 04:59

I want to read the last n lines of a very big file without reading the whole file into any buffer/memory area using Java.

I looked around the JDK APIs and Apache Com

11 Answers
  • 2020-11-27 05:23

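    ReversedLinesFileReader (package org.apache.commons.io.input) from the Apache Commons IO library reads a file line by line starting from the end, so only the tail of the file is touched.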

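        int nLines = 1000;
        // readLine() returns lines last-to-first, so prepend each one to restore file order.
        Deque<String> lines = new ArrayDeque<>();
        // The charset is an assumption; use the file's actual encoding.
        try (ReversedLinesFileReader reader =
                 new ReversedLinesFileReader(new File(path), StandardCharsets.UTF_8)) {
            for (int i = 0; i < nLines; i++) {
                String line = reader.readLine();
                if (line == null)
                    break; // reached the start of the file
                lines.addFirst(line);
            }
        }
        return String.join("\n", lines);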
    
  • 2020-11-27 05:25
    package com.uday;
    
    import java.io.File;
    import java.io.RandomAccessFile;
    
    public class TailN {
        public static void main(String[] args) throws Exception {
            long startTime = System.currentTimeMillis();
    
            TailN tailN = new TailN();
            File file = new File("/Users/udakkuma/Documents/workspace/uday_cancel_feature/TestOOPS/src/file.txt");
            tailN.readFromLast(file);
    
            System.out.println("Execution Time : " + (System.currentTimeMillis() - startTime));
    
        }
    
        public void readFromLast(File file) throws Exception {
            int lines = 3;
            int readLines = 0;
            StringBuilder builder = new StringBuilder();
            try (RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r")) {
                long lastIndex = randomAccessFile.length() - 1;
    
                for (long pointer = lastIndex; pointer >= 0; pointer--) {
                    randomAccessFile.seek(pointer);
                    // Read backwards one byte at a time; the cast to char is only
                    // correct for single-byte encodings such as ASCII or ISO-8859-1.
                    char c = (char) randomAccessFile.read();
                    if (c == '\n') {
                        // A newline that ends the file terminates the last line
                        // rather than starting a new one, so do not count it.
                        if (pointer == lastIndex)
                            continue;
                        readLines++;
                        // Stop once the requested number of lines has been collected
                        if (readLines == lines)
                            break;
                    }
                    builder.append(c);
                }
                // Since line is read from the last so it is in reverse order. Use reverse
                // method to make it correct order
                builder.reverse();
                System.out.println(builder.toString());
            }
    
        }
    }
    
  • 2020-11-27 05:30

    Use CircularFifoBuffer from Apache Commons Collections, as suggested in an answer to the similar question "How to read last 5 lines of a .txt file into java".

    Note that in Apache Commons Collections 4 this class seems to have been renamed to CircularFifoQueue.
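
    The idea behind the circular buffer can be sketched with the JDK alone. The example below is not the Commons Collections API: it uses java.util.ArrayDeque as a stand-in and evicts the oldest line by hand, which CircularFifoQueue would do automatically on add. The class and method names (LastLinesBuffer, lastLines) are made up for illustration. Note that this approach bounds memory to n lines but still reads the input from start to finish.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LastLinesBuffer {
    /** Reads the input front to back, keeping only the most recent n lines. */
    static List<String> lastLines(Reader in, int n) throws IOException {
        Deque<String> window = new ArrayDeque<>();
        if (n <= 0) {
            return new ArrayList<>();
        }
        BufferedReader reader = new BufferedReader(in);
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            if (window.size() == n) {
                window.removeFirst(); // evict the oldest line once the window is full
            }
            window.addLast(line);
        }
        return new ArrayList<>(window); // oldest retained line first
    }
}
```

    For a file, pass something like `new FileReader(file)` as the reader and close it with try-with-resources.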

  • 2020-11-27 05:30

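    Here is the best way I've found to do it. It is simple and memory efficient, since only the last maxLines lines are held at any time, although it still reads the file from start to finish.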

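    // Requires java.io.* (BufferedReader, File, FileReader, OutputStream, OutputStreamWriter, ...)
    public static void tail(File src, OutputStream out, int maxLines) throws FileNotFoundException, IOException {
        if (maxLines <= 0) {
            return;
        }
        String[] lines = new String[maxLines]; // ring buffer holding the most recent lines
        long count = 0;                        // total number of lines read so far
        try (BufferedReader reader = new BufferedReader(new FileReader(src))) {
            for (String line = reader.readLine(); line != null; line = reader.readLine()) {
                lines[(int) (count++ % maxLines)] = line; // overwrite the oldest slot
            }
        }
    
        OutputStreamWriter writer = new OutputStreamWriter(out);
        // Start at the oldest retained line so the output is in file order;
        // this also handles files with fewer than maxLines lines.
        for (long i = Math.max(0, count - maxLines); i < count; i++) {
            writer.write(lines[(int) (i % maxLines)]);
            writer.write("\n");
        }
    
        writer.flush();
    }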
    
  • 2020-11-27 05:31

    If you use a RandomAccessFile, you can use length() and seek() to jump to a specific point near the end of the file and then read forward from there.

    If you find there weren't enough lines, back up from that point and try again. Once you've figured out where the Nth last line begins, you can seek to there and just read-and-print.

    Base the initial guess on what you know about your data. For example, if it's a text file whose lines are unlikely to average more than 132 characters, start 660 characters (5 × 132) before the end to get the last five lines. If that turns out to be too little, try again at 1320. You can even use what you learned from the first attempt to adjust the next one: if those 660 characters held only three lines, the next try could start 660 / 3 * 5 = 1100 characters back, plus a bit extra just in case.
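
    A minimal sketch of this guess-and-retry approach follows. The class name TailBySeek, the method tail, and the constant GUESSED_LINE_LENGTH are made up for illustration; the 132-byte guess is the figure from the example above. It assumes UTF-8 text and that the last n lines fit comfortably in a byte array. For simplicity it doubles the window on each retry rather than rescaling from the observed line count.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class TailBySeek {
    // Hypothetical starting guess for bytes per line; tune it for your data.
    private static final int GUESSED_LINE_LENGTH = 132;

    /** Returns the last n lines of the file, or the whole file if it has fewer. */
    static String tail(File file, int n) throws IOException {
        if (n <= 0) {
            return "";
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long length = raf.length();
            long window = (long) n * GUESSED_LINE_LENGTH;
            while (true) {
                long start = Math.max(0, length - window);
                byte[] buf = new byte[(int) (length - start)];
                raf.seek(start);
                raf.readFully(buf);

                // Ignore the line terminator that ends the file, if any.
                int end = buf.length;
                if (end > 0 && buf[end - 1] == '\n') end--;
                if (end > 0 && buf[end - 1] == '\r') end--;

                // Scan backwards for the n-th newline; the text after it is the last n lines.
                int found = 0;
                int from = -1;
                for (int i = end - 1; i >= 0; i--) {
                    if (buf[i] == '\n' && ++found == n) {
                        from = i + 1;
                        break;
                    }
                }
                if (from >= 0) {
                    return new String(buf, from, end - from, StandardCharsets.UTF_8);
                }
                if (start == 0) {
                    // The whole file was read and it has fewer than n lines.
                    return new String(buf, 0, end, StandardCharsets.UTF_8);
                }
                window *= 2; // the guess was too small: back up further and try again
            }
        }
    }
}
```

    Usage would look like `String lastFive = TailBySeek.tail(new File("app.log"), 5);`, where the file name is only a placeholder. Splitting on the byte '\n' is safe for UTF-8 because that byte never occurs inside a multi-byte sequence.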
