I need to read a large text file of around 5-6 GB line by line using Java.
How can I do this quickly?
In Java 8, there is also an alternative to using Files.lines(). If your input source isn't a file but something more abstract like a Reader
or an InputStream
, you can stream the lines via the BufferedReader
s lines()
method.
For example:
try (BufferedReader reader = new BufferedReader(...)) {
reader.lines().forEach(line -> processLine(line));
}
will call processLine()
for each input line read by the BufferedReader
.
try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
stream.forEach(System.out::println);
}
I usually do the reading routine straightforward:
void readResource(InputStream source) throws IOException {
BufferedReader stream = null;
try {
stream = new BufferedReader(new InputStreamReader(source));
while (true) {
String line = stream.readLine();
if(line == null) {
break;
}
//process line
System.out.println(line)
}
} finally {
closeQuiet(stream);
}
}
static void closeQuiet(Closeable closeable) {
if (closeable != null) {
try {
closeable.close();
} catch (IOException ignore) {
}
}
}
In Java 7:
String folderPath = "C:/folderOfMyFile";
Path path = Paths.get(folderPath, "myFileName.csv"); //or any text file eg.: txt, bat, etc
Charset charset = Charset.forName("UTF-8");
try (BufferedReader reader = Files.newBufferedReader(path , charset)) {
while ((line = reader.readLine()) != null ) {
//separate all csv fields into string array
String[] lineVariables = line.split(",");
}
} catch (IOException e) {
System.err.println(e);
}
You can use Scanner class
Scanner sc=new Scanner(file);
sc.nextLine();
Here is a sample with full error handling and supporting charset specification for pre-Java 7. With Java 7 you can use try-with-resources syntax, which makes the code cleaner.
If you just want the default charset you can skip the InputStream and use FileReader.
InputStream ins = null; // raw byte-stream
Reader r = null; // cooked reader
BufferedReader br = null; // buffered for readLine()
try {
String s;
ins = new FileInputStream("textfile.txt");
r = new InputStreamReader(ins, "UTF-8"); // leave charset out for default
br = new BufferedReader(r);
while ((s = br.readLine()) != null) {
System.out.println(s);
}
}
catch (Exception e)
{
System.err.println(e.getMessage()); // handle exception
}
finally {
if (br != null) { try { br.close(); } catch(Throwable t) { /* ensure close happens */ } }
if (r != null) { try { r.close(); } catch(Throwable t) { /* ensure close happens */ } }
if (ins != null) { try { ins.close(); } catch(Throwable t) { /* ensure close happens */ } }
}
Here is the Groovy version, with full error handling:
File f = new File("textfile.txt");
f.withReader("UTF-8") { br ->
br.eachLine { line ->
println line;
}
}