Hey, I need to read a text file in Java. The problem is that the file has the following format:
Id time1 time2 time3 ...
ID2 time1 time2 time3 ...
Transpose the file: put the IDs on line 1, the time1 values on line 2, and so on. Of course, this only pays off if the transposition is done once and the transposed file is then read many times.
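A minimal sketch of that transposition, assuming whitespace-separated fields and the same number of fields on every row. The class name Transposer and its method are made up for illustration, and everything is held in memory, so this only suits files that fit there.

```java
import java.util.ArrayList;
import java.util.List;

final class Transposer {
    // Turns rows such as "ID t1 t2" into one output line per column:
    // all IDs first, then all first times, and so on.
    // Assumption: every row has the same number of whitespace-separated fields.
    static List<String> transpose(List<String> rows) {
        List<StringBuilder> columns = new ArrayList<>();
        for (String row : rows) {
            String trimmed = row.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            for (int col = 0; col < fields.length; col++) {
                if (col == columns.size()) {
                    columns.add(new StringBuilder());
                }
                StringBuilder sb = columns.get(col);
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(fields[col]);
            }
        }
        List<String> lines = new ArrayList<>();
        for (StringBuilder sb : columns) {
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

The input rows could come from Files.readAllLines and the result could be written back with Files.write, so later reads only need the one line they care about.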
If you're on a Linux/UNIX platform, you could do some preprocessing with the cut command.
The simplest way would be to read the whole file line by line once, parsing the lines as you go - then you can very easily get "all the IDs" followed by "all the first times" etc.
If the file is too large to do that, you may want to consider writing a tool to change the file structure: open up several files for writing (one per column), then read an input line, write each field to its column's file, move on to the next line, and so on. You can do this once and then read each file as and when you need it.
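Roughly, such a tool might look like the sketch below. The class name ColumnSplitter and the col_N.txt file naming are assumptions made for illustration, and fields are assumed to be whitespace-separated. Only one input line is held in memory at a time, but one writer stays open per column, so a very wide file could hit the operating system's open-file limit.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ColumnSplitter {
    // Streams the input once and appends field N of every line to outDir/col_N.txt.
    // Returns the number of column files that were created.
    static int split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<BufferedWriter> writers = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = trimmed.split("\\s+");
                for (int col = 0; col < fields.length; col++) {
                    if (col == writers.size()) {
                        // First time this column is seen: open its output file.
                        Path target = outDir.resolve("col_" + col + ".txt");
                        writers.add(Files.newBufferedWriter(target, StandardCharsets.UTF_8));
                    }
                    BufferedWriter w = writers.get(col);
                    w.write(fields[col]);
                    w.newLine();
                }
            }
        } finally {
            // Flush and close every column file, even if reading failed part-way.
            for (BufferedWriter w : writers) {
                w.close();
            }
        }
        return writers.size();
    }
}
```

After one run, col_0.txt would hold all the IDs, col_1.txt all the first times, and so on.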
We can't read files column-by-column. Read the whole file into memory (FileReader or java.nio) and parse the content (String#split on each line) into a data structure like Map<String, List<String>>, where the map's key is the ID (ID, ID2, ...) and the value is a simple list that contains all the time values.
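Something along these lines could build that map. The class name TimesById is made up for illustration, fields are assumed to be whitespace-separated, and a LinkedHashMap is used only so that the IDs keep their order from the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimesById {
    // Maps each ID (first field of a line) to the list of time values after it.
    // If an ID appears twice, the later line replaces the earlier one.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            List<String> times = new ArrayList<>(Arrays.asList(fields).subList(1, fields.length));
            result.put(fields[0], times);
        }
        return result;
    }

    // Convenience wrapper: reads the whole file into memory, then parses it.
    static Map<String, List<String>> load(Path file) throws IOException {
        return parse(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

A "column" is then just the same list index taken across all the map's values, for example index 0 for all the first times.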
One solution is to parse the file once and create an index of the position of each ID in the file. Then you can reposition the reading 'cursor' to whichever ID you need.
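One way to sketch such an index is with RandomAccessFile, which can report and set the current byte offset. The class name LineIndex is hypothetical. Note that RandomAccessFile.readLine is unbuffered and maps each byte to one character, so this sketch assumes plain ASCII content and is not tuned for speed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class LineIndex {
    private final String path;
    // ID -> byte offset of the start of that ID's line.
    private final Map<String, Long> offsets = new HashMap<>();

    // Scans the file once and records where each line starts.
    LineIndex(String path) throws IOException {
        this.path = path;
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long lineStart = raf.getFilePointer();
            String line;
            while ((line = raf.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    offsets.put(trimmed.split("\\s+")[0], lineStart);
                }
                lineStart = raf.getFilePointer();
            }
        }
    }

    // Seeks straight to the line of the given ID and returns its time values,
    // or null if the ID is not in the index.
    String[] timesFor(String id) throws IOException {
        Long offset = offsets.get(id);
        if (offset == null) {
            return null;
        }
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(offset);
            String[] fields = raf.readLine().trim().split("\\s+");
            return Arrays.copyOfRange(fields, 1, fields.length);
        }
    }
}
```

Only the IDs and their offsets stay in memory; the time values are read from disk on demand.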
EDIT
This solution is practical if the whole file content cannot be loaded into memory. To limit the number of physical reads, an LRU cache keeping the most recently read or used ID-times combinations could improve performance.
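A small LRU cache can be sketched on top of java.util.LinkedHashMap, which supports access ordering and an eviction hook. The class name LruCache and the capacity parameter are illustrative choices, and this version is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // The third argument 'true' selects access order: get() and put()
        // move an entry to the most-recently-used end of the map.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

The lookup code would check the cache first and only go back to the file on a miss, storing the freshly read times under their ID afterwards.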