I have a large text file(1.5 Gb) having 100 millions Strings(no duplicate String) and all the Strings are arranged line by line in the file . i want to make a wepapplication in
Is there expected to be a lot of overlap in your keywords? If so, you might be able to store a hash map from keyword (String
) to file locations (ArrayList
). You can not store all the lines in memory though with the object overhead.
Once you have the file location, you can seek to that location in the text file and then look nearby to get the enclosing newline characters, returning the line. That will definitely be less than 4 seconds. Here is a little info on that. If this is just for a little exercise, that would work fine.
A better solution though would be a two tiered index, one mapping keywords to line numbers, and then another mapping line numbers to line text. This will not fit in memory on your machine. There are great disk based key-value stores though that would work well. If this is anything beyond a toy problem, go with the Reddis route.