I want to access a large file (file size may vary from 30 MB to 1 GB) through 10 threads and then process each line in the file and write them to another file through 10 threads
Be aware that the ideal number of threads is limited by the hardware architecture and other stuffs (you could think about consulting the thread pool to calculate the best number of threads). Assuming that "10" is a good number, we proceed. =)
If you are looking for performance, you could do the following:
Read the file using the threads you have and process each one according to your business rule. Keep one control variable that indicates the next expected line to be inserted on the output file.
If the next expected line is done processing, append it to a buffer (a Queue) (it would be ideal if you could find a way to insert direct in the output file, but you would have lock problems). Otherwise, store this "future" line inside a binary-search-tree, ordering the tree by line position. Binary-search-tree gives you a time complexity of "O(log n)" for searching and inserting, which is really fast for your context. Continue to fill the tree until the next "expected" line is done processing.
Activates the thread that will be responsible to open the output file, consume the buffer periodically and write the lines into the file.
Also, keep track of the "minor" expected node of the BST to be inserted in the file. You can use it to check if the future line is inside the BST before starting searching on it.
This approach uses - O(n) to read the file (but is parallelized) - O(1) to insert the ordered lines into a Queue - O(Logn)*2 to read and write the binary-search-tree - O(n) to write the new file
plus the costs of your business rule and I/O operations.
Hope it helps.