I want to read a local txt file and read the text in this file. After that i want to split this whole text into Strings like in the example below .
Example : Lets say
It may depend on how the file is encoded, so I would likely do the following:
String.split("(\\n\\r|\\n|\\r){2}");
Some text files encode newlines as "\n\r" while others may be simply "\n". Two new lines in a row means you have an empty line.
you can split a string to an array by
String.split();
if you want it by new lines it will be
String.split("\\n\\n");
UPDATE*
If I understand what you are saying then john.
then your code will essentially be
BufferedReader in
= new BufferedReader(new FileReader("foo.txt"));
List<String> allStrings = new ArrayList<String>();
String str ="";
while(true)
{
String tmp = in.readLine();
if(tmp.isEmpty())
{
if(!str.isEmpty())
{
allStrings.add(str);
}
str= "";
}
else if(tmp==null)
{
break;
}
else
{
if(str.isEmpty())
{
str = tmp;
}
else
{
str += "\\n" + tmp;
}
}
}
Might be what you are trying to parse.
Where allStrings is a list of all of your strings.
@Kevin code works fine and as he mentioned that the code was not tested, here are the 3 changes required:
1.The if check for (tmp==null) should come first, otherwise there will be a null pointer exception.
2.This code leaves out the last set of lines being added to the ArrayList. To make sure the last one gets added, we have to include this code after the while loop: if(!str.isEmpty()) { allStrings.add(str); }
3.The line str += "\n" + tmp; should be changed to use \n instead if \\n. Please see the end of this thread, I have added the entire code so that it can help
BufferedReader in
= new BufferedReader(new FileReader("foo.txt"));
List<String> allStrings = new ArrayList<String>();
String str ="";
List<String> allStrings = new ArrayList<String>();
String str ="";
while(true)
{
String tmp = in.readLine();
if(tmp==null)
{
break;
}else if(tmp.isEmpty())
{
if(!str.isEmpty())
{
allStrings.add(str);
}
str= "";
}else
{
if(str.isEmpty())
{
str = tmp;
}
else
{
str += "\n" + tmp;
}
}
}
if(!str.isEmpty())
{
allStrings.add(str);
}
I would suggest more general regexp:
text.split("(?m)^\\s*$");
In this case it would work correctly on any end-of-line convention, and also would treat the same empty and blank-space-only lines.
The below code would work even if there are more than 2 empty lines between useful data.
import java.util.regex.*;
// read your file and store it in a string named str_file_data
Pattern p = Pattern.compile("\\n[\\n]+"); /*if your text file has \r\n as the newline character then use Pattern p = Pattern.compile("\\r\\n[\\r\\n]+");*/
String[] result = p.split(str_file_data);
(I did not test the code so there could be typos.)
Godwin was on the right track, but I think we can make this work a bit better. Using the '[ ]' in regx is an or, so in his example if you had a \r\n that would just be a new line not an empty line. The regular expression would split it on both the \r and the \n, and I believe in the example we were looking for an empty line which would require a either a \n\r\n\r, a \r\n\r\n, a \n\r\r\n, a \r\n\n\r, or a \n\n or a \r\r
So first we want to look for either \n\r or \r\n twice, with any combination of the two being possible.
String.split(((\\n\\r)|(\\r\\n)){2}));
next we need to look for \r without a \n after it
String.split(\\r{2});
lastly, lets do the same for \n
String.split(\\n{2});
And all together that should be
String.split("((\\n\\r)|(\\r\\n)){2}|(\\r){2}|(\\n){2}");
Note, this works only on the very specific example of using new lines and character returns. I in ruby you can do the following which would encompass more cases. I don't know if there is an equivalent in Java.
.match($^$)