I have a Java application that makes heavy use of a large file: it reads it, processes it, and feeds it to an embedded Solr server (SolrEmbeddedServer, http://lucene.apache.org/solr/).
One of the functions escapes HTML special characters through a chain of String.replace calls.
It's much easier, and more standard, to use Apache Commons Lang (http://commons.apache.org/lang/): its StringEscapeUtils.escapeHtml method does exactly this, and it's very simple to use.
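A minimal sketch of that approach, assuming the commons-lang 2.x jar is on the classpath (in commons-lang3 the class moved to org.apache.commons.lang3.StringEscapeUtils and the method is escapeHtml4):

```java
import org.apache.commons.lang.StringEscapeUtils;

public class EscapeDemo {
    public static void main(String[] args) {
        // escapeHtml covers &, <, >, " and the full HTML 4.0 entity set.
        System.out.println(StringEscapeUtils.escapeHtml("<a>a & b</a>"));
    }
}
```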
Each call to replace returns a new String, so every invocation of this function creates four intermediate Strings that are immediately discarded. If the input is large, this is wasteful.
I would suggest revising the algorithm so that instead of performing N replace
operations (each of which scans the whole String), you scan the input only once:
// the original pseudocode, corrected into compilable Java
import java.util.HashMap;
import java.util.Map;

private static final Map<Character, String> replacements = new HashMap<Character, String>();
static {
    replacements.put('&', "&amp;");
    replacements.put('>', "&gt;");
    // ...
}

private String htmlEscape(String input) {
    StringBuilder sb = new StringBuilder(input.length());
    for (char c : input.toCharArray()) {
        if (replacements.containsKey(c)) {
            sb.append(replacements.get(c));
        } else {
            sb.append(c);
        }
    }
    return sb.toString();
}
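For reference, here is the single-pass escaper as a self-contained class, alongside a chained-replace version for comparison. I'm assuming the four characters the original function replaced were &, <, > and "; adjust the map to match your actual entity set.

```java
import java.util.HashMap;
import java.util.Map;

public class HtmlEscaper {
    // Assumed entity set; the original question's exact four replacements may differ.
    private static final Map<Character, String> REPLACEMENTS = new HashMap<Character, String>();
    static {
        REPLACEMENTS.put('&', "&amp;");
        REPLACEMENTS.put('<', "&lt;");
        REPLACEMENTS.put('>', "&gt;");
        REPLACEMENTS.put('"', "&quot;");
    }

    // Single scan over the input; one StringBuilder, no intermediate Strings.
    static String htmlEscape(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            String replacement = REPLACEMENTS.get(c);
            if (replacement != null) {
                sb.append(replacement);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // The chained style under discussion: each replace scans the whole
    // String and allocates a new copy.
    static String htmlEscapeChained(String input) {
        return input.replace("&", "&amp;")   // must run first, before entities add '&'
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String sample = "<a href=\"x\">a & b</a>";
        System.out.println(htmlEscape(sample));
        System.out.println(htmlEscape(sample).equals(htmlEscapeChained(sample)));
    }
}
```

Note the ordering constraint the single-pass version avoids entirely: in the chained version, the & replacement has to run first, or it would corrupt the entities produced by the later calls.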