I'm looking for a solution for parsing the Varnish
log file. It looks like:
178.232.38.87 - - [23/May/2012:14:01:05 +0200] "GET http://s
I'd build the regular expression from chunks, each matching one field according to its possible or expected values.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// h holds one line of the log file
String rexa = "(\\d+(?:\\.\\d+){3})";  // an IPv4 address
String rexs = "(\\S+)";                 // a single token (no spaces)
String rexdt = "\\[([^\\]]+)\\]";       // something between [ and ]
String rexstr = "\"([^\"]*?)\"";        // a quoted string
String rexi = "(\\d+)";                 // unsigned integer
// join the chunks with single spaces, in field order
String rex = String.join( " ", rexa, rexs, rexs, rexdt, rexstr,
                          rexi, rexi, rexstr, rexstr );
Pattern pat = Pattern.compile( rex );
Matcher mat = pat.matcher( h );
if( mat.matches() ){
    // group 0 is the whole line, so the fields start at group 1
    for( int ig = 1; ig <= mat.groupCount(); ig++ ){
        System.out.println( mat.group( ig ) );
    }
}
Of course, you can also make do with rexs in place of rexa or rexi if you don't need to validate those fields.
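To see what that trade-off looks like in practice, here is a minimal self-contained sketch that compiles both a strict and a loose variant of the pattern. The class name `VarnishLogParser`, the `parse` helper and the sample line are all made up for illustration. In particular, the sample assumes the size field can appear as `-`, which is an assumption about your logs, not something taken from the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class VarnishLogParser {
    static final String IP = "(\\d+(?:\\.\\d+){3})";    // IPv4 address
    static final String TOKEN = "(\\S+)";                // single token, no spaces
    static final String BRACKETED = "\\[([^\\]]+)\\]";   // text between [ and ]
    static final String QUOTED = "\"([^\"]*)\"";         // quoted string
    static final String UINT = "(\\d+)";                 // unsigned integer

    // Strict: typed chunks for the address, status and size fields.
    static final Pattern STRICT = Pattern.compile(String.join(" ",
            IP, TOKEN, TOKEN, BRACKETED, QUOTED, UINT, UINT, QUOTED, QUOTED));

    // Loose: a generic token wherever the strict pattern uses IP or UINT.
    static final Pattern LOOSE = Pattern.compile(String.join(" ",
            TOKEN, TOKEN, TOKEN, BRACKETED, QUOTED, TOKEN, TOKEN, QUOTED, QUOTED));

    // Returns the captured fields, or an empty list if the line does not match.
    static List<String> parse(Pattern pattern, String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                fields.add(m.group(i));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample line; the size field is "-" (assumption).
        String line = "192.0.2.1 - - [23/May/2012:14:01:05 +0200] "
                + "\"GET http://example.com/ HTTP/1.1\" 200 - "
                + "\"-\" \"ExampleAgent/1.0\"";
        System.out.println(parse(STRICT, line)); // strict pattern rejects the line
        System.out.println(parse(LOOSE, line));  // loose pattern yields all fields
    }
}
```

The strict pattern catches malformed lines early but rejects anything it did not anticipate. The loose one accepts more and leaves validation of each field to whatever code consumes it.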