I'm looking for a way to centralise the logging concerns of distributed software (written in Java), which would be quite easy, since the system in question has only one server.
There's a ready-to-use solution from Facebook, Scribe, which uses Apache Hadoop under the hood. However, most companies I'm aware of still tend to develop in-house systems for this. I worked in one such company and dealt with logs there about two years ago. We also used Hadoop. In our case we had the following setup:
We had a small and fixed set of reports that we were interested in. In the rare cases when we wanted to perform a different kind of analysis, we would simply add specialized reducer code for it and, optionally, run it against the old logs.
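For illustration, here is a minimal sketch of such a reducer, assuming a mapper that emits one count per log line keyed by hour and log level; the key format and class name are made up for the example, not our original schema:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the per-line counts emitted by the mapper, e.g. ("2014-01-15T10 ERROR", 1),
// producing one total per hour/level pair for the report.
public class LogLevelCountReducer
        extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
            throws IOException, InterruptedException {
        long total = 0;
        for (LongWritable count : counts) {
            total += count.get();
        }
        context.write(key, new LongWritable(total));
    }
}
```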
If you can't decide in advance what kinds of analysis you will be interested in, it's better to store the structured data prepared by the workers in HBase or some other NoSQL database (here, for example, people use MongoDB). That way you won't need to re-aggregate data from the raw logs and can query the datastore instead.
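As a rough sketch of that approach (using the classic 0.9x-era HBase client API), a worker could write one structured row per log event; the table name, column family, and row-key scheme below are assumptions made up for the example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class LogEventWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Hypothetical table "log_events" with a single column family "d".
        HTable table = new HTable(conf, "log_events");

        // Row key: host plus reversed timestamp, so a host's newest events sort first.
        byte[] rowKey = Bytes.toBytes("app01-" + (Long.MAX_VALUE - System.currentTimeMillis()));

        Put put = new Put(rowKey);
        put.add(Bytes.toBytes("d"), Bytes.toBytes("level"), Bytes.toBytes("ERROR"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("logger"), Bytes.toBytes("com.example.Foo"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("message"), Bytes.toBytes("Connection refused"));
        table.put(put);
        table.close();
    }
}
```

With a layout like this, pulling the recent events for one host becomes a simple row-key range scan instead of a MapReduce job over the raw logs.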
There are a number of good articles about such log-aggregation solutions, for example on using Pig to query the aggregated data. Pig lets you run SQL-like queries over large Hadoop-based datasets.
NXLog or Logstash or Graylog2
or
Logstash + Elasticsearch (+ optionally Kibana)
Example:
1) http://logstash.net/docs/1.3.3/tutorials/getting-started-simple
2) http://logstash.net/docs/1.3.3/tutorials/getting-started-centralized
You can use Log4j with the SocketAppender; in that case you have to write the server part yourself to receive and process the LoggingEvents. See http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/net/SocketAppender.html
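A minimal client-side sketch, assuming the collecting server runs on a host called log-server.example.com and listens on log4j's default port 4560 (both are placeholders):

```java
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.net.SocketAppender;

public class CentralizedLoggingClient {
    public static void main(String[] args) {
        // SocketAppender serializes LoggingEvents and ships them to the remote host;
        // no layout is needed because formatting happens on the receiving side.
        SocketAppender appender = new SocketAppender("log-server.example.com", 4560);
        appender.setReconnectionDelay(10000); // retry every 10 s if the server is unreachable

        Logger root = Logger.getRootLogger();
        root.setLevel(Level.INFO);
        root.addAppender(appender);

        Logger.getLogger(CentralizedLoggingClient.class).info("Hello from a remote node");
    }
}
```

For a quick start on the receiving side, log4j 1.2 also ships org.apache.log4j.net.SimpleSocketServer, which reads the incoming LoggingEvents and logs them according to a local configuration file; you only need a custom server if you want to do more than that.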
Have a look at logFaces; it looks like it meets your requirements. http://www.moonlit-software.com/