I have a set of dockerized applications scattered across multiple servers and trying to setup production-level centralized logging with ELK. I\'m ok with the ELK part itself, bu
Using filebeat you can just pipe docker logs
output as you've described. Behavior you are seeing definitely sounds like a bug, but can also be the partial line read configuration hitting you (resend partial lines until newline symbol is found).
A problem I see with piping is possible back pressure in case no logstash is available. If filebeat can not send any events, it will buffer up events internally and at some point stop reading from stdin. No idea how/if docker protects from stdout becoming unresponsive. Another problem with piping might be restart behavior of filebeat + docker if you are using docker-compose. docker-compose by default reuses images + image state. So when you restart, you will ship all old logs again (given the underlying log file has not been rotated yet).
Instead of piping you can try to read the log files written by docker to the host system. The default docker log driver is the json log driver . You can and should configure the json log driver to do log-rotation + keep some old files (for buffering up on disk). See max-size and max-file options. The json driver puts one line of 'json' data for every line to be logged. On the docker host system the log files are written to /var/lib/docker/containers/container_id/container_id-json.log . These files will be forwarded by filebeat to logstash. If logstash or network becomes unavailable or filebeat is restarted, it continues forwarding log lines where it left of (given files have been not deleted due to log rotation). No events will be lost. In logstash you can use the json_lines codec or filter to parse the json lines and a grok filter to gain some more information from your logs.
There has been some discussion about using libbeat (used by filebeat for shipping log files) to add a new log driver to docker. Maybe it is possible to collect logs via dockerbeat in the future by using the docker logs api (I'm not aware of any plans about utilising the logs api, though).
Using syslog is also an option. Maybe you can get some syslog relay on your docker host load balancing log events. Or have syslog write log files and use filebeat to forward them. I think rsyslog has at least some failover mode. You can use logstash syslog input plugin and rsyslog to forward logs to logstash with failover support in case the active logstash instance becomes unavailable.