I have a database-driven website serving about 50,000 pages.
I want to track each webpage/record hit. I plan to do this by writing logs and then batch-processing them.
I've done something similar. I log each hit to a separate file, then a batch process grabs the files, puts them into a tar archive, and uploads it to the central log server (in my case, S3 :)).
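Roughly, the batch step looks something like this. It's only a sketch, assuming the AWS SDK for PHP and PharData for building the tar; the bucket name and directory paths are placeholders:

```php
<?php
// Sketch of the batch job: bundle accumulated per-hit log files into a
// tar archive and ship it to S3. Assumes the AWS SDK for PHP
// (aws/aws-sdk-php) installed via Composer; paths and bucket are made up.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$logDir  = '/var/spool/hitlogs';          // directory the web tier writes into
$archive = '/tmp/hits-' . date('YmdHis') . '.tar';

// PharData can build a tar archive from a directory in one call.
$tar = new PharData($archive);
$tar->buildFromDirectory($logDir);

$s3 = new S3Client([
    'region'  => 'us-east-1',
    'version' => 'latest',
]);

$s3->putObject([
    'Bucket'     => 'my-log-bucket',      // hypothetical bucket name
    'Key'        => basename($archive),
    'SourceFile' => $archive,
]);

// Once the upload succeeds, the per-hit files and the local tar can go.
foreach (glob($logDir . '/*.json') as $file) {
    unlink($file);
}
unlink($archive);
```

I run this from cron, so the web tier never does anything heavier than writing a small file per request.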
I generate a random file name for each log entry. I do this to avoid locking files for rotation, and it makes archiving and deleting really easy.
I use JSON as my log format instead of the typical whitespace-delimited log files. It's easier to parse and lets me add fields later, and it also makes writing one entry per file simpler than appending multiple records to the same file.
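The per-hit write itself is tiny. Here's a minimal sketch of the idea, with placeholder field names and paths; uniqid() keeps the file names effectively unique, so concurrent requests never contend for the same file:

```php
<?php
// Minimal sketch of the per-hit write: one JSON record per randomly
// named file, so nothing ever needs locking or rotating.
// Field names and the log directory are just examples.
$logDir = '/var/spool/hitlogs';

$entry = [
    'time'    => date('c'),
    'page_id' => $pageId ?? null,                      // whatever record was served
    'ip'      => $_SERVER['REMOTE_ADDR'] ?? '',
    'ua'      => $_SERVER['HTTP_USER_AGENT'] ?? '',
];

// uniqid() with more_entropy makes the name effectively unique per request.
$file = $logDir . '/' . uniqid('hit_', true) . '.json';

file_put_contents($file, json_encode($entry) . "\n");
```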
I've also used log4php + syslog-ng to centralize logging in real time. I have log4php log to syslog, which then forwards the logs to my central server. This is really useful on larger clusters. One caveat is that syslog messages have a length limit, so longer messages risk being truncated.
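For illustration, here's a rough sketch of the syslog route using PHP's built-in functions (log4php's syslog appender does roughly the same thing through configuration); the identifier and facility are arbitrary, and syslog-ng on the box handles forwarding to the central server:

```php
<?php
// Sketch of logging a hit via the local syslog daemon, which syslog-ng
// then forwards to the central log server. Identifier and facility are
// arbitrary choices for this example.
openlog('webapp', LOG_PID, LOG_LOCAL0);

$entry = json_encode([
    'time'    => date('c'),
    'page_id' => 12345,                    // example value
]);

// Keep the payload short: traditional syslog implementations truncate
// messages around 1 KB, so long entries can be cut off in transit.
syslog(LOG_INFO, $entry);

closelog();
```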