I have a couple of Apache log files that have been appended together and I need to sort them by date. They're in the following format:
"www.company.com" 19
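#!/bin/sh
# Sort an Apache common-log-format file by its timestamp (field 4).
if [ $# -ne 2 ] || [ ! -f "$1" ]; then
    echo "Usage: $0 input_file output_file" >&2
    exit 1
fi
echo "Sorting $1"
# Keys in order: year, month name (M), day, hour, minute, second.
sort -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n "$1" > "$2"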
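To see what those character offsets pick out, here is a small sketch that runs the same keys over a few made-up lines in the common log format. The sample data and the function names sample_log and sort_by_date are my own, not from the question, and the M key is a GNU/BSD sort extension, so this assumes one of those. LC_ALL=C is set so the month abbreviations are read as English ones.

```shell
#!/bin/sh
# Made-up sample lines in Common Log Format (illustration only).
sample_log() {
cat <<'EOF'
127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /b HTTP/1.0" 200 2326
127.0.0.1 - - [02/Jan/2001:08:00:00 -0700] "GET /d HTTP/1.0" 200 10
127.0.0.1 - - [10/Oct/2000:13:55:04 -0700] "GET /a HTTP/1.0" 200 512
127.0.0.1 - - [09/Nov/2000:23:59:59 -0700] "GET /c HTTP/1.0" 200 99
EOF
}

# Field 4 is "[10/Oct/2000:13:55:36" when splitting on single spaces:
#   chars 2-3 day, 5-7 month, 9-12 year, 14-15 hour, 17-18 minute, 20-21 second
sort_by_date() {
  LC_ALL=C sort -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n \
      -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n
}

# Prints the lines oldest first: /a, /b, /c, /d
sample_log | sort_by_date
```

The two 10/Oct/2000 lines only differ in the seconds, which is why the last key is still needed.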
I figured this out with online examples, skimming through 'The Linux Command Line' book, man pages, and trial-and-error:
sort -k 3.9nb -k 3.5Mb -k 3.2nb [location and name of file]
The n and M comparisons only read as far as the text still looks like a number or a month name, so sort stops on its own at characters such as / and : and no end position is needed. That makes life easier when the space is already used as the delimiter and the timestamp is further split by :, /, or any other character. The b makes sort skip the blanks in front of the field, so the character offsets count from the first non-blank character.
The above command will sort by year first, then by month, and then by day of the month. Place an r next to each b to sort in descending order.
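A quick sketch of the ascending and descending variants, in case it helps. The sample lines are invented and simply assume the timestamp is the third blank-separated field. The function names are mine, and LC_ALL=C is there so M reads English month abbreviations (M itself is a GNU/BSD sort extension).

```shell
#!/bin/sh
# Invented sample lines with the timestamp in field 3 (an assumption).
sample_log() {
cat <<'EOF'
"www.example.com" 10.0.0.1 [10/Oct/2000:13:55:36 -0700] "GET /b HTTP/1.0" 200
"www.example.com" 10.0.0.1 [02/Jan/2001:08:00:00 -0700] "GET /d HTTP/1.0" 200
"www.example.com" 10.0.0.1 [09/Nov/2000:23:59:59 -0700] "GET /c HTTP/1.0" 200
"www.example.com" 10.0.0.1 [03/Oct/2000:01:02:03 -0700] "GET /a HTTP/1.0" 200
EOF
}

# Year, then month, then day; oldest first.
sort_ascending() {
  LC_ALL=C sort -k 3.9nb -k 3.5Mb -k 3.2nb
}

# Same keys with r added to each one; newest first.
sort_descending() {
  LC_ALL=C sort -k 3.9nbr -k 3.5Mbr -k 3.2nbr
}

sample_log | sort_ascending    # /a, /b, /c, /d
echo "---"
sample_log | sort_descending   # /d, /c, /b, /a
```

Note that these keys stop at the day, so lines from the same day are not ordered by time of day.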
This is almost too trivial to point out, but just in case it confuses anyone: grm's answer should technically be using field #3, not 4, to match the questioner's exact log format. That is, it should read:
sort -t ' ' -k 3.9,3.12n -k 3.5,3.7M ...
His answer is correct in every other respect, and can be used as-is for the common log format.
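If you're not sure which field number applies to your own logs, something like the following can tell you. It's only a sketch: timestamp_field is a name I made up, the two sample lines are invented, and it assumes fields are separated by single spaces (awk splits on runs of blanks, which is the same thing for such lines).

```shell
#!/bin/sh
# Print the number of the first field that looks like the start of an
# Apache timestamp, e.g. "[10/Oct/2000:13:55:36".
timestamp_field() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^\[[0-9][0-9]\/[A-Za-z][A-Za-z][A-Za-z]\//) { print i; exit }
  }'
}

# Common log format: prints 4
printf '%s\n' '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326' | timestamp_field

# Host name first, then client address: prints 3
printf '%s\n' '"www.example.com" 10.0.0.1 [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200' | timestamp_field
```

Whatever number it prints is the one to use in place of 3 or 4 in the -k options.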