I would like to sort a file on multiple fields. A sample tab-separated file is:
a 1 1.0
b 2 0.1
c 3 0.3
a 4 0.001
c 5 0.5
a 6 0.01
b 7
The sort manual shows some examples.
In accordance with zseder's comment, this works:
sort -t"<TAB>" -k1,1d -k3,3g
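To make the two keys concrete, here is a rough Python sketch of the same ordering: field 1 as text first, then field 3 as a number. The name sort_rows is made up for this example. Two things are my own assumptions rather than documented sort behaviour: field 1 is compared as a plain string instead of with the dictionary-order rules of the d flag, and a missing third field is placed before any number.

```python
def sort_rows(lines):
    """Sort tab-separated lines by field 1 (text), then field 3 (float)."""
    def key(line):
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        # Assumption: a missing or empty third field sorts before any number.
        if len(fields) > 2 and fields[2]:
            return (name, 1, float(fields[2]))
        return (name, 0, 0.0)
    return sorted(lines, key=key)

# The sample rows from the question, with real tab characters.
sample = ["a\t1\t1.0", "b\t2\t0.1", "c\t3\t0.3", "a\t4\t0.001",
          "c\t5\t0.5", "a\t6\t0.01", "b\t7"]
for line in sort_rows(sample):
    print(line)
```

The tuple key is what gives the "first this field, then that field" behaviour: Python compares the tuples element by element, much as sort walks through its -k options in order.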
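Tab should theoretically also work like this: sort -t"\t". In practice GNU sort typically rejects that with a "multi-character tab" error, because the shell passes the backslash and the t as two literal characters; in bash, sort -t$'\t' passes a real tab instead.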
If none of the above works to delimit by tab, this is an ugly workaround:
TAB=`echo -e "\t"`
sort -t"$TAB"
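If the shell quoting keeps getting in the way, another option is to skip the shell and hand sort a real tab character from Python. This is only a sketch: the helper name sort_with_tab is made up here, the keys are the ones from the command above, and forcing LC_ALL=C is my own assumption to keep the ordering predictable.

```python
import os
import shutil
import subprocess

def sort_with_tab(text, keys=("-k1,1d", "-k3,3g")):
    """Pipe text through the external sort command, using TAB as delimiter."""
    # "\t" is a real tab character created by Python; passing argv as a
    # list bypasses the shell, so no shell quoting is involved.
    cmd = ["sort", "-t", "\t", *keys]
    # Assumption: LC_ALL=C, added here to keep the ordering predictable.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(cmd, input=text, capture_output=True,
                            text=True, check=True, env=env)
    return result.stdout

if shutil.which("sort"):  # only run the demo where sort is installed
    print(sort_with_tab("b\t2\t0.1\na\t1\t1.0\na\t4\t0.001\n"), end="")
```

This needs Python 3.7 or later for capture_output and text.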
Here is a Python script that you might use as a starting point:
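#!/usr/bin/env python3
import string
import sys

# Translation table that deletes ASCII punctuation.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def main():
    fname = sys.argv[1]
    data = []
    with open(fname, "rt") as stream:
        for line in stream:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Expects exactly three whitespace-separated fields per line.
            a, b, c = line.split()
            data.append((a, int(b), float(c)))
    data.sort(key=my_key)
    print(data)

def my_key(item):
    # Sort by the numeric third field first, then by the first field.
    a, b, c = item
    return c, lexicographical_key(a)

def lexicographical_key(a):
    # poor man's attempt, should use Unicode classification etc.
    return a.translate(_STRIP_PUNCT)

if __name__ == "__main__":
    main()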
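The comment in lexicographical_key mentions Unicode classification. One possible reading of that, sketched below, is to drop every character whose Unicode general category is a punctuation class. The function name strip_punctuation is made up for this example. It is also not a drop-in replacement: string.punctuation contains characters such as $ and +, which Unicode classifies as symbols rather than punctuation, so this version keeps them.

```python
import unicodedata

def strip_punctuation(text):
    """Drop every character whose Unicode category starts with 'P'."""
    return "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))

print(strip_punctuation("¿a-b?"))  # prints "ab"
```

Whether symbols should be stripped as well depends on what ordering you want; adding a check for categories starting with "S" would cover them.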