Question
I have some output like this from ls -alth:
drwxr-xr-x 5 root admin 170B Aug 3 2016 ..
drwxr-xr-x 5 root admin 70B Aug 3 2016 ..
drwxr-xr-x 5 root admin 3B Aug 3 2016 ..
drwxr-xr-x 5 root admin 9M Aug 3 2016 ..
Now, I want to parse out the 170B part, which is obviously the size in human-readable format. I wanted to do this using cut or sed, because I don't want to use tools that are any more complicated or difficult to use than necessary.
Ideally I want it to be robust enough to handle the B, M or K suffix that comes with the size, and multiply by 1, 1000000 and 1000 respectively. I haven't found a good way to do that, though.
I've tried a few things without really knowing the best approach:
ls -alth | cut -f 5 -d \s+
I was hoping that would work because I'd be able to just delimit it on one or more spaces.
But that doesn't work. How do I supply cut with a regex delimiter? Or is there an easier way to extract only the size of the file from ls -alth?
I'm using CentOS 6.4.
Answer 1:
This answer tackles the question as asked, but consider George Vasiliou's helpful find solution as a potentially superior alternative.
cut only supports a single, literal character as the delimiter (-d), so it isn't the right tool to use.
For extracting tokens (fields) that are separated by a variable amount of whitespace per line, awk is the best tool, so the solution proposed by George Vasiliou is the simplest one:
ls -alth | awk '{print $5}'
extracts the 5th whitespace-separated field ($5), which is the size.
Rather than use -h first and then reconvert the human-readable suffixes (such as B, M, and G) back to mere byte counts (incidentally, the multipliers must be powers of 1024, not 1000), simply omit -h from the ls command, which outputs the raw byte counts by default:
ls -alt | awk '{print $5}'
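If only the human-readable output is available (saved output, for instance), the suffix conversion can still be done in awk. The following is a minimal sketch, not part of the original answer: to_bytes is a hypothetical helper name, and it assumes the suffixes are powers of 1024 and that the size is the 5th field.

```shell
# to_bytes: hypothetical helper that converts sizes such as 170B, 9M or 1.5K
# (one per line) to byte counts. Assumes suffixes are powers of 1024.
to_bytes() {
  awk '{
    size = $1
    unit = toupper(substr(size, length(size), 1))   # last character of the field
    mult = 1                                        # B or no suffix
    if (unit == "K") mult = 1024
    else if (unit == "M") mult = 1024 ^ 2
    else if (unit == "G") mult = 1024 ^ 3
    else if (unit == "T") mult = 1024 ^ 4
    # size + 0 makes awk read the leading number and ignore the suffix
    printf "%.0f\n", (size + 0) * mult
  }'
}

# Usage: pull the 5th field out of ls -l style lines, then convert it.
printf '%s\n' 'drwxr-xr-x 5 root admin 170B Aug 3 2016 ..' \
              'drwxr-xr-x 5 root admin   9M Aug 3 2016 ..' |
  awk '{print $5}' | to_bytes
# prints 170 and 9437184
```

Because -h rounds the displayed value, the result is only an approximation of the real size; omitting -h remains the more accurate route.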
Answer 2:
As an alternative to the awk solution, which treats whitespace correctly, one can also use the find utility, which can provide results similar to ls.
In fact, find can display the size of each result directly, without the need for any other tool or pipe such as cut or awk.
So, to list mere bytes you can use:
$ find . -maxdepth 1 -printf %s\\n
173
3
684
You can combine filename + bytes in find with
$ find . -maxdepth 1 -printf %f-%s\\n
bsd.txt-173
file4-3
shellcolors.sh-684
You can consult man find to see the many options available under -printf.
Moreover, by removing the -maxdepth option you can also get a listing of all the files in the subdirectories.
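As a further illustration of -printf, the size and name can be printed as tab-separated columns and sorted numerically. This is a sketch that assumes GNU find (which CentOS ships); list_sizes is a hypothetical helper name, and the -type f filter is an addition that restricts the listing to regular files.

```shell
# list_sizes: hypothetical helper; prints "<bytes><TAB><name>" for each regular
# file directly inside the given directory, smallest first.
# Requires GNU find for -printf.
list_sizes() {
  find "${1:-.}" -maxdepth 1 -type f -printf '%s\t%f\n' | sort -n
}

# Usage: list the current directory.
list_sizes .
```

A tab is used as the separator so that file names containing spaces or hyphens remain unambiguous when the output is parsed further.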
One more alternative is the du utility, which can report sizes in human-readable format:
$ du -a -b -h -d1
1.9M ./appsfiles
173 ./bsd.txt
3 ./file4
684 ./shellcolors.sh
-a : all files and directories. Remove this option to get only directory sizes.
-b : reports the real size of each file. Removing this option reports the disk space occupied by the file (i.e. a file of 3 kB occupies 4K in reality).
-h : human-readable size
-d1 : depth 1
You can further parse the results of du with |cut -f1 (du separates the size from the path with a tab, which is cut's default delimiter) or with |awk '{print $1}'
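A short sketch of that last step, assuming GNU du: since du puts a tab between the size and the path, cut needs no -d option at all. sizes_only is a hypothetical helper name used only for illustration.

```shell
# sizes_only: hypothetical helper; keeps just the size column of du output.
# du separates size and path with a TAB, which is cut's default delimiter.
sizes_only() {
  cut -f1
}

# Usage: raw byte counts for everything one level below the current directory.
du -a -b -d1 . | sizes_only
```

Here -h is left out so that the column holds plain byte counts that can be fed to other tools.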
Answer 3:
I was getting annoyed with having to look up awk(ward) syntax and wrote my own:
https://www.npmjs.com/package/cutr
Install with
npm i -g cutr
ls --full-time | cutr -d ' +' -f 6-
or run with something like
ls --full-time | npx cutr -d ' +' -f 6-
Your command could be
ls -alth | cutr -f 5 -d '\s+'
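Where installing a Node package is not an option, the space-delimited case can be approximated with standard tools. This is a sketch of a different technique from cutr, not a regex delimiter: tr -s squeezes runs of spaces into one so that plain cut can split on a single space. field5 is a hypothetical helper name.

```shell
# field5: hypothetical helper; squeezes repeated spaces into one, then lets
# cut split on a single literal space and print the 5th field.
field5() {
  tr -s ' ' | cut -d ' ' -f 5
}

# Usage with a sample ls -l style line that has uneven spacing.
printf '%s\n' 'drwxr-xr-x   5 root  admin   170B Aug  3  2016 ..' | field5
# prints 170B
```

Lines that start with spaces would produce an empty first field and shift the numbering, which awk's default field splitting does not suffer from.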
Source: https://stackoverflow.com/questions/43312360/how-to-use-regex-with-cut-at-the-command-line