Linux: quickly creating a formatted output file (CSV) from the find command

Question

I have several devices whose contents I want to collect in a list (CSV) and load into a MySQL database. I started with one device, aiming to create a new formatted output file from an input file (infile) that was created with 'find'. The device is mounted at /mnt/sda4, and I have skipped all entries containing '.cache'. I have also already cut off the leading /mnt/sda4/:

find /mnt/sda4 | grep -v '\.cache' | cut -d'/' -f4- > infile

where infile looks like this:

Extern-500GB-btrfs/root/usr/lib64/libreoffice/share/config/soffice.cfg/dbaccess/ui/mysqlnativesettings.ui

Extern-500GB-btrfs/root/usr/lib64/libreoffice/share/config/soffice.cfg/dbaccess/ui/namematchingpage.ui

...

This part is very fast.

real    0m1,432s
user    0m1,079s
sys     0m0,873s

Now I have two solutions, both (very) slow. I would like a new output list in which every input line is turned into "06;" followed by the basename, then ";/" followed by the whole input line, like this:

06;mysqlnativesettings.ui;/Extern-500GB-btrfs/root/usr/lib64/libreoffice/share/config/soffice.cfg/dbaccess/ui/mysqlnativesettings.ui

06;namematchingpage.ui;/Extern-500GB-btrfs/root/usr/lib64/libreoffice/share/config/soffice.cfg/dbaccess/ui/namematchingpage.ui
...

time while read p; do bn=$(basename "$p"); echo "06;""$bn"";/""$p" >> outfile.csv; done < infile

The time needed for this is:

real    27m44,937s
user    10m4,539s
sys     18m6,491s

I made another attempt with a single command line that combines find and the formatting:

time find /mnt/sda4/ | while read p; do g=$(echo $p | grep -c -v '\.cache'); case "$g" in 1) echo "06;$(basename "$p")"';/'$(cut -d'/' -f4- <<<"$p") >>outfile.csv;; *) : ;; esac; done

I forgot to note the time for this one, but it also took a long time.

So my question is: is there a (much) faster way to create the second table, perhaps directly while running find?

Thank you in advance,

-Linuxfluesterer

Answer 1:

I suspect the problem is the loop, which spawns external commands and reopens the output file for every single line. Have you considered using awk? I think the following should do all you need and be reasonably quick, although I obviously don't have your directory structure to test with.

time find /mnt/sda4/ | awk 'BEGIN{FS=OFS="/"}!/\.cache/ {$2=$3=""; new=sprintf("%s",$0);gsub(/^\/\/\//,"",new); printf "06;%s;/%s\n",$NF,new }' > outfile.csv
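
Two other approaches may be worth trying; both are untested sketches rather than measured solutions. The first keeps the loop from the question but uses only shell builtins, so no external process is started per line and the output file is opened once by the caller. The second assumes GNU find, whose -printf action can emit the basename (%f) and the path relative to the starting point (%P) directly. The function names format_paths and find_formatted are made up for this illustration, and -mindepth 1 (which skips the starting directory itself) is an assumption about what is wanted.

```shell
#!/usr/bin/env bash
# Sketch only: format_paths and find_formatted are hypothetical helper names.

# Variant 1: keep the loop, but use shell builtins only.
# Reads one path per line on stdin and writes "<id>;<basename>;/<path>".
format_paths() {
    local id=$1 p
    while IFS= read -r p; do
        # ${p##*/} removes everything up to the last "/". For paths without
        # a trailing slash this matches basename, but nothing is forked.
        printf '%s;%s;/%s\n' "$id" "${p##*/}" "$p"
    done
}

# Variant 2: let GNU find do the formatting itself (requires -printf).
# %f is the file's basename, %P its path relative to the starting point.
# Note: -path tests the whole path, including the starting point.
find_formatted() {
    local root=$1 id=$2
    find "$root" -mindepth 1 ! -path '*.cache*' -printf "$id;%f;/%P\n"
}
```

Possible usage, with the file names from the question: format_paths 06 < infile > outfile.csv, or find_formatted /mnt/sda4 06 > outfile.csv. In both cases the redirection applies to the whole command, so outfile.csv is not reopened for every line.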

Source: https://stackoverflow.com/questions/64125084/linux-fast-creating-of-formatted-output-file-csv-from-find-command

Tags

Linux

format

find