I have a library of large images (8000x6000px, ~13MB) for which I would like to generate multiple thumbnails of smaller sizes with widths of 3000px, 2000px, 1000px, 500px, 250px and 100px.
I made some test images and ran some timings so you can see the effect of the various techniques on performance.
I made the images to contain random, difficult-to-compress data at dimensions and filesizes to match yours, i.e.
convert -size 8000x6000 xc:gray +noise random -quality 35 image.jpg
Then ls shows it is 13MB, like this:
-rw-r--r-- 1 mark staff 13M 23 Aug 17:55 image.jpg
I made 128 such random images because that is nicely divisible by the 8 CPU cores on my machine - see parallel tests later.
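In case you want to reproduce the test set, a loop along these lines will churn out 128 such images - the zero-padded naming via seq -w is just my sketch, not necessarily how the originals were made:
#!/bin/bash
# Create image001.jpg ... image128.jpg, each 8000x6000 of random noise saved at quality 35
for i in $(seq -w 1 128); do
   convert -size 8000x6000 xc:gray +noise random -quality 35 image${i}.jpg
done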
Now for the methods...
Method 1
This is the naive method - you generate each of the output files one after the other, with a separate convert per output, so every input image is read six times.
#!/bin/bash
# Naive approach: a separate convert invocation for every output file
for f in image*jpg; do
   for w in 3000 2000 1000 500 250 100; do
      convert "$f" -resize ${w}x "res_${f}_${w}.jpg"
   done
done
Time: 26 mins 46 secs
Method 2
Here we read each input image only once but generate all the output sizes from that single read, which is considerably faster.
#!/bin/bash
# Read each input once and write all six sizes from the same pipeline
for f in image*jpg; do
   convert "$f" -resize 3000x -write "res_${f}_3000.jpg" \
      -resize 2000x -write "res_${f}_2000.jpg" \
      -resize 1000x -write "res_${f}_1000.jpg" \
      -resize 500x -write "res_${f}_500.jpg" \
      -resize 250x -write "res_${f}_250.jpg" \
      -resize 100x "res_${f}_100.jpg"
done
Time: 6 mins 17 secs
Method 3
Here we advise ImageMagick up-front that the largest image we are going to need is only 3000x2250 pixels, so it can use less memory, read in fewer DCT levels and do less I/O. This is called "shrink-on-load".
#!/bin/bash
# As Method 2, but tell the JPEG decoder up-front that nothing larger
# than 3000x2250 is needed, so it can shrink-on-load
for f in image*jpg; do
   convert -define jpeg:size=3000x2250 "$f" \
      -resize 3000x -write "res_${f}_3000.jpg" \
      -resize 2000x -write "res_${f}_2000.jpg" \
      -resize 1000x -write "res_${f}_1000.jpg" \
      -resize 500x -write "res_${f}_500.jpg" \
      -resize 250x -write "res_${f}_250.jpg" \
      -resize 100x "res_${f}_100.jpg"
done
Time: 3 mins 37 secs
Just as an aside, to demonstrate the reduced time, I/O and memory needed when you tell ImageMagick up-front how big an image you are going to need, compare these two commands, both reading one of your 8000x6000, 13MB images and both generating the same thumbnail:
/usr/bin/time -l convert image.jpg -resize 500x result.jpg 2>&1 | egrep "resident|real"
1.92 real 1.77 user 0.14 sys
415727616 maximum resident set size
i.e. 415 MB and 2 seconds
/usr/bin/time -l convert -define jpeg:size=500x500 image.jpg -resize 500x result.jpg 2>&1 | egrep "resident|real"
0.24 real 0.23 user 0.01 sys
23592960 maximum resident set size
i.e. 23 MB and 0.2 seconds - and the output image has the same contents and quality.
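If you want to check that claim yourself, ImageMagick's compare can report a difference metric between the two results - something like this, where the two filenames stand in for the outputs of the commands above saved under different names:
# RMSE near zero (printed on stderr) means the two thumbnails are effectively identical
compare -metric RMSE thumb_full_decode.jpg thumb_shrink_on_load.jpg null: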
Method 4
Here we go all-out and use GNU Parallel, as well as all the foregoing techniques, to send your CPUs, fans and power consumption crazy!!!
#!/bin/bash
# Print one complete convert command per image and hand the lot to GNU Parallel
for f in image*jpg; do
   cat <<EOF
convert -define jpeg:size=3000x2250 "$f" \
   -resize 3000x -write "res_${f}_3000.jpg" \
   -resize 2000x -write "res_${f}_2000.jpg" \
   -resize 1000x -write "res_${f}_1000.jpg" \
   -resize 500x -write "res_${f}_500.jpg" \
   -resize 250x -write "res_${f}_250.jpg" \
   -resize 100x "res_${f}_100.jpg"
EOF
done | parallel
Time: 56 seconds
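As a variation, GNU Parallel can do the filename substitution itself via its {} replacement string, which avoids the heredoc; -j caps the number of simultaneous jobs and --eta prints a progress estimate. A sketch using the same output naming as above - the -j 8 value is just an example for an 8-core machine:
#!/bin/bash
# Same work as Method 4, but letting parallel substitute each filename for {}
parallel -j 8 --eta '
   convert -define jpeg:size=3000x2250 {} \
      -resize 3000x -write res_{}_3000.jpg \
      -resize 2000x -write res_{}_2000.jpg \
      -resize 1000x -write res_{}_1000.jpg \
      -resize 500x -write res_{}_500.jpg \
      -resize 250x -write res_{}_250.jpg \
      -resize 100x res_{}_100.jpg' ::: image*jpg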
In summary, we can reduce the processing time from 27 minutes to 56 seconds by reading each input image only once and generating as many outputs per read as possible, by telling ImageMagick up-front how much of the input image it actually needs, and by using GNU Parallel to keep all your lovely CPU cores busy. HTH.