I have a large collection of documents scanned into PDF format, and I wish to write a shell script that will convert each document to DjVu format. Some documents were scanned a
pdfimages has a -list option that gives the height and width in pixels, and also the y-ppi and x-ppi.
pdfimages -list tmp.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    3300  2550  gray    1   1  ccitt  no       477  0   389   232  172K  17%
   2     1 image    3300  2550  gray    1   1  ccitt  no         3  0   389   232  103K  10%
   3     2 image    3300  2550  gray    1   1  ccitt  no         7  0   389   232  236K  23%
   4     3 image    3300  2550  gray    1   1  ccitt  no        11  0   389   232  210K  20%
   5     4 image    3300  2550  gray    1   1  ccitt  no        15  0   389   232  250K  24%
   6     5 image    3300  2550  gray    1   1  ccitt  no        19  0   389   232  199K  19%
   7     6 image    3300  2550  gray    1   1  ccitt  no        23  0   389   232  503K  49%
   8     7 image    3300  2550  gray    1   1  ccitt  no        27  0   389   232  154K  15%
   9     8 image    3300  2550  gray    1   1  ccitt  no        31  0   389   232 21.5K 2.1%
  10     9 image    3300  2550  gray    1   1  ccitt  no        35  0   389   232  286K  28%
  11    10 image    3300  2550  gray    1   1  ccitt  no        39  0   389   232 46.8K 4.6%
  12    11 image    3300  2550  gray    1   1  ccitt  no        43  0   389   232 55.5K 5.4%
  13    12 image    3300  2550  gray    1   1  ccitt  no        47  0   389   232 35.0K 3.4%
  14    13 image    3300  2550  gray    1   1  ccitt  no        51  0   389   232 26.9K 2.6%
  15    14 image    3300  2550  gray    1   1  ccitt  no        55  0   389   232 66.5K 6.5%
  16    15 image    3300  2550  gray    1   1  ccitt  no        59  0   389   232 73.9K 7.2%
  17    16 image    3300  2550  gray    1   1  ccitt  no        63  0   389   232 47.0K 4.6%
  18    17 image    3300  2550  gray    1   1  ccitt  no        67  0   389   232 30.1K 2.9%
  19    18 image    3300  2550  gray    1   1  ccitt  no        71  0   389   232 70.3K 6.8%
  20    19 image    3300  2550  gray    1   1  ccitt  no        75  0   389   232 46.0K 4.5%
  21    20 image    3300  2550  gray    1   1  ccitt  no        79  0   389   232 28.9K 2.8%
  22    21 image    3300  2550  gray    1   1  ccitt  no        83  0   389   232 72.7K 7.1%
  23    22 image    3300  2550  gray    1   1  ccitt  no        87  0   389   232 47.5K 4.6%
  24    23 image    3300  2550  gray    1   1  ccitt  no        91  0   389   232 30.1K 2.9%
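
For use in a script, the resolution columns can be pulled out of this listing with awk. The following is only a minimal sketch: the function name list_ppi is made up for illustration, and it assumes the column layout shown above (two header lines, with x-ppi and y-ppi as the 13th and 14th whitespace-separated fields). It prints each distinct "x-ppi y-ppi" pair once.

```shell
# list_ppi (hypothetical helper name): read `pdfimages -list` output on stdin
# and print each distinct "x-ppi y-ppi" pair once.
# Assumes the column layout shown in the listing above.
list_ppi() {
    # NR > 2 skips the header line and the dashed separator line.
    # Only rows whose type column is "image" are considered.
    awk 'NR > 2 && $3 == "image" { print $13, $14 }' | sort -u
}
```

It would be called as `pdfimages -list tmp.pdf | list_ppi`. For the listing above every row carries the same pair, so a single line with the two values 389 and 232 would be printed; a document whose pages were scanned at different resolutions would produce several lines.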