How can I use VIPS for image normalization?

Question

I want to normalize the exposure and color palettes of a set of images. For context, this is for training a neural net in image classification on medical images. I'm also doing this for hundreds of thousands of images, so efficiency is very important.

So far I've been using VIPS, specifically PyVIPS, and would prefer a solution using that library. After finding this answer and looking through the documentation, I tried

x = pyvips.Image.new_from_file('test.ndpi')
x = x.hist_norm()
x.write_to_file('test_normalized.tiff')

but that seems to always produce a pure-white image.

Answer 1:

You need hist_equal for histogram equalisation. hist_norm is meant for normalising a histogram, not an image.

The main docs are here:

https://libvips.github.io/libvips/API/current/libvips-histogram.html

However, that will be extremely slow for large slide images. It will need to scan the whole slide once to build the histogram, then scan again to equalise it. It would be much faster to find the histogram of a low-res layer, then use that to equalise the high-res one.

For example:

#!/usr/bin/env python3

import sys
import pyvips

# open the slide image and get the number of layers ... we are not fetching 
# pixels, so this is quick
x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))

# find the histogram of the highest-numbered level, which is the lowest
# resolution ... again, this should be quick
x = pyvips.Image.new_from_file(sys.argv[1],
                               level=levels - 1)
hist = x.hist_find()

# from that, compute the transform for histogram equalisation
equalise = hist.hist_cum().hist_norm()

# and use that on the full-res image
x = pyvips.Image.new_from_file(sys.argv[1])

x = x.maplut(equalise)

x.write_to_file(sys.argv[2])
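
To make the lookup-table idea concrete, here is a minimal pure-Python sketch of the same arithmetic on a handful of 8-bit values. It is only an illustration, not how libvips implements these operations: the helper name equalisation_lut and the sample values are made up for this example, and the comments map each step to the pyvips call it is meant to mirror.

```python
def equalisation_lut(pixels, levels=256):
    """Build a histogram-equalisation lookup table from sample pixels."""
    # histogram: count how often each value occurs (the hist_find step)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # cumulative histogram (the hist_cum step)
    cum = []
    total = 0
    for count in hist:
        total += count
        cum.append(total)

    # rescale so the largest cumulative count maps to the top index
    # (the hist_norm step); assumes at least one sample pixel
    scale = (levels - 1) / cum[-1]
    return [round(c * scale) for c in cum]


# hypothetical "low-res" sample with values bunched into a narrow range
sample = [100, 100, 101, 102, 102, 102, 103, 104]
lut = equalisation_lut(sample)

# apply the table to other pixels (the maplut step)
full_res = [100, 102, 104]
equalised = [lut[p] for p in full_res]
print(equalised)  # [64, 191, 255]
```

The narrow 100 to 104 input range ends up spread across most of 0 to 255, which is also why small differences such as noise or compression artifacts get amplified.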

Another factor is that histogram equalisation is non-linear, so it will distort lightness relationships. It can also distort colour relationships and make noise and compression artifacts look crazy. I tried that program on an image I have here:

$ ~/try/equal.py bild.ndpi[level=7] y.jpg

The stripes are from the slide scanner and the ugly fringes from compression.

I think I would instead find image max and min from the low-res level, then use them to do a simple linear stretch of pixel values.

Something like:

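import sys
import pyvips

# read min and max from the lowest-resolution level, which is quick
x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))
x = pyvips.Image.new_from_file(sys.argv[1],
                               level=levels - 1)
mn = x.min()
mx = x.max()

# linearly stretch the full-res image to 0 - 255 and cast back to 8-bit
x = pyvips.Image.new_from_file(sys.argv[1])
x = ((x - mn) * (255 / (mx - mn))).cast("uchar")
x.write_to_file(sys.argv[2])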
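
One thing the stretch does not handle is a completely flat image, where max equals min and the division fails. A minimal sketch of the arithmetic with that case guarded, assuming 8-bit output; the helper name stretch_params and the choice to leave flat images unchanged are assumptions for illustration:

```python
def stretch_params(mn, mx, out_max=255):
    """Return (scale, offset) so that v * scale + offset maps mn..mx to 0..out_max."""
    if mx == mn:
        # flat image: nothing to stretch, so leave values unchanged
        return 1.0, 0.0
    scale = out_max / (mx - mn)
    return scale, -mn * scale


# hypothetical min and max as measured on the low-res level
scale, offset = stretch_params(40, 200)
stretched = [round(v * scale + offset) for v in (40, 80, 200)]
print(stretched)  # [0, 64, 255]
```

In pyvips the same scale and offset can be applied with x.linear(scale, offset), which computes x * scale + offset.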

Did you find the new Region feature in pyvips? It makes generating patches for training MUCH faster, up to 100x faster in some cases:

https://github.com/libvips/pyvips/issues/100#issuecomment-493960943

Source: https://stackoverflow.com/questions/58665477/how-can-i-use-vips-for-image-normalization

Tags

python-3.x

image

image-processing

vips