I am currently diving into wavelets and am a bit confused about certain things.
First of all, this is NOT homework. It's for recreational coding only.
In order to gain a better understanding, I implemented the lifting scheme for the LeGall 5/3 wavelet in C. As far as I can see, it seems to work: I can reverse it, and the original image is reproduced correctly. In pseudo-code, my forward DWT looks like this:
// deinterleave splits the low band from the high band
// (e.g. 1 0 3 0 6 0 8 1 11 becomes 1 3 6 8 11 | 0 0 0 1)
for each row in image:
    dwt1d(row)
    deinterleave(row)
for each col in image:
    dwt1d(col)
    deinterleave(col)
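For reference, the per-row/per-column dwt1d step above could be sketched in C roughly like this, assuming the standard integer LeGall 5/3 lifting steps (predict, then update). The names dwt53_forward/dwt53_inverse and the symmetric border handling are my own choices for illustration, not taken from the question, and n is assumed to be even:

```c
#include <assert.h>

/* Symmetric (mirror) extension at the borders -- one common choice. */
static short at(const short *s, int n, int i)
{
    if (i < 0)  i = -i;
    if (i >= n) i = 2 * (n - 1) - i;
    return s[i];
}

/* One level of the LeGall 5/3 lifting transform, in place.
 * After this, even indices hold the low band and odd indices the
 * high band, still interleaved -- deinterleave is a separate pass.
 * Arithmetic right shift (floor division) is assumed for negatives. */
void dwt53_forward(short *s, int n)
{
    int i;
    /* predict: odd samples become detail coefficients */
    for (i = 1; i < n; i += 2)
        s[i] -= (at(s, n, i - 1) + at(s, n, i + 1)) >> 1;
    /* update: even samples become the low-pass band */
    for (i = 0; i < n; i += 2)
        s[i] += (at(s, n, i - 1) + at(s, n, i + 1) + 2) >> 2;
}

/* Exact inverse: undo the update, then undo the predict. */
void dwt53_inverse(short *s, int n)
{
    int i;
    for (i = 0; i < n; i += 2)
        s[i] -= (at(s, n, i - 1) + at(s, n, i + 1) + 2) >> 2;
    for (i = 1; i < n; i += 2)
        s[i] += (at(s, n, i - 1) + at(s, n, i + 1)) >> 1;
}
```

Because each lifting step only adds or subtracts a function of the other (unmodified) samples, running the inverse steps in reverse order reconstructs the input exactly, which matches the behaviour described above.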
But I struggle with a couple of things.
When applying the DWT, I get back my transformed image, but the values are outside the range [0, 255]. Therefore I store them in shorts. Some are negative, and some are very large. Now how can I display them in order to get those nice-looking images as shown here (http://www.whydomath.org/node/wavlets/images/Largetoplevelwt.gif)? If I show my image in Matlab with imshow(image, []), then my output looks like this: http://i.imgur.com/dNaYwEE.jpg. So, do I have to apply some transformation to my sub-bands? If yes, can someone point me to a solution or tell me what to do?
In the literature, I sometimes see the sub-bands ordered like this: [ LL LH; HL HH ], and sometimes like this: [ LL HL; LH HH ]. The latter I see mostly in papers about JPEG2000, and it is also what my algorithm produces. In Matlab, however, the lwt2 function returns the former layout. I see this also when I compare my output with the output from Matlab: it seems as if LH and HL are swapped. How can that be? Does it matter? Does it have something to do with using lifting instead of convolution?
Does it actually matter if one does rows first and then columns, or vice versa? I see no difference in my output when I switch the order. The only thing that would be different is that LH becomes HL and HL becomes LH. However, that doesn't resolve my second question, because the output is otherwise the same. It's just notation, I guess. So does it matter? I have seen papers that do col-row and others that do row-col, both with respect to JPEG2000.
Thanks a lot. If someone could shed some light on my issues then I'd be very grateful.
Kind Regards, Markus
I wrote a blog about building a WDR image compression system. You can read more here:
http://trueharmoniccolours.co.uk/Blog/
(You'll note that I'm not a very prolific blogger ;) ). It should contain all you need to implement your own C++ version of WDR image compression. If not feel free to fire me a message and ask!
Yes, this is a really under-documented "feature", from what I could work out. The value returned from the DWT is actually a short and spans the range -255 to +255. Of course, -255 is not directly renderable with 8-bit colour. So what people usually do for display is divide the value by 2 and add 128 (don't forget, display is just a tool for debugging). This way 0 becomes 128, and hence a "mid grey" on a greyscale image.
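That halve-and-recentre mapping could look like this in C (the function name coeff_to_grey is illustrative; the clamp is a safety net I added for values outside the expected range):

```c
/* Map a 5/3 coefficient (expected in [-255, 255]) to a displayable
 * 8-bit grey: halve, then re-centre on 128, so 0 becomes mid grey.
 * Purely a visualisation aid for debugging -- never feed these
 * values back into the inverse transform. */
unsigned char coeff_to_grey(short c)
{
    int g = c / 2 + 128;
    if (g < 0)   g = 0;    /* clamp, in case a coefficient */
    if (g > 255) g = 255;  /* falls outside the range      */
    return (unsigned char)g;
}
```

In practice this is typically applied to the detail sub-bands (LH, HL, HH); the LL band already looks like a small copy of the image and is often shown as-is.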
It doesn't really matter provided you do the inverse transform the same way you do the forward transform.
No it should make no difference. When implementing you decide where to write the destination pixel so you are free to write it where you like (to a completely different image for example).
Edit: Regarding your comment, the LeGall 5/3 lifting (predict) equation is as follows:
d = s[n + 1] - ((s[n + 0] + s[n + 2]) / 2);
So a source sequence of 255, 0, 255 results in a d of -255, and starting with 0, 255, 0 gives the maximum of 255. You should therefore definitely be in the range -255 to +255, or there is something wrong with your implementation.
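Factoring that predict step into a small helper makes the two extremes easy to check (predict53 is a hypothetical name, not from the post):

```c
/* Detail coefficient for one odd sample and its two even neighbours,
 * per the 5/3 predict equation d = s[n+1] - (s[n] + s[n+2]) / 2. */
int predict53(int left, int centre, int right)
{
    return centre - (left + right) / 2;
}
/* predict53(255, 0, 255) gives -255; predict53(0, 255, 0) gives 255. */
```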
Source: https://stackoverflow.com/questions/31361563/discrete-wavelet-transform-legal-5-3-with-lifting-negative-values-visualizing