Converting from signed char to unsigned char and back again?

I'm working with JNI and have an array of type jbyte, where jbyte is represented as a signed char, i.e. ranging from -128 to 127. The jbytes represent image pixels. For ima

5 Answers

遥遥无期

2020-11-28 19:50
Do you realize that CLAMP255 returns 0 for v < 0 and 255 for v >= 0?
IMHO, CLAMP255 should be defined as:
```
#define CLAMP255(v) ((v) > 255 ? 255 : ((v) < 0 ? 0 : (v)))
```
Difference: if v is neither greater than 255 nor less than 0, the macro returns v instead of 255.
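
As a hedged alternative to the macro, a small function gives the same clamping while evaluating its argument only once. This is a minimal sketch; the name `clamp255` is hypothetical and does not appear in the question.

```cpp
// Hypothetical helper (not from the question): clamps an int to the
// 0..255 pixel range. Unlike a macro, the argument is evaluated once.
constexpr int clamp255(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```

For instance, `clamp255(-5)` gives 0, `clamp255(300)` gives 255, and `clamp255(100)` is returned unchanged.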

一生所求

2020-11-28 19:59
There are two ways to interpret the input data: either -128 is the lowest value and 127 is the highest (i.e. true signed data), or 0 is the lowest value, 127 is somewhere in the middle, and the next "higher" number is -128, with -1 being the "highest" value (that is, the most significant bit already got misinterpreted as a sign bit in two's complement notation).

Assuming you mean the latter, the formally correct way is
```
signed char in = ...
unsigned char out = (in < 0) ? (in + 256) : in;
```
which at least gcc properly recognizes as a no-op.
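
The no-op claim can be checked by hand: converting to an unsigned type is defined modulo 2^N, so a plain cast should give the same result as the ternary for every input. The sketch below assumes an 8-bit `char`; the function name `ternaryMatchesCast` is made up for illustration.

```cpp
// Hypothetical check: returns true if the ternary form and a plain
// cast produce the same unsigned char for every signed char value.
// Assumes char is 8 bits wide.
inline bool ternaryMatchesCast() {
    for (int i = -128; i <= 127; ++i) {
        signed char in = static_cast<signed char>(i);
        // "in + 256" is computed in int because of integer promotion.
        unsigned char viaTernary =
            static_cast<unsigned char>((in < 0) ? (in + 256) : in);
        unsigned char viaCast = static_cast<unsigned char>(in);
        if (viaTernary != viaCast) {
            return false;
        }
    }
    return true;
}
```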

傲寒

2020-11-28 20:00
I'm not 100% sure that I understand your question, so tell me if I'm wrong.

If I got it right, you are reading jbytes that are technically signed chars, but really pixel values ranging from 0 to 255, and you're wondering how you should handle them without corrupting the values in the process.

Then, you should do the following:
- convert jbytes to unsigned char before doing anything else; this will definitely restore the pixel values you are trying to manipulate
- use a larger signed integer type, such as int, for intermediate calculations, so that overflows and underflows can be detected and dealt with (in particular, not casting to a signed type could force the compiler to promote every operand to an unsigned type, in which case you wouldn't be able to detect underflows later on)
- when assigning back to a jbyte, clamp your value to the 0-255 range, convert to unsigned char and then convert again to signed char. I'm not certain the first conversion is strictly necessary, but you can't go wrong if you do both
For example:
```
inline int fromJByte(jbyte pixel) {
    // cast to unsigned char re-interprets values as 0-255
    // cast to int will make intermediate calculations safer
    return static_cast<int>(static_cast<unsigned char>(pixel));
}

inline jbyte fromInt(int pixel) {
    if(pixel < 0)
        pixel = 0;

    if(pixel > 255)
        pixel = 255;

    return static_cast<jbyte>(static_cast<unsigned char>(pixel));
}

jbyte in = ...
int intermediate = fromJByte(in) + 30;
jbyte out = fromInt(intermediate);
```
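
Applied to a whole buffer, the same two helpers might be used as sketched below. The `jbyte` alias is a stand-in so the example compiles without `<jni.h>` (an assumption based on the question's statement that jbyte is a signed char), and `brighten` is a hypothetical name.

```cpp
#include <cstddef>

// Stand-in for the typedef from <jni.h>; assumed to be signed char.
using jbyte = signed char;

inline int fromJByte(jbyte pixel) {
    // Reinterpret as 0..255, then widen to int for safe arithmetic.
    return static_cast<int>(static_cast<unsigned char>(pixel));
}

inline jbyte fromInt(int pixel) {
    if (pixel < 0) pixel = 0;
    if (pixel > 255) pixel = 255;
    return static_cast<jbyte>(static_cast<unsigned char>(pixel));
}

// Hypothetical helper: adds delta to every pixel, saturating at 0 and 255.
inline void brighten(jbyte* pixels, std::size_t count, int delta) {
    for (std::size_t i = 0; i < count; ++i) {
        pixels[i] = fromInt(fromJByte(pixels[i]) + delta);
    }
}
```

With a delta of 30, a pixel stored as 127 becomes 157 when read back through `fromJByte`, and a pixel stored as -1 (that is, 255) stays at 255 because of the clamp.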

栀梦

2020-11-28 20:07

Yes, this is safe.

The C language uses a feature called integer promotion to increase the number of bits in a value before performing calculations. Therefore your CLAMP255 macro will operate at integer (probably 32-bit) precision. The result is assigned to a jbyte, which reduces the integer precision back to 8 bits to fit into the jbyte.
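
Integer promotion can be observed directly. The following is a minimal sketch with a hypothetical helper name, `addPromoted`. One caveat worth keeping in mind: before C++20, converting an out-of-range int back to a signed 8-bit type is implementation-defined, so clamping first and going through unsigned char is the more cautious route.

```cpp
#include <type_traits>

// The sum of two signed chars has type int, not signed char.
static_assert(
    std::is_same<decltype(static_cast<signed char>(1) +
                          static_cast<signed char>(1)),
                 int>::value,
    "signed char operands are promoted to int");

// Hypothetical helper: the addition happens at int precision, so the
// result does not wrap at 8 bits.
inline int addPromoted(signed char a, signed char b) {
    return a + b;
}
```

Here `addPromoted(100, 100)` yields 200, a value that would not fit in a signed char.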


无人共我

2020-11-28 20:08
This is one of the reasons why C++ introduced the new cast style, which includes static_cast and reinterpret_cast.

There are two things you can mean by conversion from signed to unsigned. You might mean that you wish the unsigned variable to contain the value of the signed variable modulo the maximum value of your unsigned type + 1. That is, if your signed char has a value of -128, then UCHAR_MAX+1 is added for a value of 128, and if it has a value of -1, then UCHAR_MAX+1 is added for a value of 255; this is what static_cast does. On the other hand, you might mean that the bits in the memory referenced by some variable should be interpreted as an unsigned byte, regardless of the signed integer representation used on the system, i.e. bit value 0b10000000 should evaluate to 128, and bit value 0b11111111 to 255; this is accomplished with reinterpret_cast.

Now, for the two's complement representation this happens to be exactly the same thing, since -128 is represented as 0b10000000 and -1 is represented as 0b11111111, and likewise for all values in between. However, other computers (usually older architectures) may use a different signed representation, such as sign-and-magnitude or ones' complement. In ones' complement the 0b10000000 bit value would not be -128, but -127, so a static_cast to unsigned char would make this 129, while a reinterpret_cast would make this 128. Additionally, in ones' complement the 0b11111111 bit value would not be -1, but -0 (yes, this value exists in ones' complement), and would be converted to a value of 0 with a static_cast, but a value of 255 with a reinterpret_cast. Note that in the case of ones' complement the unsigned value of 128 cannot actually be represented in a signed char, since it ranges from -127 to 127, due to the -0 value.

I have to say that the vast majority of computers will be using two's complement making the whole issue moot for just about anywhere your code will ever run. You will likely only ever see systems with anything other than two's complement in very old architectures, think '60s timeframe.

The syntax boils down to the following:
```
signed char x = -100;
unsigned char y;

y = (unsigned char)x;                    // C static
y = *(unsigned char*)(&x);               // C reinterpret
y = static_cast<unsigned char>(x);       // C++ static
y = reinterpret_cast<unsigned char&>(x); // C++ reinterpret
```
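
On a two's complement machine the two kinds of cast should agree for every value, which can be verified with a short loop. This is a sketch assuming an 8-bit `char`; the name `castsAgree` is hypothetical.

```cpp
// Hypothetical check: compares the value conversion (static_cast) with
// the bit reinterpretation (reinterpret_cast) for all signed char values.
// Expected to return true on two's complement systems.
inline bool castsAgree() {
    for (int i = -128; i <= 127; ++i) {
        signed char x = static_cast<signed char>(i);
        unsigned char byValue = static_cast<unsigned char>(x);
        unsigned char byBits = reinterpret_cast<unsigned char&>(x);
        if (byValue != byBits) {
            return false;
        }
    }
    return true;
}
```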
To do this in a nice C++ way with arrays:
```
jbyte memory_buffer[nr_pixels];
unsigned char* pixels = reinterpret_cast<unsigned char*>(memory_buffer);
```
or the C way:
```
unsigned char* pixels = (unsigned char*)memory_buffer;
```
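
Once the buffer is viewed through an unsigned char pointer, pixels can be modified in place and the changes show up in the original jbyte array. The sketch below assumes `jbyte` is a signed char (a stand-in alias is used so it compiles without `<jni.h>`) and a two's complement system; `invertPixels` is a hypothetical helper.

```cpp
#include <cstddef>

// Stand-in for the typedef from <jni.h>; assumed to be signed char.
using jbyte = signed char;

// Hypothetical helper: inverts every pixel in place through an
// unsigned view of the same memory.
inline void invertPixels(jbyte* buffer, std::size_t count) {
    unsigned char* pixels = reinterpret_cast<unsigned char*>(buffer);
    for (std::size_t i = 0; i < count; ++i) {
        // 255 - pixel is computed in int, then narrowed back to 8 bits.
        pixels[i] = static_cast<unsigned char>(255 - pixels[i]);
    }
}
```

Accessing an object through an unsigned char pointer is permitted by the aliasing rules, so no copy is needed.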