Question
I've come across some code that uses the bit masks 0xff and 0xff00, or in 16-bit binary form 00000000 11111111 and 11111111 00000000.
/**
 * Function to check if the given string is in GZIP Format.
 *
 * @param inString String to check.
 * @return True if GZIP Compressed otherwise false.
 */
public static boolean isStringCompressed(String inString)
{
    try
    {
        byte[] bytes = inString.getBytes("ISO-8859-1");
        int gzipHeader = ((int) bytes[0] & 0xff)
                | ((bytes[1] << 8) & 0xff00);
        return GZIPInputStream.GZIP_MAGIC == gzipHeader;
    } catch (Exception e)
    {
        return false;
    }
}
I'm trying to work out the purpose of these bit masks in this context (against a byte array). I can't see what difference they would make.
In the context of a GZip-compressed string, which this method seems to be written for, the GZip magic number is 35615, 8B1F in hex and 10001011 00011111 in binary.
Am I correct in thinking this swaps the bytes? For example, say my input string were \u001f\u008b:
bytes[0] & 0xff
  bytes[0] = 1f = 00011111
           & ff = 11111111
                  --------
                = 00011111
bytes[1] << 8
  bytes[1] = 8b = 10001011
           << 8 = 10001011 00000000
((bytes[1] << 8) & 0xff00)
  = 10001011 00000000 & 0xff00
  = 10001011 00000000
    11111111 00000000 &
    -------------------
    10001011 00000000
So
  00000000 00011111
  10001011 00000000 |
  -----------------
  10001011 00011111 = 8B1F
To me it doesn't seem like the & is doing anything to the original byte in either case, bytes[0] & 0xff or (bytes[1] << 8) & 0xff00. What am I missing?
Answer 1:
int gzipHeader = ((int) bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
The type byte in Java is signed. If you cast a byte to an int, its sign will be extended. The & 0xff is there to mask out the 1 bits that you get from sign extension, effectively treating the byte as if it were unsigned.
Likewise for 0xff00, except that the byte is first shifted 8 bits to the left.
So, what this does is:
- take the first byte, bytes[0], cast it to int, and mask out the sign-extended bits (treating the byte as if it is unsigned)
- take the second byte, cast it to int, shift it left by 8 bits, and mask out the sign-extended bits
- combine the values with |
Note that the left shift puts the second byte into the high-order position of the result, which effectively swaps the bytes.
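The example in the question only looks like the masks do nothing because the arithmetic was done on 8-bit and 16-bit values. In Java the operands are promoted to 32-bit int first, so the byte 0x8b (which is negative) drags a run of 1 bits along with it. A minimal sketch of the difference follows; the class and method names are made up for illustration:

```java
class SignExtensionDemo {
    // Combine two bytes, low byte first, masking off the sign-extension bits.
    static int withMasks(byte lo, byte hi) {
        return (lo & 0xff) | ((hi << 8) & 0xff00);
    }

    // The same combination without the masks, to show what sign extension does.
    static int withoutMasks(byte lo, byte hi) {
        return lo | (hi << 8);
    }

    public static void main(String[] args) {
        byte lo = (byte) 0x1f;
        byte hi = (byte) 0x8b; // -117 as a signed byte

        // hi is promoted to int 0xffffff8b before the shift happens.
        System.out.println(Integer.toHexString(hi << 8));              // ffff8b00
        System.out.println(Integer.toHexString(withMasks(lo, hi)));    // 8b1f
        System.out.println(Integer.toHexString(withoutMasks(lo, hi))); // ffff8b1f
    }
}
```

Without the masks the result is 0xffff8b1f, which is a negative int and would never compare equal to GZIP_MAGIC.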
Answer 2:
This is a trick to overcome big-endian/little-endian issues. It forces the first two bytes to be interpreted as little-endian, i.e. [0] contains the low byte and [1] contains the high byte.
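For comparison, the same little-endian read can be expressed with java.nio.ByteBuffer, which makes the byte order explicit. This is only a sketch of an alternative, not what the code in the question does; the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianDemo {
    // The shift-and-mask version from the question.
    static int readManually(byte[] bytes) {
        return (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
    }

    // Alternative: let ByteBuffer decode a little-endian 16-bit value.
    static int readWithByteBuffer(byte[] bytes) {
        short value = ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort();
        // getShort() is signed, so mask to get the unsigned 16-bit value.
        return value & 0xffff;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0x1f, (byte) 0x8b};
        System.out.println(Integer.toHexString(readManually(header)));       // 8b1f
        System.out.println(Integer.toHexString(readWithByteBuffer(header))); // 8b1f
    }
}
```

Note that a mask is still needed in the ByteBuffer version, because short is signed as well.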
Answer 3:
Apparently the purpose is to read the first 16-bit word of bytes and store it in gzipHeader by suitable masking and shifting. More precisely, the first part extracts exactly the first byte, while the second part extracts the second byte, already shifted by 8 bits. The | combines both masked values into an int.
The resulting value is compared against the defined value GZIPInputStream.GZIP_MAGIC to determine if the first two bytes are the defined beginning of data compressed with gzip.
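To see the comparison succeed on real data, one can compress a few bytes with GZIPOutputStream and run the same header check on the result. The sketch below works on a byte[] directly rather than on a String as in the question, and the helper names are assumptions made for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipMagicDemo {
    // Same header check as in the question, applied to raw bytes.
    static boolean startsWithGzipMagic(byte[] bytes) {
        if (bytes == null || bytes.length < 2) {
            return false;
        }
        int header = (bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
        return header == GZIPInputStream.GZIP_MAGIC;
    }

    // Compress data in memory so there is something real to test against.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "hello".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(startsWithGzipMagic(gzip(plain))); // true
        System.out.println(startsWithGzipMagic(plain));       // false
    }
}
```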
Answer 4:
byte is a signed type. If you convert the byte 0xff to an int, you get -1. If you actually want to get 255, mask with 0xff after the conversion.
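A short sketch of that single-byte case, with hypothetical method names. Byte.toUnsignedInt, available since Java 8, does the same thing as the mask:

```java
class UnsignedByteDemo {
    // Plain widening conversion: the sign bit is extended.
    static int signExtended(byte b) {
        return b;
    }

    // Mask after the conversion to keep only the low 8 bits.
    static int masked(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xff;
        System.out.println(signExtended(b));        // -1
        System.out.println(masked(b));              // 255
        System.out.println(Byte.toUnsignedInt(b));  // 255
    }
}
```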
Source: https://stackoverflow.com/questions/30327937/bit-shifting-and-bit-mask-sample-code