问题
I am trying to perform a rotate left (ROTL) operation on an arbitrary amount of rotations on a 64-bit word currently represented as a byte array of 8 bytes in a JavaCard smart card.
The ugly way around is to hard-code all 64 possible permutations of ROTL on a 64-bit word represented as an 8 byte array but that will simply bloat the entire codebase up.
How do I make it leaner so that I can on-the-fly do any arbitrary amount of ROTL operations on demand on a 64-bit word (in byte array) with the use of byte
and short
types only (due to JavaCard not recognizing more complex things like int
or long
and so on) without need for hard-coding all ROTL64 permutations.
回答1:
The following method performs rotate right for any kind of array in a buffer, without requiring an additional output or temporary array:
/**
* Rotates the indicated bytes in the given buffer to the right by a certain
* number of bits.
*
* @param buf
* the buffer in which the bits need to be rotated
* @param off
* the offset in the buffer where the rotation needs to start
* @param len
* the amount of bytes
* @param rot
* the amount to rotate (any 31 bit value allowed)
*/
public static void rotr(byte[] buf, short off, short len, short rot) {
if (len == 0) {
// nothing to rotate (avoid division by 0)
return;
}
final short lenBits = (short) (len * BYTE_SIZE);
// doesn't always work for edge cases close to MIN_SHORT / MAX_SHORT
rot = (short) ((rot + lenBits) % lenBits);
// reused variables for byte and bit shift
short shift, i;
byte t1, t2;
// --- byte shift
shift = (short) (rot / BYTE_SIZE);
// only shift when actually required
if (shift != 0) {
// values will never be used, src == start at the beginning
short start = -1, src = -1, dest;
// compiler is too stupid to see t1 will be assigned anyway
t1 = 0;
// go over all the bytes, but in stepwise fashion
for (i = 0; i < len; i++) {
// if we get back to the start
// ... then we need to continue one position to the right
if (src == start) {
start++;
t1 = buf[(short) (off + (++src))];
}
// calculate next location by stepping by the shift amount
// ... modulus the length of course
dest = (short) ((src + shift) % len);
// save value, this will be the next one to be stored
t2 = buf[(short) (off + dest)];
// store value, doing the actual shift
buf[(short) (off + dest)] = t1;
// do the step
src = dest;
// we're going to store t1, not t2
t1 = t2;
}
}
// --- bit shift
shift = (short) (rot % BYTE_SIZE);
// only shift when actually required
if (shift != 0) {
// t1 holds previous byte, at other side
t1 = buf[(short) (off + len - 1)];
for (i = 0; i < len; i++) {
t2 = buf[(short) (off + i)];
// take bits from previous byte and this byte together
buf[(short) (off + i)] = (byte) ((t1 << (BYTE_SIZE - shift)) | ((t2 & BYTE_MASK) >> shift));
// current byte is now previous byte
t1 = t2;
}
}
}
private static final short BYTE_MASK = 0xFF;
private static final short BYTE_SIZE = 8;
A drawback is that it requires two passes over all the data: one of the byte shift and one for the bit shift. It skips them when they are not required (if you know that the skipping is never performed then you can easily remove those checks).
Here is the left-rotate. The left rotate is somewhat harder to implement itself - requiring a few more calculations, so the cost of the additional method call may be offset against that. If you rotate by using a literal then you can just use the rotr
function of course, or calculate the rotation amount yourself.
public static void rotl(byte[] buf, short off, short len, short bits) {
final short lenBits = (short) (len * BYTE_SIZE);
bits = (short) ((bits + lenBits) % lenBits);
// we don't care if we pass 0 or lenBits, rotr will adjust
rotr(buf, off, len, (short) (lenBits - bits));
}
Note: A previous version required rotates over 64 bit, was more time constant and didn't have the offset included. It also required a 64 bit specific array with loop constants (which is now replaced by the much more generic inner if
statement in the for
loop of the byte shift). See the edits for other versions.
Rotating gets much easier when an output buffer is available: this implementation can consist of just the initialization part and the last 4 lines of code. Crypto code often only shifts by a constant size and by odd numbers, so just the last 4 lines can then be used, skipping the optimizations in case no (bit) shift needs to be performed.
I've deliberately used a slightly different interface for that which assumes 64 bits for the rotate, just to show a slightly different implementation.
public static void rotr64(byte[] inBuf, short inOff, byte[] outBuf, short outOff, short rot) {
short byteRot = (short) ((rot & 0b00111000) >> 3);
short bitRot = (short) (rot & 0b00000111);
if (bitRot == 0) {
if (byteRot == 0) {
// --- no rotation
return;
}
// --- only byte rotation
for (short i = 0; i < LONG_BYTES; i++) {
outBuf[(short) (outOff + (i + byteRot) % LONG_BYTES)] = inBuf[(short) (inOff + i)];
}
} else {
// --- bit- and possibly byte rotation
// note: also works for all other situations, but slower
// put the last byte in t_lo
short t = (short) (inBuf[inOff + LONG_BYTES - 1] & BYTE_MASK);
for (short i = 0; i < LONG_BYTES; i++) {
// shift t_lo into t_hi and add the next byte into t_lo
t = (short) (t << BYTE_SIZE | (inBuf[(short) (inOff + i)] & BYTE_MASK));
// find the byte to receive the shifted value within the short
outBuf[(short) (outOff + (i + byteRot) % LONG_BYTES)] = (byte) (t >> bitRot);
}
}
}
private static final int LONG_BYTES = 8;
private static final short BYTE_MASK = 0xFF;
private static final short BYTE_SIZE = 8;
and it can be further simplified if the offset is always set to zero.
Here is the left rotate, in case you need a generic function for that:
public static void rotl64(byte[] inBuf, short inOff, byte[] outBuf, short outOff, short rot) {
rotr64(inBuf, inOff, outBuf, outOff, (short) (64 - rot & 0b00111111));
}
Everything is tested against random input (a million runs or so, which takes less than a second on Java SE) although I do not offer warranty on the testing; please test yourself.
回答2:
A very simple implementation which receives the four shorts in separate parameters
public static void rotateRight64(short x3, short x2, short x1, short x0,
short rotAmount, short[] out)
{
assert out.length() == 4;
rotAmount &= (1 << 6) - 1; // limit the range to 0..63
if (rotAmount >= 16)
rotateRight64(x0, x3, x2, x1, rotAmount - 16, out);
else
{
out[0] = (short)((x0 >>> rotAmount) | (x1 << (16-rotAmount)));
out[1] = (short)((x1 >>> rotAmount) | (x2 << (16-rotAmount)));
out[2] = (short)((x2 >>> rotAmount) | (x3 << (16-rotAmount)));
out[3] = (short)((x3 >>> rotAmount) | (x0 << (16-rotAmount)));
}
}
It's a rotate right but it's easy to do a rotate left by rotating right 64 - rotAmount
Alternatively it can be done like this without the coarse shifting step
public static void rotateRight(short[] out, short[] in, short rotAmount) // in ror rotAmount
{
assert out.length() == 4 && in.length() == 4 && rotAmount >= 0 && rotAmount < 64;
const short shift = (short)(rotAmount % 16);
const short limbshift = (short)(rotAmount / 16);
short tmp = in[0];
for (short i = 0; i < 4; ++i)
{
short index = (short)((i + limbshift) % 4);
out[i] = (short)((in[index] >>> shift) | (in[index + 1] << (16 - shift)));
}
}
This way it can be easily changed to an arbitrary-precision shift/rotate
If the input array is byte
then you can change short[4]
to byte[8]
and change all the constants from 16 → 8 and from 4 → 8. In fact they can be generalized without problem, I'm just hard coding to keep it simple and easier to understand
来源:https://stackoverflow.com/questions/52552322/rotate-left-on-a-64-bit-word-byte-array-in-javacard