Say I have a value of type uint64_t,
viewed as a sequence of octets (1 octet = 8 bits). The uint64_t
value is known to contain only one set bit, at the most significant bit position of one of its octets.
Multiply the value by a carefully designed 64-bit constant, then keep only the upper 4 bits of the product. For any CPU with fast 64-bit multiplication, this is probably close to optimal.
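#include <stdint.h>

/* Maps a single set bit at position 7, 15, ..., 63 to 1, 2, ..., 8; zero maps to 0. */
int field_set(uint64_t input) {
    uint64_t field = input * 0x20406080a0c0e1ULL;
    return (field >> 60) & 15;
}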
// field_set(0x0000000000000000ULL) = 0
// field_set(0x0000000000000080ULL) = 1
// field_set(0x0000000000008000ULL) = 2
// field_set(0x0000000000800000ULL) = 3
// field_set(0x0000000080000000ULL) = 4
// field_set(0x0000008000000000ULL) = 5
// field_set(0x0000800000000000ULL) = 6
// field_set(0x0080000000000000ULL) = 7
// field_set(0x8000000000000000ULL) = 8
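The table above can be checked mechanically. The following is a minimal sketch; the helper name field_set_self_check is made up for illustration, and field_set is repeated only so the block compiles on its own.

```c
#include <stdint.h>

/* Same function as in the answer, repeated so this block is self-contained. */
int field_set(uint64_t input) {
    uint64_t field = input * 0x20406080a0c0e1ULL;
    return (int)((field >> 60) & 15);
}

/* Hypothetical helper: returns 1 if zero and all eight valid single-bit
   inputs map to the values listed in the table, 0 otherwise. */
int field_set_self_check(void) {
    if (field_set(0) != 0)
        return 0;
    for (int k = 1; k <= 8; k++) {
        uint64_t input = 1ULL << (8 * k - 1);  /* top bit of the k-th octet */
        if (field_set(input) != k)
            return 0;
    }
    return 1;
}
```

Calling field_set_self_check() once at startup (or in a unit test) would be one way to catch a mistyped constant.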
clang implements this in three x86_64 instructions, not counting the frame setup and cleanup:
_field_set:
push %rbp
mov %rsp,%rbp
movabs $0x20406080a0c0e1,%rax
imul %rdi,%rax
shr $0x3c,%rax
pop %rbp
retq
Note that the results for any other input will be pretty much random. (So don't do that.)
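If the input cannot be fully trusted, a cheap guard can reject anything outside the nine supported values before the multiply. This is only a sketch: the function name is_valid_field_input is hypothetical, and whether zero should count as valid is an assumption based on the table above.

```c
#include <stdint.h>

/* Hypothetical guard: returns 1 if input is zero or has exactly one set bit
   located at the top bit of an octet (position 7, 15, ..., 63). */
int is_valid_field_input(uint64_t input) {
    const uint64_t octet_msbs = 0x8080808080808080ULL;
    int at_most_one_bit = (input & (input - 1)) == 0;  /* true for 0 and powers of two */
    int only_octet_msbs = (input & ~octet_msbs) == 0;  /* no bit outside the allowed positions */
    return at_most_one_bit && only_octet_msbs;
}
```

The check costs a few extra instructions, so it probably belongs in a debug assertion rather than in the hot path.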
I don't think there's any feasible way to extend this method to return values in the 7..63 range directly (the structure of the constant doesn't permit it, since every output must fit into 4 bits), but you can convert a nonzero result r to that range by computing 8 * r - 1.
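As a sketch of that conversion, assuming the wanted positions are 7, 15, ..., 63 (the ones used in the constant's construction) and that zero input should map to 0, the index k returned by field_set corresponds to position 8 * k - 1. The wrapper name field_pos is hypothetical.

```c
#include <stdint.h>

/* Same function as in the answer, repeated so this block is self-contained. */
int field_set(uint64_t input) {
    uint64_t field = input * 0x20406080a0c0e1ULL;
    return (int)((field >> 60) & 15);
}

/* Hypothetical wrapper: returns the bit position (7, 15, ..., 63) for a
   valid nonzero input, and 0 when no bit is set (an assumed convention). */
int field_pos(uint64_t input) {
    int k = field_set(input);
    return k ? 8 * k - 1 : 0;
}
```

A compiler can typically turn 8 * k - 1 into a shift and a subtract (or a single lea on x86_64), so the extra cost over field_set should be small.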
As for how this constant was designed, I started with the following observations:
Multiplying 1ULL<<63 (i.e., your "pos=63" value) by any constant can only possibly result in the same value, or zero. (The product cannot possibly have any lower bits set, and there are no higher bits to change.) Therefore, we must find some way for this value to be treated as the correct result.
Multiplying our constant by each of the other single-bit inputs is equivalent to left-shifting it by a number of bits equal to the input's "position". The right shift by 60 bits then leaves only 4 bits of the constant in the result: bits 60 - pos through 63 - pos. Thus, we can create all of the cases except for one as follows:
uint64_t constant = (
1ULL << (60 - 7)
| 2ULL << (60 - 15)
| 3ULL << (60 - 23)
| 4ULL << (60 - 31)
| 5ULL << (60 - 39)
| 6ULL << (60 - 47)
| 7ULL << (60 - 55)
);
So far, the constant is 0x20406080a0c0e0ULL. However, this doesn't give the right result for pos=63; this constant is even, so multiplying it by that input gives zero. We must set the lowest bit (i.e., constant |= 1ULL) to get that case to work, giving us the final value of 0x20406080a0c0e1ULL.
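The construction can be replayed in code to confirm the final value. This is a minimal sketch: build_constant and top_nibble are hypothetical helper names, and the loop is just a compact restatement of the seven shifted terms plus the low bit.

```c
#include <stdint.h>

/* Hypothetical helper: rebuilds the constant term by term. */
uint64_t build_constant(void) {
    uint64_t c = 0;
    for (uint64_t k = 1; k <= 7; k++) {
        unsigned pos = 8u * (unsigned)k - 1u;  /* 7, 15, ..., 55 */
        c |= k << (60u - pos);                 /* place k where the shift by pos will expose it */
    }
    return c | 1ULL;                           /* make it odd so that pos=63 yields 8 */
}

/* Top 4 bits of c * (1 << pos). For a single-bit multiplier this equals
   (c << pos) >> 60, with the multiplication wrapping modulo 2^64. */
unsigned top_nibble(uint64_t c, unsigned pos) {
    return (unsigned)((c * (1ULL << pos)) >> 60);
}
```

With these helpers, top_nibble(c, 63) depends only on the parity of c: it is 8 for an odd constant and 0 for an even one, which is why the low bit has to be set.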
Note that the construction above can be modified to encode the results differently. However, the output of 8 for pos=63 is fixed as described above, and all other outputs must fit into 4 bits (i.e., 0 to 15).
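To illustrate a different encoding, here is a sketch that builds a constant from a table of seven 4-bit codes, one for each of the positions 7, 15, ..., 55. All names (build_custom_constant, lookup, reversed_example) are hypothetical, and the reversed numbering is just an arbitrary example.

```c
#include <stdint.h>

/* Hypothetical builder: codes[i] is the desired output (0..15) for the bit
   at position 8*i + 7, i = 0..6. The output for pos=63 is always 8. */
uint64_t build_custom_constant(const unsigned codes[7]) {
    uint64_t c = 1ULL;  /* low bit set so that pos=63 yields 8 */
    for (int i = 0; i < 7; i++) {
        unsigned pos = 8u * (unsigned)i + 7u;
        c |= (uint64_t)(codes[i] & 15u) << (60u - pos);
    }
    return c;
}

/* Same multiply-and-shift as field_set, with the constant passed in. */
int lookup(uint64_t constant, uint64_t input) {
    return (int)((input * constant) >> 60);
}

/* Example encoding: number the octets in reverse (pos=7 -> 7, ..., pos=55 -> 1). */
uint64_t reversed_example(void) {
    const unsigned codes[7] = {7, 6, 5, 4, 3, 2, 1};
    return build_custom_constant(codes);
}
```

The 4-bit fields sit 8 bits apart in the constant, so they cannot overlap each other or the low bit; that spacing is what makes any table of codes in the 0..15 range workable.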