There are a lot of questions about this on Stack Overflow. A lot. However, I cannot find an answer that:
.NET Core 3.0 added BitOperations.LeadingZeroCount and BitOperations.TrailingZeroCount, so you can use them directly. They map to the x86 LZCNT/BSR and TZCNT/BSF instructions and are therefore extremely efficient.
int mostSignificantPosition = 63 - BitOperations.LeadingZeroCount(0x1234UL);
int leastSignificantPosition = BitOperations.TrailingZeroCount(0x1234UL);
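For completeness, a minimal self-contained sketch (my own example; BitOperations lives in the System.Numerics namespace and needs .NET Core 3.0 or later):

using System;
using System.Numerics;

class Demo
{
    static void Main()
    {
        ulong value = 0x1234UL;   // binary: 1 0010 0011 0100

        // Highest set bit: 63 minus the number of leading zeros.
        int msb = 63 - BitOperations.LeadingZeroCount(value);   // 12

        // Lowest set bit: the number of trailing zeros.
        int lsb = BitOperations.TrailingZeroCount(value);        // 2

        Console.WriteLine($"msb = {msb}, lsb = {lsb}");
    }
}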
As per my comment, here is a C# function to count leading zero bits, modified for a 64-bit integer.
public static UInt64 CountLeadingZeros(UInt64 input)
{
if (input == 0) return 64;
UInt64 n = 1;
if ((input >> 32) == 0) { n = n + 32; input = input << 32; }
if ((input >> 48) == 0) { n = n + 16; input = input << 16; }
if ((input >> 56) == 0) { n = n + 8; input = input << 8; }
if ((input >> 60) == 0) { n = n + 4; input = input << 4; }
if ((input >> 62) == 0) { n = n + 2; input = input << 2; }
n = n - (input >> 63);
return n;
}
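A quick sanity check (my own example, assuming the method above is in scope):

ulong x = 0x1234UL;                                  // highest set bit is bit 12
Console.WriteLine(CountLeadingZeros(x));             // 51
Console.WriteLine(63 - (int)CountLeadingZeros(x));   // 12, the MSB position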
UPDATE:
If you're using a newer version of C#, check to see if this is built-in per the answer below.
https://stackoverflow.com/a/61141435/1587755
The fastest way to get the most significant bit in IL code should be a float conversion, reading the exponent bits.
Safe code:
int myint = 7;
int msb = (BitConverter.SingleToInt32Bits(myint) >> 23) - 0x7f;
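The float path is only exact while the value fits in a float's 24-bit mantissa; for a 64-bit input the same idea can be sketched with a double instead (my own variant, not part of the original answer):

ulong value = 0x1234UL;
// Read the biased exponent of the double representation. Exact below 2^53;
// above that the conversion may round up across a power of two and report
// a position one too high. Handle 0 separately.
int msb64 = (int)((BitConverter.DoubleToInt64Bits(value) >> 52) - 0x3FF);  // 12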
An even faster way would be the msb and lsb CPU instructions. As mentioned by phuclv, they became available in .NET Core 3.0, so I added a test, which is unfortunately not much faster.
As requested, here are the BenchmarkDotNet results for 10,000 conversions of uint and ulong. The speedup was a factor of 2, so the BitScanner solution is fast, but it can't beat the native float conversion.
|            Method |     Mean |    Error |   StdDev | Ratio |
|------------------ |---------:|---------:|---------:|------:|
| BitScannerForward | 34.37 us | 0.420 us | 0.372 us |  1.00 |
| BitConverterULong | 18.59 us | 0.238 us | 0.223 us |  0.54 |
|  BitConverterUInt | 18.58 us | 0.129 us | 0.121 us |  0.54 |
|      NtdllMsbCall | 31.34 us | 0.204 us | 0.170 us |  0.91 |
|  LeadingZeroCount | 15.85 us | 0.169 us | 0.150 us |  0.48 |
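For context, the benchmark could be set up along these lines (a rough sketch; the method bodies and the 10,000-iteration loop are my assumptions, not the original benchmark code):

using System;
using System.Numerics;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class MsbBenchmark
{
    private const uint N = 10_000;

    [Benchmark]
    public int BitConverterUInt()
    {
        int sum = 0;
        for (uint i = 1; i <= N; i++)
            sum += (BitConverter.SingleToInt32Bits(i) >> 23) - 0x7F;  // float-exponent trick
        return sum;
    }

    [Benchmark]
    public int LeadingZeroCount()
    {
        int sum = 0;
        for (uint i = 1; i <= N; i++)
            sum += 31 - BitOperations.LeadingZeroCount(i);            // intrinsic
        return sum;
    }
}

// Run with: BenchmarkRunner.Run<MsbBenchmark>();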
Since we're talking about .NET here, it's usually preferable not to resort to external native calls. But if you can tolerate the overhead of a managed/unmanaged roundtrip for each operation, the following two calls provide pretty direct and unadulterated access to the native CPU instructions.
The (minimalistic) disassembly of each entire function from ntdll.dll is also shown. That library is present on any Windows machine and will always be found when referenced as shown.
Least-significant bit (LSB):
[DllImport("ntdll"), SuppressUnmanagedCodeSecurity]
public static extern int RtlFindLeastSignificantBit(ulong ul);
// X64:
// bsf rdx, rcx
// mov eax, 0FFFFFFFFh
// movzx ecx, dl
// cmovne eax,ecx
// ret
Most-significant bit (MSB):
[DllImport("ntdll"), SuppressUnmanagedCodeSecurity]
public static extern int RtlFindMostSignificantBit(ulong ul);
// X64:
// bsr rdx, rcx
// mov eax, 0FFFFFFFFh
// movzx ecx, dl
// cmovne eax,ecx
// ret
Usage:
Here's a usage example which requires that the above declarations be accessible. Couldn't be simpler.
int ix;
ix = RtlFindLeastSignificantBit(0x00103F0A042C1D80UL); // ix --> 7
ix = RtlFindMostSignificantBit(0x00103F0A042C1D80UL); // ix --> 52
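For reference, the declarations could be packaged like this (the wrapper class name is my own; the extern signatures are exactly the ones above):

using System.Runtime.InteropServices;
using System.Security;

internal static class NativeBitScan
{
    [DllImport("ntdll"), SuppressUnmanagedCodeSecurity]
    public static extern int RtlFindLeastSignificantBit(ulong ul);

    [DllImport("ntdll"), SuppressUnmanagedCodeSecurity]
    public static extern int RtlFindMostSignificantBit(ulong ul);
}

// Per the disassembly above (eax preset to 0FFFFFFFFh), both calls return -1
// when no bit is set at all.
int lsb = NativeBitScan.RtlFindLeastSignificantBit(0x00103F0A042C1D80UL);  // 7
int msb = NativeBitScan.RtlFindMostSignificantBit(0x00103F0A042C1D80UL);   // 52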
One of the ways of doing this, described on the Bit Hacks page linked in the question, is to leverage a De Bruijn sequence. Unfortunately that page does not give a 64-bit version of the sequence. This useful page explains how De Bruijn sequences can be constructed, and this one gives an example of a sequence generator written in C++. If we adapt the given code, we can generate multiple sequences, one of which is used in the C# code below:
public static class BitScanner
{
private const ulong Magic = 0x37E84A99DAE458F;
private static readonly int[] MagicTable =
{
0, 1, 17, 2, 18, 50, 3, 57,
47, 19, 22, 51, 29, 4, 33, 58,
15, 48, 20, 27, 25, 23, 52, 41,
54, 30, 38, 5, 43, 34, 59, 8,
63, 16, 49, 56, 46, 21, 28, 32,
14, 26, 24, 40, 53, 37, 42, 7,
62, 55, 45, 31, 13, 39, 36, 6,
61, 44, 12, 35, 60, 11, 10, 9,
};
public static int BitScanForward(ulong b)
{
return MagicTable[((ulong) ((long) b & -(long) b)*Magic) >> 58];
}
public static int BitScanReverse(ulong b)
{
b |= b >> 1;
b |= b >> 2;
b |= b >> 4;
b |= b >> 8;
b |= b >> 16;
b |= b >> 32;
b = b & ~(b >> 1);
return MagicTable[b*Magic >> 58];
}
}
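Usage (for example):

Console.WriteLine(BitScanner.BitScanForward(0x1234UL));   // 2  (lowest set bit)
Console.WriteLine(BitScanner.BitScanReverse(0x1234UL));   // 12 (highest set bit)

Like the hardware bit-scan instructions, neither method gives a meaningful result for an input of 0.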
I also posted my C# port of the sequence generator to GitHub.
Another related article, not mentioned in the question, with decent coverage of De Bruijn sequences can be found here.
@Taekahn gave a great answer. I'll just improve upon it slightly:
[System.Runtime.CompilerServices.MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int CountLeadingZeros(this ulong input)
{
const int bits = 64;
// if (input == 0L) return bits; // Not needed. Use only if 0 is very common.
int n = 1;
if ((input >> (bits - 32)) == 0) { n += 32; input <<= 32; }
if ((input >> (bits - 16)) == 0) { n += 16; input <<= 16; }
if ((input >> (bits - 8)) == 0) { n += 8; input <<= 8; }
if ((input >> (bits - 4)) == 0) { n += 4; input <<= 4; }
if ((input >> (bits - 2)) == 0) { n += 2; input <<= 2; }
return n - (int)(input >> (bits - 1));
}
None of the "(bits - x)" expressions need to be computed at runtime; since bits is a const, the compiler folds them at compile time. Thus the increased readability comes at no cost.
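A quick check of the extension method (my own example, assuming the method above is declared in a static class):

ulong x = 0x1234UL;
Console.WriteLine(x.CountLeadingZeros());       // 51
Console.WriteLine(63 - x.CountLeadingZeros());  // 12, the MSB position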
Edit: As pointed out by @Peter Cordes, you should probably just use System.Numerics.BitOperations.LeadingZeroCount if you have the BitOperations class available. I, for one, often do not.