Quickest way to convert a base 10 number to any base in .NET?

予麋鹿 2020-11-22 04:07

I have an old(ish) C# method I wrote that takes a number and converts it to any base:

string ConvertToBase(int number, char[] baseChars);

12 answers
  • 2020-11-22 04:30

    I was using this to store a Guid as a shorter string (but was limited to 106 characters). If anyone is interested, here is my code for decoding the string back to a numeric value. In this case I used two ulongs for the Guid value rather than coding an Int128, since I'm on 3.5, not 4.0. For clarity, CODE is a string constant with 106 unique chars. ConvertLongsToBytes is pretty unexciting.

    private static Guid B106ToGuid(string pStr)
    {
        try
        {
            // Each half of the string encodes one ulong, least significant digit first.
            ulong tMutl = 1, tL1 = 0, tL2 = 0, targetBase = (ulong)CODE.Length;
            for (int i = 0; i < pStr.Length / 2; i++)
            {
                tL1 += (ulong)CODE.IndexOf(pStr[i]) * tMutl;
                tL2 += (ulong)CODE.IndexOf(pStr[pStr.Length / 2 + i]) * tMutl;
                tMutl *= targetBase;
            }
            return new Guid(ConvertLongsToBytes(tL1, tL2));
        }
        catch (Exception ex)
        {
            throw new Exception("B106ToGuid failed to convert string to Guid", ex);
        }
    }
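
    For completeness, the encoding direction might look like the sketch below. This is not the author's code: the names UlongCodec, DigitsPerUlong, Encode and Decode are hypothetical, and the alphabet is passed in as a parameter because the original CODE constant is not shown. It assumes the same layout the decoder above reads, with the least significant digit first and a fixed number of digits per ulong.

```csharp
using System;

public static class UlongCodec
{
    // Number of digits needed to hold any ulong in the given alphabet
    // (10 for a 106-character alphabet).
    public static int DigitsPerUlong(string code)
    {
        if (code == null) throw new ArgumentNullException("code");
        if (code.Length < 2) throw new ArgumentException("Alphabet needs at least 2 characters.", "code");

        int digits = 0;
        ulong targetBase = (ulong)code.Length;
        for (ulong v = ulong.MaxValue; v > 0; v /= targetBase) digits++;
        return digits;
    }

    // Encodes value as a fixed-width string, least significant digit first.
    public static string Encode(ulong value, string code)
    {
        ulong targetBase = (ulong)code.Length;
        char[] buffer = new char[DigitsPerUlong(code)];
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = code[(int)(value % targetBase)];
            value /= targetBase;
        }
        return new string(buffer);
    }

    // Inverse of Encode; rejects characters that are not in the alphabet.
    public static ulong Decode(string text, string code)
    {
        ulong targetBase = (ulong)code.Length;
        ulong result = 0;
        // Walk from the most significant digit (the last character) down,
        // so no separate multiplier is needed.
        for (int i = text.Length - 1; i >= 0; i--)
        {
            int digit = code.IndexOf(text[i]);
            if (digit < 0) throw new FormatException("Invalid character '" + text[i] + "'.");
            result = result * targetBase + (ulong)digit;
        }
        return result;
    }
}
```

    A Guid could then be stored as the two encoded halves concatenated, which is the shape B106ToGuid expects. Unlike the decoder above, this sketch throws on characters outside the alphabet instead of silently casting IndexOf's -1 to a ulong.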
    
  • 2020-11-22 04:31

    Very late to the party on this one, but I wrote the following helper class recently for a project at work. It was designed to convert short strings into numbers and back again (a simplistic perfect hash function); however, it will also perform number conversion between arbitrary bases. The Base10ToString method implementation answers the question that was originally posted.

    The shouldSupportRoundTripping flag passed to the class constructor is needed to prevent the loss of leading digits from the number string during conversion to base-10 and back again (crucial, given my requirements!). Most of the time the loss of leading 0s from the number string probably won't be an issue.
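
    To see why leading digits disappear, here is a small standalone sketch. It is an illustration only, separate from the class in this answer: LeadingDigitDemo, Parse and Format are hypothetical names, and the alphabet is supplied by the caller.

```csharp
using System;
using System.Text;

public static class LeadingDigitDemo
{
    // Interprets text as a number whose digits are the characters of alphabet.
    public static long Parse(string text, string alphabet)
    {
        long result = 0;
        foreach (char c in text)
        {
            int digit = alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Invalid digit '" + c + "'.");
            // checked: overflow raises OverflowException instead of wrapping.
            result = checked(result * alphabet.Length + digit);
        }
        return result;
    }

    // Writes a non-negative number using the characters of alphabet as digits.
    public static string Format(long number, string alphabet)
    {
        if (number < 0) throw new ArgumentOutOfRangeException("number");
        var sb = new StringBuilder();
        do
        {
            sb.Insert(0, alphabet[(int)(number % alphabet.Length)]);
            number /= alphabet.Length;
        }
        while (number > 0);
        return sb.ToString();
    }
}
```

    With the 26 letters A to Z as digits, Parse("ABC", letters) gives 28 and Format(28, letters) gives "BC": the leading "A" is digit zero, so it vanishes just as a leading 0 would. Prepending a sentinel character that never appears in the input (the class below uses '\0') shifts "A" to digit 1; Parse then gives 786 and Format(786, ...) returns "ABC".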

    Anyway, here's the code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    
    namespace StackOverflow
    {
        /// <summary>
        /// Contains methods used to convert numbers between base-10 and another numbering system.
        /// </summary>
        /// <remarks>
        /// <para>
        /// This conversion class makes use of a set of characters that represent the digits used by the target
        /// numbering system. For example, binary would use the digits 0 and 1, whereas hex would use the digits
        /// 0 through 9 plus A through F. The digits do not have to be numerals.
        /// </para>
        /// <para>
        /// The first digit in the sequence has special significance. If the number passed to the
        /// <see cref="StringToBase10"/> method has leading digits that match the first digit, then those leading
        /// digits will effectively be 'lost' during conversion. Much of the time this won't matter. For example,
        /// "0F" hex will be converted to 15 decimal, but when converted back to hex it will become simply "F",
        /// losing the leading "0". However, if the set of digits was A through Z, and the number "ABC" was
        /// converted to base-10 and back again, then the leading "A" would be lost. The <see cref="System.Boolean"/>
        /// flag passed to the constructor allows 'round-tripping' behaviour to be supported, which will prevent
        /// leading digits from being lost during conversion.
        /// </para>
        /// <para>
        /// Note that numeric overflow is probable when using longer strings and larger digit sets.
        /// </para>
        /// </remarks>
        public class Base10Converter
        {
            const char NullDigit = '\0';
    
            public Base10Converter(string digits, bool shouldSupportRoundTripping = false)
                : this(digits.ToCharArray(), shouldSupportRoundTripping)
            {
            }
    
            public Base10Converter(IEnumerable<char> digits, bool shouldSupportRoundTripping = false)
            {
                if (digits == null)
                {
                    throw new ArgumentNullException("digits");
                }
    
                if (digits.Count() == 0)
                {
                    throw new ArgumentException(
                        message: "The sequence is empty.",
                        paramName: "digits"
                        );
                }
    
                if (!digits.Distinct().SequenceEqual(digits))
                {
                    throw new ArgumentException(
                        message: "There are duplicate characters in the sequence.",
                        paramName: "digits"
                        );
                }
    
                if (shouldSupportRoundTripping)
                {
                    digits = (new[] { NullDigit }).Concat(digits);
                }
    
                _digitToIndexMap =
                    digits
                    .Select((digit, index) => new { digit, index })
                    .ToDictionary(keySelector: x => x.digit, elementSelector: x => x.index);
    
                _radix = _digitToIndexMap.Count;
    
                _indexToDigitMap =
                    _digitToIndexMap
                    .ToDictionary(keySelector: x => x.Value, elementSelector: x => x.Key);
            }
    
            readonly Dictionary<char, int> _digitToIndexMap;
            readonly Dictionary<int, char> _indexToDigitMap;
            readonly int _radix;
    
            public long StringToBase10(string number)
            {
                if (number == null)
                {
                    throw new ArgumentNullException("number");
                }

                // Accumulate with integer arithmetic (Horner's method). Math.Pow works in
                // double precision and loses accuracy for values above 2^53.
                long result = 0;
                for (int i = 0; i < number.Length; i++)
                {
                    int digitIndex;
                    if (!_digitToIndexMap.TryGetValue(number[i], out digitIndex))
                    {
                        throw new ArgumentException(
                            message: String.Format("Number contains an invalid digit '{0}' at position {1}.", number[i], i),
                            paramName: "number"
                            );
                    }

                    // checked turns silent overflow into an OverflowException.
                    result = checked(result * _radix + digitIndex);
                }

                return result;
            }
    
            public string Base10ToString(long number)
            {
                if (number < 0)
                {
                    throw new ArgumentOutOfRangeException(
                        message: "Value cannot be negative.",
                        paramName: "number"
                        );
                }
    
                string text = string.Empty;
    
                long remainder;
                do
                {
                    number = Math.DivRem(number, _radix, out remainder);
    
                    char digit;
                    if (!_indexToDigitMap.TryGetValue((int) remainder, out digit) || digit == NullDigit)
                    {
                        throw new ArgumentException(
                            message: "Value cannot be converted given the set of digits used by this converter.",
                            paramName: "number"
                            );
                    }
    
                    text = digit + text;
                }
                while (number > 0);
    
                return text;
            }
        }
    }
    

    This can also be subclassed to derive custom number converters:

    namespace StackOverflow
    {
        public sealed class BinaryNumberConverter : Base10Converter
        {
            public BinaryNumberConverter()
                : base(digits: "01", shouldSupportRoundTripping: false)
            {
            }
        }
    
        public sealed class HexNumberConverter : Base10Converter
        {
            public HexNumberConverter()
                : base(digits: "0123456789ABCDEF", shouldSupportRoundTripping: false)
            {
            }
        }
    }
    

    And the code would be used like this:

    using System.Diagnostics;
    
    namespace StackOverflow
    {
        class Program
        {
            static void Main(string[] args)
            {
                {
                    var converter = new Base10Converter(
                        digits: "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz",
                        shouldSupportRoundTripping: true
                        );
    
                    long number = converter.StringToBase10("Atoz");
                    string text = converter.Base10ToString(number);
                    Debug.Assert(text == "Atoz");
                }
    
                {
                    var converter = new HexNumberConverter();
    
                    string text = converter.Base10ToString(255);
                    long number = converter.StringToBase10(text);
                    Debug.Assert(number == 255);
                }
            }
        }
    }
    
  • 2020-11-22 04:32

    Convert.ToString can be used to convert a number to its equivalent string representation in a specified base.

    Example:

    string binary = Convert.ToString(5, 2); // convert 5 to its binary representation
    Console.WriteLine(binary);              // prints 101
    

    However, as pointed out in the comments, Convert.ToString supports only a limited (but typically sufficient) set of bases: 2, 8, 10, and 16.
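
    A short sketch of that limitation follows. ConvertToStringDemo and its two methods are hypothetical helper names; the only framework call involved is Convert.ToString(int, int).

```csharp
using System;

public static class ConvertToStringDemo
{
    // The four bases Convert.ToString accepts, applied to one value.
    public static string[] SupportedBases(int value)
    {
        return new[]
        {
            Convert.ToString(value, 2),
            Convert.ToString(value, 8),
            Convert.ToString(value, 10),
            Convert.ToString(value, 16)
        };
    }

    // Returns true if Convert.ToString accepts the base, false if it throws.
    public static bool IsSupported(int toBase)
    {
        try
        {
            Convert.ToString(0, toBase);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

    SupportedBases(255) yields "11111111", "377", "255" and "ff" (hex digits come back in lower case), while IsSupported(3) and IsSupported(36) are both false. For bases 2, 8 and 16, negative values are written in two's complement, so Convert.ToString(-1, 16) is "ffffffff".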

    Update (to meet the requirement to convert to any base):

    I'm not aware of any method in the BCL that is capable of converting numbers to any base, so you would have to write your own small utility function. A simple sample would look like this (note that it can surely be made faster by replacing the string concatenation):

    using System;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            // convert to binary
            string binary = IntToString(42, new char[] { '0', '1' });
    
            // convert to hexadecimal
            string hex = IntToString(42, 
                new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
                             'A', 'B', 'C', 'D', 'E', 'F'});
    
            // convert to hexavigesimal (base 26, A-Z)
            string hexavigesimal = IntToString(42, 
                Enumerable.Range('A', 26).Select(x => (char)x).ToArray());
    
            // convert to sexagesimal
            string xx = IntToString(42, 
                new char[] { '0','1','2','3','4','5','6','7','8','9',
                'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
                'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x'});
        }
    
        public static string IntToString(int value, char[] baseChars)
        {
            string result = string.Empty;
            int targetBase = baseChars.Length;
    
            do
            {
                result = baseChars[value % targetBase] + result;
                value = value / targetBase;
            } 
            while (value > 0);
    
            return result;
        }
    
        /// <summary>
        /// An optimized method using an array as buffer instead of 
        /// string concatenation. This is faster for return values having 
        /// a length > 1.
        /// </summary>
        public static string IntToStringFast(int value, char[] baseChars)
        {
            // 32 is the worst-case buffer size (base 2 and int.MaxValue)
            int i = 32;
            char[] buffer = new char[i];
            int targetBase = baseChars.Length;
    
            do
            {
                buffer[--i] = baseChars[value % targetBase];
                value = value / targetBase;
            }
            while (value > 0);
    
            char[] result = new char[32 - i];
            Array.Copy(buffer, i, result, 0, 32 - i);
    
            return new string(result);
        }
    }
    

    Update 2 (Performance Improvement)

    Using an array buffer instead of string concatenation to build the result string gives a performance improvement, especially for large numbers (see method IntToStringFast). In the best case (i.e. the longest possible input) this method is roughly three times faster. However, for 1-digit numbers (i.e. 1 digit in the target base), IntToString will be faster.
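
    If you want to check the difference on your own machine, a rough timing harness could look like the sketch below. ConversionBenchmark, Concat, Buffered and Time are hypothetical names; Concat and Buffered restate the two approaches from this answer so that the block is self-contained. Stopwatch timings vary with hardware, runtime and build configuration, so treat any numbers as indicative only.

```csharp
using System;
using System.Diagnostics;

public static class ConversionBenchmark
{
    // String concatenation: prepends one digit per iteration.
    public static string Concat(int value, char[] baseChars)
    {
        string result = string.Empty;
        int targetBase = baseChars.Length;
        do
        {
            result = baseChars[value % targetBase] + result;
            value = value / targetBase;
        }
        while (value > 0);
        return result;
    }

    // Array buffer: fills a 32-char buffer from the end, then copies once.
    public static string Buffered(int value, char[] baseChars)
    {
        int i = 32;
        char[] buffer = new char[i];
        int targetBase = baseChars.Length;
        do
        {
            buffer[--i] = baseChars[value % targetBase];
            value = value / targetBase;
        }
        while (value > 0);
        return new string(buffer, i, 32 - i);
    }

    // Elapsed milliseconds for the given number of conversions of one value.
    public static long Time(Func<int, char[], string> convert, int value,
                            char[] baseChars, int iterations)
    {
        convert(value, baseChars); // warm-up call so JIT compilation is not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            convert(value, baseChars);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

    For example, compare Time(ConversionBenchmark.Concat, int.MaxValue, new[] { '0', '1' }, 1000000) with the same call using ConversionBenchmark.Buffered, then repeat with a value of 1 to see the single-digit case.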

  • 2020-11-22 04:33

    FAST "FROM" AND "TO" METHODS

    I am late to the party, but I combined previous answers and improved on them. I think these two methods are faster than any others posted so far. I was able to convert 1,000,000 numbers from and to base 36 in under 400 ms on a single-core machine.

    The example below is for base 62. Change the BaseChars array to convert from and to any other base.

    private static readonly char[] BaseChars = 
             "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".ToCharArray();
    private static readonly Dictionary<char, int> CharValues = BaseChars
               .Select((c,i)=>new {Char=c, Index=i})
               .ToDictionary(c=>c.Char,c=>c.Index);
    
    public static string LongToBase(long value)
    {
       long targetBase = BaseChars.Length;
       // Determine exact number of characters to use.
       char[] buffer = new char[Math.Max( 
                  (int) Math.Ceiling(Math.Log(value + 1, targetBase)), 1)];
    
       var i = buffer.Length;
       do
       {
           buffer[--i] = BaseChars[value % targetBase];
           value = value / targetBase;
       }
       while (value > 0);
    
       return new string(buffer, i, buffer.Length - i);
    }
    
    public static long BaseToLong(string number)
    {
        // Horner's method: integer-only arithmetic, so no precision is lost
        // as it can be with Math.Pow, which computes in double precision.
        long targetBase = BaseChars.Length;
        long result = 0;
        foreach (char c in number)
        {
            result = result * targetBase + CharValues[c];
        }
        return result;
    }
    

    EDIT (2018-07-12)

    Fixed to address the corner case found by @AdrianBotor (see comments) when converting 46655 to base 36. It is caused by a small floating-point error: the true value of Math.Log(46656, 36) is exactly 3, but .NET returns 3 + 4.44e-16, so the ceiling rounds up to 4 and the output buffer gets one character too many.
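
    One way to sidestep the logarithm altogether is to count digits with integer division. The sketch below is an alternative, not the code used in this answer; DigitCounter and CountDigits are hypothetical names.

```csharp
using System;

public static class DigitCounter
{
    // Number of digits needed to write a non-negative value in the given radix.
    public static int CountDigits(long value, int radix)
    {
        if (value < 0) throw new ArgumentOutOfRangeException("value");
        if (radix < 2) throw new ArgumentOutOfRangeException("radix");

        int digits = 1;
        while (value >= radix)
        {
            value /= radix;
            digits++;
        }
        return digits;
    }
}
```

    CountDigits(46655, 36) is 3 and CountDigits(46656, 36) is 4, with no rounding involved. It also copes with long.MaxValue, where value + 1 in the Math.Log expression overflows. The price is a few extra divisions per call, so measure before swapping it in if speed is the goal.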

  • 2020-11-22 04:39

    If anyone is seeking a VB option, this was based on Pavel's answer:

    Public Shared Function ToBase(base10 As Long, Optional baseChars As String = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ") As String
    
        If baseChars.Length < 2 Then Throw New ArgumentException("baseChars must be at least 2 chars long")
    
        If base10 = 0 Then Return baseChars(0)
    
        Dim isNegative = base10 < 0
        Dim radix = baseChars.Length
        Dim index As Integer = 64 'because it's how long a string will be if the basechars are 2 long (binary)
        Dim chars(index) As Char '65 chars, 64 from above plus one for sign if it's negative
    
        base10 = Math.Abs(base10)
    
    
        While base10 > 0
            chars(index) = baseChars(base10 Mod radix)
            base10 \= radix
    
            index -= 1
        End While
    
        If isNegative Then
            chars(index) = "-"c
            index -= 1
        End If
    
        Return New String(chars, index + 1, UBound(chars) - index)
    
    End Function
    
  • 2020-11-22 04:43

    I too was looking for a fast way to convert a decimal number to another base in the range [2..36], so I developed the following code. It's simple to follow and uses a StringBuilder object as a proxy for a character buffer that we can index character by character. The code appears to be very fast compared to the alternatives, and a lot faster than initialising individual characters in a character array.

    For your own use you might prefer to:

    1. Return a blank string rather than throw an exception.
    2. Remove the radix check to make the method run even faster.
    3. Initialise the StringBuilder object with 32 '0's and remove the line result.Remove( 0, i );. This causes the string to be returned with leading zeros and further increases the speed.
    4. Make the StringBuilder object a static field within the class, so that no matter how many times the DecimalToBase method is called, the StringBuilder object is initialised only once. If you do this, change 3 above no longer works.

    I hope someone finds this useful :)

    AtomicParadox

    // Requires: using System; using System.Text;
    static string DecimalToBase(int number, int radix)
    {
        // Check that the radix is between 2 and 36 inclusive
        if ( radix < 2 || radix > 36 )
            throw new ArgumentException("DecimalToBase(int number, int radix) - Radix must be between 2 and 36.");

        // Negative input would produce negative remainders and garbage characters
        if ( number < 0 )
            throw new ArgumentOutOfRangeException("number", "Number must not be negative.");

        // Create a buffer large enough to hold the largest int value represented in binary digits
        StringBuilder result = new StringBuilder(new string(' ', 32));  // 32 spaces

        // The base conversion calculates the digits in reverse order so use
        // an index to point to the last unused space in our buffer
        int i = 32;

        // Convert the number to the new base
        do
        {
            int remainder = number % radix;
            number = number / radix;
            if(remainder <= 9)
                result[--i] = (char)(remainder + '0');  // Converts [0..9] to ASCII ['0'..'9']
            else
                result[--i] = (char)(remainder + '7');  // Converts [10..35] to ASCII ['A'..'Z'] ('7' + 10 == 'A')
        } while ( number > 0 );

        // Remove the unwanted padding from the front of our buffer and return the result
        // Note i is now the index of the first used character in our buffer
        result.Remove( 0, i );
        return result.ToString();
    }
    