Fastest way to separate the digits of an int into an array in .NET?

天涯浪人 2021-01-31 20:07

I want to separate the digits of an integer, say 12345, into an array of bytes {1,2,3,4,5}, and I want the most performance-efficient way to do that, because my program does that…

11 answers
  • 2021-01-31 20:42

    Millions of times isn't that much.

    // input: int num >= 0
    List<byte> digits = new List<byte>();
    do  // do/while so that num == 0 yields {0} rather than an empty list
    {
       byte digit = (byte) (num % 10);
       digits.Insert(0, digit);  // Insert at the front to preserve order
       num = num / 10;
    } while (num > 0);
    
    // if you really want it as an array
    byte[] bytedata = digits.ToArray();
    

    Note that this could be changed to cope with negative numbers if you change byte to sbyte and test for num != 0.
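
    The change described above might look roughly like this. It is only a sketch: the names SignedDigitSplitter and SignedDigits are made up for illustration, and it appends and reverses instead of inserting at the front. Each digit carries the sign of the input.

```csharp
using System.Collections.Generic;

public static class SignedDigitSplitter
{
    // Hypothetical helper: digits most-significant first, each one carrying
    // the sign of the input, e.g. -123 -> {-1, -2, -3}.
    public static sbyte[] SignedDigits(int num)
    {
        var digits = new List<sbyte>();
        do
        {
            // In C#, % takes the sign of the dividend, so -123 % 10 == -3.
            digits.Add((sbyte)(num % 10));
            num /= 10;
        } while (num != 0);   // != 0 rather than > 0, so negatives terminate

        digits.Reverse();     // digits were collected least-significant first
        return digits.ToArray();
    }
}
```

    Because the sign is never flipped, this also copes with int.MinValue, which has no positive counterpart.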

  • 2021-01-31 20:44

    1 + floor(Math.Log10(num)) gives the number of digits (for num > 0) without any searching/looping:

    public static byte[] Digits(int num)
    {
        int nDigits = 1 + Convert.ToInt32(Math.Floor(Math.Log10(num)));
        byte[] digits = new byte[nDigits];
        int index = nDigits - 1;
        while (num > 0) {
            byte digit = (byte) (num % 10);
            digits[index] = digit;
            num = num / 10;
            index = index - 1;
        }
        return digits;
    }
    

    Edit: Possibly prettier:

    public static byte[] Digits(int num)
    {
        int nDigits = 1 + Convert.ToInt32(Math.Floor(Math.Log10(num)));
        byte[] digits = new byte[nDigits];
    
        for(int i = nDigits - 1; i >= 0; i--)  // >= 0 so the leading digit is written too
        {
            digits[i] = (byte)(num % 10);
            num = num / 10;
        }
        return digits;
    } 
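
    One caveat with the Log10 approach: Math.Log10 returns negative infinity for 0 and NaN for negative input, so the digit count cannot be computed for those values. A guarded variant might look like this. It is a sketch only; the class name Log10Digits is hypothetical, and rejecting negative input is my assumption about the desired behaviour.

```csharp
using System;

public static class Log10Digits
{
    // Hypothetical guarded variant: non-negative input only.
    public static byte[] Digits(int num)
    {
        if (num < 0)
            throw new ArgumentOutOfRangeException(nameof(num));

        // Math.Log10(0) is negative infinity, so treat 0 as a single digit.
        int nDigits = num == 0 ? 1 : 1 + (int)Math.Floor(Math.Log10(num));

        byte[] digits = new byte[nDigits];
        for (int i = nDigits - 1; i >= 0; i--)
        {
            digits[i] = (byte)(num % 10);
            num /= 10;
        }
        return digits;
    }
}
```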
    
  • 2021-01-31 20:47

    If you can get by with leading zeros it is much easier.

        void Test()
        { 
            // Note: 10 is the maximum number of digits.
            int[] xs = new int[10];
            System.Random r = new System.Random();
            for (int i=0; i < 10000000; ++i)
                Convert(xs, r.Next(int.MaxValue));
        }
    
        // Notice, I don't allocate and return an array each time.
        public void Convert(int[] digits, int val)
        {
            for (int i = 0; i < 10; ++i)
            {
                digits[10 - i - 1] = val % 10;
                val /= 10;
            }
        }
    

    EDIT: Here is a faster version. On my computer it tested faster than two of Jon Skeet's algorithms; the exception was his memoized version:

    static void Convert(int[] digits, int val)
    {
      digits[9] = val % 10; val /= 10;
      digits[8] = val % 10; val /= 10;
      digits[7] = val % 10; val /= 10;
      digits[6] = val % 10; val /= 10;
      digits[5] = val % 10; val /= 10;
      digits[4] = val % 10; val /= 10;
      digits[3] = val % 10; val /= 10;
      digits[2] = val % 10; val /= 10;
      digits[1] = val % 10; val /= 10;
      digits[0] = val % 10; val /= 10;     
    } 
    
  • 2021-01-31 20:49

    How about:

    public static int[] ConvertToArrayOfDigits(int value)
    {
        int size = DetermineDigitCount(value);
        int[] digits = new int[size];
        for (int index = size - 1; index >= 0; index--)
        {
            digits[index] = value % 10;
            value = value / 10;
        }
        return digits;
    }
    
    private static int DetermineDigitCount(int x)
    {
        // This bit could be optimised with a binary search
        return x < 10 ? 1
             : x < 100 ? 2
             : x < 1000 ? 3
             : x < 10000 ? 4
             : x < 100000 ? 5
             : x < 1000000 ? 6
             : x < 10000000 ? 7
             : x < 100000000 ? 8
             : x < 1000000000 ? 9
             : 10;
    }
    

    Note that this won't cope with negative numbers... do you need it to?
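
    The binary search mentioned in the code comment could look roughly like this. It is a sketch only, assuming x >= 0; the names DigitCount and DetermineDigitCountBinary are hypothetical. It needs at most four comparisons instead of up to nine, but whether that is actually faster depends on the distribution of the inputs, so measure it.

```csharp
public static class DigitCount
{
    // Hypothetical binary-search variant of DetermineDigitCount; assumes x >= 0.
    public static int DetermineDigitCountBinary(int x)
    {
        if (x < 100000)
        {
            // 1 to 5 digits
            if (x < 100) return x < 10 ? 1 : 2;
            if (x < 1000) return 3;
            return x < 10000 ? 4 : 5;
        }

        // 6 to 10 digits
        if (x < 10000000) return x < 1000000 ? 6 : 7;
        if (x < 100000000) return 8;
        return x < 1000000000 ? 9 : 10;
    }
}
```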

    EDIT: Here's a version which memoizes the results for values under 10000, as suggested by Eric. If you can absolutely guarantee that you won't change the contents of the returned array, you could remove the Clone call. It also has the handy property of reducing the number of checks to determine the length of "large" numbers - and small numbers will only go through that code once anyway :)

    private static readonly int[][] memoizedResults = new int[10000][];
    
    public static int[] ConvertToArrayOfDigits(int value)
    {
        if (value < 10000)
        {
            int[] memoized = memoizedResults[value];
            if (memoized == null) {
                memoized = ConvertSmall(value);
                memoizedResults[value] = memoized;
            }
            return (int[]) memoized.Clone();
        }
        // We know that value >= 10000
        int size = value < 100000 ? 5
             : value < 1000000 ? 6
             : value < 10000000 ? 7
             : value < 100000000 ? 8
             : value < 1000000000 ? 9
             : 10;
    
        return ConvertWithSize(value, size);
    }
    
    private static int[] ConvertSmall(int value)
    {
        // We know that value < 10000
        int size = value < 10 ? 1
                 : value < 100 ? 2
                 : value < 1000 ? 3 : 4;
        return ConvertWithSize(value, size);
    }
    
    private static int[] ConvertWithSize(int value, int size)
    {
        int[] digits = new int[size];
        for (int index = size - 1; index >= 0; index--)
        {
            digits[index] = value % 10;
            value = value / 10;
        }
        return digits;
    }
    

    Note that this doesn't try to be thread-safe at the moment. You may need to add a memory barrier to make sure that the write to the memoized results isn't visible until the writes within the individual result are visible. I've given up trying to reason about these things unless I absolutely have to. I'm sure you can make it lock-free with effort, but you should really get someone very smart to do so if you really need to.
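
    One way the memory barrier mentioned above might be added is with Volatile.Read and Volatile.Write from System.Threading. This is only a sketch and I haven't proved it correct: the names SafeMemo, GetSmall and Build are made up for illustration, and it assumes 0 <= value < 10000. Two threads may still both compute the same entry, which wastes a little work but should be harmless because both results are identical.

```csharp
using System.Threading;

public static class SafeMemo
{
    private static readonly int[][] cache = new int[10000][];

    // Hypothetical helper; assumes 0 <= value < 10000.
    public static int[] GetSmall(int value)
    {
        // Acquire read: if we see a non-null array, we also see its contents.
        int[] cached = Volatile.Read(ref cache[value]);
        if (cached == null)
        {
            cached = Build(value);
            // Release write: the element writes made in Build are published
            // before the reference itself becomes visible to other threads.
            Volatile.Write(ref cache[value], cached);
        }
        // Clone so that callers cannot modify the shared cached array.
        return (int[])cached.Clone();
    }

    private static int[] Build(int value)
    {
        int size = value < 10 ? 1 : value < 100 ? 2 : value < 1000 ? 3 : 4;
        int[] digits = new int[size];
        for (int i = size - 1; i >= 0; i--)
        {
            digits[i] = value % 10;
            value /= 10;
        }
        return digits;
    }
}
```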

    EDIT: I've just realised that the "large" case can make use of the "small" case - split the large number into two small ones and use the memoised results. I'll give that a go after dinner and write a benchmark...

    EDIT: Okay, ready for a giant amount of code? I realised that for uniformly random numbers at least, you'll get "big" numbers much more often than small ones - so you need to optimise for that. Of course, that might not be the case for real data, but anyway... it means I now do my size tests in the opposite order, hoping for big numbers first.

    I've got a benchmark for the original code, the simple memoization, and then the extremely-unrolled code.

    Results (in ms):

    Simple: 3168
    SimpleMemo: 3061
    UnrolledMemo: 1204
    

    Code:

    using System;
    using System.Diagnostics;
    
    class DigitSplitting
    {
        static void Main()        
        {
            Test(Simple);
            Test(SimpleMemo);
            Test(UnrolledMemo);
        }
    
        const int Iterations = 10000000;
    
        static void Test(Func<int, int[]> candidate)
        {
            Random rng = new Random(0);
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++)
            {
                candidate(rng.Next());
            }
            sw.Stop();
            Console.WriteLine("{0}: {1}",
                candidate.Method.Name, (int) sw.ElapsedMilliseconds);            
        }
    
        #region Simple
        static int[] Simple(int value)
        {
            int size = DetermineDigitCount(value);
            int[] digits = new int[size];
            for (int index = size - 1; index >= 0; index--)
            {
                digits[index] = value % 10;
                value = value / 10;
            }
            return digits;
        }
    
        private static int DetermineDigitCount(int x)
        {
            // This bit could be optimised with a binary search
            return x < 10 ? 1
                 : x < 100 ? 2
                 : x < 1000 ? 3
                 : x < 10000 ? 4
                 : x < 100000 ? 5
                 : x < 1000000 ? 6
                 : x < 10000000 ? 7
                 : x < 100000000 ? 8
                 : x < 1000000000 ? 9
                 : 10;
        }
        #endregion Simple    
    
        #region SimpleMemo
        private static readonly int[][] memoizedResults = new int[10000][];
    
        public static int[] SimpleMemo(int value)
        {
            if (value < 10000)
            {
                int[] memoized = memoizedResults[value];
                if (memoized == null) {
                    memoized = ConvertSmall(value);
                    memoizedResults[value] = memoized;
                }
                return (int[]) memoized.Clone();
            }
            // We know that value >= 10000
            int size = value >= 1000000000 ? 10
                     : value >= 100000000 ? 9
                     : value >= 10000000 ? 8
                     : value >= 1000000 ? 7
                     : value >= 100000 ? 6
                     : 5;
    
            return ConvertWithSize(value, size);
        }
    
        private static int[] ConvertSmall(int value)
        {
            // We know that value < 10000
            return value >= 1000 ? new[] { value / 1000, (value / 100) % 10,
                                               (value / 10) % 10, value % 10 }
                  : value >= 100 ? new[] { value / 100, (value / 10) % 10, 
                                             value % 10 }
                  : value >= 10 ? new[] { value / 10, value % 10 }
                  : new int[] { value };
        }
    
        private static int[] ConvertWithSize(int value, int size)
        {
            int[] digits = new int[size];
            for (int index = size - 1; index >= 0; index--)
            {
                digits[index] = value % 10;
                value = value / 10;
            }
            return digits;
        }
        #endregion
    
        #region UnrolledMemo
        private static readonly int[][] memoizedResults2 = new int[10000][];
        private static readonly int[][] memoizedResults3 = new int[10000][];
        static int[] UnrolledMemo(int value)
        {
            if (value < 10000)
            {
                return (int[]) UnclonedConvertSmall(value).Clone();
            }
            if (value >= 1000000000)
            {
                int[] ret = new int[10];
                int firstChunk = value / 100000000;
                ret[0] = firstChunk / 10;
                ret[1] = firstChunk % 10;
                value -= firstChunk * 100000000;
                int[] secondChunk = ConvertSize4(value / 10000);
                int[] thirdChunk = ConvertSize4(value % 10000);
                ret[2] = secondChunk[0];
                ret[3] = secondChunk[1];
                ret[4] = secondChunk[2];
                ret[5] = secondChunk[3];
                ret[6] = thirdChunk[0];
                ret[7] = thirdChunk[1];
                ret[8] = thirdChunk[2];
                ret[9] = thirdChunk[3];
                return ret;
            } 
            else if (value >= 100000000)
            {
                int[] ret = new int[9];
                int firstChunk = value / 100000000;
                ret[0] = firstChunk;
                value -= firstChunk * 100000000;
                int[] secondChunk = ConvertSize4(value / 10000);
                int[] thirdChunk = ConvertSize4(value % 10000);
                ret[1] = secondChunk[0];
                ret[2] = secondChunk[1];
                ret[3] = secondChunk[2];
                ret[4] = secondChunk[3];
                ret[5] = thirdChunk[0];
                ret[6] = thirdChunk[1];
                ret[7] = thirdChunk[2];
                ret[8] = thirdChunk[3];
                return ret;
            }
            else if (value >= 10000000)
            {
                int[] ret = new int[8];
                int[] firstChunk = ConvertSize4(value / 10000);
                int[] secondChunk = ConvertSize4(value % 10000);
                ret[0] = firstChunk[0];
                ret[1] = firstChunk[1];
                ret[2] = firstChunk[2];
                ret[3] = firstChunk[3];
                ret[4] = secondChunk[0];
                ret[5] = secondChunk[1];
                ret[6] = secondChunk[2];
                ret[7] = secondChunk[3];
                return ret;
            }
            else if (value >= 1000000)
            {
                int[] ret = new int[7];
                int[] firstChunk = ConvertSize4(value / 10000);
                int[] secondChunk = ConvertSize4(value % 10000);
                ret[0] = firstChunk[1];
                ret[1] = firstChunk[2];
                ret[2] = firstChunk[3];
                ret[3] = secondChunk[0];
                ret[4] = secondChunk[1];
                ret[5] = secondChunk[2];
                ret[6] = secondChunk[3];
                return ret;
            }
            else if (value >= 100000)
            {
                int[] ret = new int[6];
                int[] firstChunk = ConvertSize4(value / 10000);
                int[] secondChunk = ConvertSize4(value % 10000);
                ret[0] = firstChunk[2];
                ret[1] = firstChunk[3];
                ret[2] = secondChunk[0];
                ret[3] = secondChunk[1];
                ret[4] = secondChunk[2];
                ret[5] = secondChunk[3];
                return ret;
            }
            else
            {
                int[] ret = new int[5];
                int[] chunk = ConvertSize4(value % 10000);
                ret[0] = value / 10000;
                ret[1] = chunk[0];
                ret[2] = chunk[1];
                ret[3] = chunk[2];
                ret[4] = chunk[3];
                return ret;
            }
        }
    
        private static int[] UnclonedConvertSmall(int value)
        {
            int[] ret = memoizedResults2[value];
            if (ret == null)
            {
                ret = value >= 1000 ? new[] { value / 1000, (value / 100) % 10,
                                               (value / 10) % 10, value % 10 }
                  : value >= 100 ? new[] { value / 100, (value / 10) % 10, 
                                             value % 10 }
                  : value >= 10 ? new[] { value / 10, value % 10 }
                  : new int[] { value };
                memoizedResults2[value] = ret;
            }
            return ret;
        }
    
        private static int[] ConvertSize4(int value)
        {
            int[] ret = memoizedResults3[value];
            if (ret == null)
            {
                ret = new[] { value / 1000, (value / 100) % 10,
                             (value / 10) % 10, value % 10 };
                memoizedResults3[value] = ret;
            }
            return ret;
        }
        #endregion UnrolledMemo
    }
    
  • 2021-01-31 20:50

    Just for fun, here's a way to separate all the digits using just one C# statement. It works this way: the regular expression uses the string version of the number, splits apart its digits into a string array, and finally the outer ConvertAll method creates an int array from the string array.

        int num = 1234567890;
    
        int [] arrDigits = Array.ConvertAll<string, int>(
            System.Text.RegularExpressions.Regex.Split(num.ToString(), @"(?!^)(?!$)"),
            str => int.Parse(str)
            );
    
        // resulting array is [1,2,3,4,5,6,7,8,9,0]
    

    Efficiency-wise? I'm unsure how it compares to some of the other fast answers I see here. Somebody would have to test it.
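
    For comparison, the same string-based idea can be written without a regular expression or int.Parse, by subtracting '0' from each character. This is a sketch, not something from the answer above: StringDigits and FromString are hypothetical names, and it assumes num >= 0 so that the string contains only the characters '0' to '9'.

```csharp
public static class StringDigits
{
    // Hypothetical helper; assumes num >= 0 (a leading '-' is not handled).
    public static int[] FromString(int num)
    {
        string s = num.ToString();
        int[] digits = new int[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            // '0'..'9' are consecutive code points, so this yields 0..9.
            digits[i] = s[i] - '0';
        }
        return digits;
    }
}
```

    It still allocates a string per call, so I would expect the purely arithmetic answers to be faster, but I haven't measured it.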

  • 2021-01-31 20:51

    A little loop unrolling perhaps?

    int num = 147483647;
    int nDigits = 1 + Convert.ToInt32(Math.Floor(Math.Log10(num)));
    byte[] array = new byte[10] {
                (byte)(num / 1000000000 % 10),
                (byte)(num / 100000000 % 10),
                (byte)(num / 10000000 % 10),
                (byte)(num / 1000000 % 10),
                (byte)(num / 100000 % 10),
                (byte)(num / 10000 % 10),
                (byte)(num / 1000 % 10),
                (byte)(num / 100 % 10),
                (byte)(num / 10 % 10),
                (byte)(num % 10)};
    // Drop the leading zeros; Skip and ToArray need "using System.Linq;"
    byte[] digits = array.Skip(array.Length - nDigits).ToArray();
    

    Thanks above for the Log10 thingy.. ;)

    There's been some talk of benchmarking...

    I've fully unrolled the loops and compared the result with Jon's accepted memoized variant, and I get consistently quicker times with this:

        static int[] ConvertToArrayOfDigits_unrolled(int num)
        {
            if (num < 10)
            {
                return new int[1] 
                {
                    (num % 10) 
                };
            }
            else if (num < 100)
            {
                return new int[2] 
                {
                    (num / 10 % 10),
                    (num % 10)
                };
            }
            else if (num < 1000)
            {
                return new int[3] {
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 10000)
            {
                return new int[4] {
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 100000)
            {
                return new int[5] {
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 1000000)
            {
                return new int[6] {
                (num / 100000 % 10),
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 10000000)
            {
                return new int[7] {
                (num / 1000000 % 10),
                (num / 100000 % 10),
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 100000000)
            {
                return new int[8] {
                (num / 10000000 % 10),
                (num / 1000000 % 10),
                (num / 100000 % 10),
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else if (num < 1000000000)
            {
                return new int[9] {
                (num / 100000000 % 10),
                (num / 10000000 % 10),
                (num / 1000000 % 10),
                (num / 100000 % 10),
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
            else
            {
                return new int[10] {
                (num / 1000000000 % 10),
                (num / 100000000 % 10),
                (num / 10000000 % 10),
                (num / 1000000 % 10),
                (num / 100000 % 10),
                (num / 10000 % 10),
                (num / 1000 % 10),
                (num / 100 % 10),
                (num / 10 % 10),
                (num % 10)};
            }
        }
    

    I may have messed up somewhere, as I don't have much time for fun and games, but I timed this as 20% quicker.
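
    Since a slip in hand-unrolled code is easy to miss, a quick sanity check against the decimal string of each input may be worth running before trusting any timings. A minimal sketch, with DigitCheck, Agrees and Boundaries all being made-up names; it only covers non-negative input.

```csharp
using System;

public static class DigitCheck
{
    // Values where the digit count changes, plus int.MaxValue.
    public static readonly int[] Boundaries =
    {
        0, 9, 10, 99, 100, 999, 1000, 9999, 10000, 99999,
        100000, 999999, 1000000, 9999999, 10000000, 99999999,
        100000000, 999999999, 1000000000, int.MaxValue
    };

    // Returns true if candidate agrees with n.ToString() for every sample.
    public static bool Agrees(Func<int, int[]> candidate, params int[] samples)
    {
        foreach (int n in samples)
        {
            string expected = n.ToString();
            int[] actual = candidate(n);
            if (actual.Length != expected.Length) return false;
            for (int i = 0; i < actual.Length; i++)
            {
                if (actual[i] != expected[i] - '0') return false;
            }
        }
        return true;
    }
}
```

    Usage would be something like DigitCheck.Agrees(ConvertToArrayOfDigits_unrolled, DigitCheck.Boundaries), ideally followed by a run over random values as well.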
