How to parse signed zero?

北海茫月 2021-01-07 19:37

Is it possible to parse signed zero? I tried several approaches, but none of them gives the proper result:

float test1 = Convert.ToSingle("-0.0");
float test2 = f         


        
3 Answers
  • 2021-01-07 19:57

    I think there is no way to force float.Parse (or Convert.ToSingle) to respect negative zero. It simply works this way: the sign is ignored when the value is zero. So you have to check for it yourself, for example:

    string target = "-0.0";            
    float result = float.Parse(target, CultureInfo.InvariantCulture);
    if (result == 0f && target.TrimStart().StartsWith("-"))
        result = -0f;
    

    If we look at source code for coreclr, we'll see (skipping all irrelevant parts):

    private static bool NumberBufferToDouble(ref NumberBuffer number, ref double value)
    {
        double d = NumberToDouble(ref number);
        uint e = DoubleHelper.Exponent(d);
        ulong m = DoubleHelper.Mantissa(d);
    
        if (e == 0x7FF)
        {
            return false;
        }
    
        if (e == 0 && m == 0)
        {
            d = 0; // < relevant part
        }
    
        value = d;
        return true;
    }
    

    As you can see, if the mantissa and exponent are both zero, the value is explicitly assigned 0, which discards the sign. So there is no way you can change that.

    The full .NET implementation has NumberBufferToDouble as an InternalCall (implemented in native C/C++), but I assume it does something similar.
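
    Because `result == 0f` is true for both `0f` and `-0f`, an ordinary comparison cannot confirm that the workaround kept the sign. A minimal sketch of one way to check it is below. The class and method names are hypothetical, and it relies only on the fact that widening a float to a double preserves the sign bit. On runtimes where float.Parse already keeps the sign, the extra branch is harmless.

```csharp
using System;
using System.Globalization;

public static class SignedZero
{
    // True only for -0f: the value compares equal to zero and its sign bit is set.
    // DoubleToInt64Bits returns a negative long exactly when the sign bit is set.
    public static bool IsNegativeZero(float value)
    {
        return value == 0f && BitConverter.DoubleToInt64Bits(value) < 0;
    }

    // Hypothetical wrapper around the workaround shown above:
    // parse normally, then restore the sign if the text started with '-'.
    public static float ParseKeepingZeroSign(string text)
    {
        float result = float.Parse(text, CultureInfo.InvariantCulture);
        if (result == 0f && text.TrimStart().StartsWith("-", StringComparison.Ordinal))
            result = -0f;
        return result;
    }
}
```

    With this, `SignedZero.IsNegativeZero(SignedZero.ParseKeepingZeroSign("-0.0"))` is true, while the same call with "0.0" is false.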

  • 2021-01-07 20:01

    You can try this:

    string target = "-0.0";  
    decimal result= (decimal.Parse(target,
                     System.Globalization.NumberStyles.AllowParentheses |
                     System.Globalization.NumberStyles.AllowLeadingWhite |
                     System.Globalization.NumberStyles.AllowTrailingWhite |
                     System.Globalization.NumberStyles.AllowThousands |
                     System.Globalization.NumberStyles.AllowDecimalPoint |
                     System.Globalization.NumberStyles.AllowLeadingSign));
    
  • 2021-01-07 20:02

    Updated Results

    Summary

    Mode            : Release
    Test Framework  : .NET Framework 4.7.1
    Benchmark runs  : 100 times (averaged/scale)
    
    Test limited to 10 digits
    Name            |      Time |    Range | StdDev |      Cycles | Pass
    -----------------------------------------------------------------------
    Mine Unchecked  |  9.645 ms | 0.259 ms |   0.30 |  32,815,064 | Yes
    Mine Unchecked2 | 10.863 ms | 1.337 ms |   0.35 |  36,959,457 | Yes
    Mine Safe       | 11.908 ms | 0.993 ms |   0.53 |  40,541,885 | Yes
    float.Parse     | 26.973 ms | 0.525 ms |   1.40 |  91,755,742 | Yes
    Evk             | 31.513 ms | 1.515 ms |   7.96 | 103,288,681 | Base
    
    
    Test limited to 38 digits
    Name            |      Time |    Range | StdDev |      Cycles | Pass
    -----------------------------------------------------------------------
    Mine Unchecked  | 17.694 ms | 0.276 ms |   0.50 |  60,178,511 | No
    Mine Unchecked2 | 23.980 ms | 0.417 ms |   0.34 |  81,641,998 | Yes
    Mine Safe       | 25.078 ms | 0.124 ms |   0.63 |  85,306,389 | Yes
    float.Parse     | 36.985 ms | 0.052 ms |   1.60 | 125,929,286 | Yes
    Evk             | 39.159 ms | 0.406 ms |   3.26 | 133,043,100 | Base
    
    
    Test limited to 98 digits (way over the range of a float)
    Name            |      Time |    Range | StdDev |      Cycles | Pass
    -----------------------------------------------------------------------
    Mine Unchecked2 | 46.780 ms | 0.580 ms |   0.57 | 159,272,055 | Yes
    Mine Safe       | 48.048 ms | 0.566 ms |   0.63 | 163,601,133 | Yes
    Mine Unchecked  | 48.528 ms | 1.056 ms |   0.58 | 165,238,857 | No
    float.Parse     | 55.935 ms | 1.461 ms |   0.95 | 190,456,039 | Yes
    Evk             | 56.636 ms | 0.429 ms |   1.75 | 192,531,045 | Base
    

    As the tables show, Mine Unchecked is fine for smaller numbers. However, because it uses a power of 10 at the end of the calculation to scale the fractional part, it does not work for larger digit combinations. Since only powers of 10 are involved, the power lookup is just a big switch statement, which makes it marginally faster.

    Background

    OK, because of the various comments I got and the work I put into this, I thought I'd rewrite this post with the most accurate benchmarks I could get, along with all the logic behind them.

    When this question first came up, I had already written my own benchmark framework, and I often like writing a quick parser for these things using unsafe code. Nine times out of ten I can get this stuff running faster than the framework equivalent.

    At first this was easy: just write some simple logic to parse numbers with decimal places. I did pretty well, but the initial results weren't as accurate as they could have been, because my test data only used the 'f' format specifier, which turned higher-precision numbers into short strings containing only 0s.

    In the end I just couldn't write a reliable parser to deal with exponent notation, i.e. 1.2324234233E+23. The only way I could get the maths to work was using BigInteger and lots of hacks to force the right precision into a floating-point value. This turned out to be super slow. I even went to the IEEE float specs and tried to do the maths to construct the value bit by bit. That wasn't that hard, but the formula has loops in it and was complicated to get right. In the end I had to give up on exponent notation.

    So this is what I ended up with

    My testing framework runs on a list of 10,000 floats as strings. The list is shared across the tests and regenerated for each test run. A test run just goes through each test (remember, it's the same data for each test), adds up the results, and then averages them. This is about as good as it can get. I can increase the runs to 1,000 or more, but the results don't really change. Because we are testing a method that basically takes one variable (a string representation of a float), there is no point scaling this, as it's not set based. However, I can tweak the input to cater for different lengths of floats, i.e. strings that are 10, 20, right up to 98 digits long. Remember that a float only goes up to 38 digits anyway.

    To check the results I used the following. I have previously written a unit test that covers every conceivable float, and the parsers pass it, except for the variation where I use powers to calculate the decimal part of the number.

    Note: my framework only checks one result set, and this check is not part of the framework itself.

    private bool Action(List<float> floats, List<float> list)
    {
       if (floats.Count != list.Count)
          return false; // sanity check
    
       for (int i = 0; i < list.Count; i++)
       {
          // nan is a special case as there is more than one possible bit value
          // for it
          if (  floats[i] != list[i] && !float.IsNaN(floats[i]) && !float.IsNaN(list[i]))
             return false;
       }
    
       return true;
    }
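
    One caveat about the check above: `!=` treats `0f` and `-0f` as equal, so it cannot tell whether a parser kept the sign of zero, and a NaN paired with an ordinary number also slips through. A stricter comparison might look like the sketch below. The names are hypothetical and this is not part of the benchmark framework.

```csharp
using System;
using System.Collections.Generic;

public static class FloatListComparer
{
    // Two floats match when both are NaN (any payload), or when their values
    // are equal *and* their sign bits agree, so 0f and -0f count as different.
    public static bool SameFloat(float a, float b)
    {
        if (float.IsNaN(a) || float.IsNaN(b))
            return float.IsNaN(a) && float.IsNaN(b);

        bool aNegative = BitConverter.DoubleToInt64Bits(a) < 0;
        bool bNegative = BitConverter.DoubleToInt64Bits(b) < 0;
        return a == b && aNegative == bNegative;
    }

    // Element-wise comparison of two result lists using SameFloat.
    public static bool SameList(List<float> expected, List<float> actual)
    {
        if (expected.Count != actual.Count)
            return false; // sanity check

        for (int i = 0; i < expected.Count; i++)
        {
            if (!SameFloat(expected[i], actual[i]))
                return false;
        }

        return true;
    }
}
```

    With this comparison, a parser that returns `0f` for "-0" would fail the check instead of passing silently.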
    

    In this case I'm testing against 3 types of input, as shown below.

    Setup

    // numberDecimalDigits specifies how long the output will be
    private static NumberFormatInfo GetNumberFormatInfo(int numberDecimalDigits)
    {
       return new NumberFormatInfo
                   {
                      NumberDecimalSeparator = ".",
                      NumberDecimalDigits = numberDecimalDigits
                   };
    }
    
    // generate a random float by creating an int and reinterpreting its bits as a float via pointers
    
    private static unsafe string GetRadomFloatString(IFormatProvider formatInfo)
    {
       var val = Rand.Next(0, int.MaxValue);
       if (Rand.Next(0, 2) == 1)
          val *= -1;
       var f = *(float*)&val;
       return f.ToString("f", formatInfo);
    }
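
    The pointer cast `*(float*)&val` reinterprets the int's 32 bits as a float. If unsafe code is not an option, the same reinterpretation can be sketched with BitConverter. The helper name below is hypothetical, and this version allocates a small byte array, so it is likely slower than the pointer cast.

```csharp
using System;

public static class FloatBits
{
    // Reinterprets the 32 bits of an int as a float without unsafe code.
    public static float FromInt32Bits(int bits)
    {
        return BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    }
}
```

    For example, `FloatBits.FromInt32Bits(0x3F800000)` gives `1f`, since 0x3F800000 is the IEEE 754 bit pattern for 1.0.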
    

    Test Data 1

    // limits the output to 10 decimal digits
    // because of that, it also has to check for truncated values and
    // regenerate them
    public static List<string> GenerateInput10(int scale)
    {
       var result = new List<string>(scale);
       while (result.Count < scale)
       {
          var val = GetRadomFloatString(GetNumberFormatInfo(10));
          if (val != "0.0000000000")
             result.Add(val);
       }
    
       result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, "-0");
       result.Insert(0, "0.00");
       result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
       return result;
    }
    

    Test Data 2

    // basically the max length for a float
    public static List<string> GenerateInput38(int scale)
    {
    
       var result = Enumerable.Range(1, scale)
                               .Select(x => GetRadomFloatString(GetNumberFormatInfo(38)))
                               .ToList();
    
       result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, "-0");
       result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
       return result;
    }
    

    Test Data 3

    // Let's take this to the limit
    public static List<string> GenerateInput98(int scale)
    {
    
       var result = Enumerable.Range(1, scale)
                               .Select(x => GetRadomFloatString(GetNumberFormatInfo(98)))
                               .ToList();
    
       result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, "-0");
       result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
       result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
       return result;
    }
    

    These are the tests I used

    Evk

    private float ParseMyFloat(string value)
    {
       var result = float.Parse(value, CultureInfo.InvariantCulture);
       if (result == 0f && value.TrimStart()
                                  .StartsWith("-"))
       {
          result = -0f;
       }
       return result;
    }
    

    Mine safe

    I call it safe as it tries to check for invalid strings

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private unsafe float ParseMyFloat(string value)
    {
       double result = 0, dec = 0;
    
       if (value[0] == 'N' && value == "NaN") return float.NaN;
       if (value[0] == 'I' && value == "Infinity")return float.PositiveInfinity;
       if (value[0] == '-' && value[1] == 'I' && value == "-Infinity")return float.NegativeInfinity;
    
    
       fixed (char* ptr = value)
       {
          char* l, e;
          char* start = ptr, length = ptr + value.Length;
    
          if (*ptr == '-') start++;
    
    
          // check the bounds before dereferencing the pointer
          for (l = start; l < length && *l >= '0' && *l <= '9'; l++)
             result = result * 10 + *l - 48;
    
    
          if (*l == '.')
          {
             char* r;
             for (r = length - 1; r > l && *r >= '0' && *r <= '9'; r--)
                dec = (dec + (*r - 48)) / 10;
    
             if (l != r)
                throw new FormatException($"Invalid float : {value}");
          }
          else if (l != length)
             throw new FormatException($"Invalid float : {value}");
    
          result += dec;
    
          return *ptr == '-' ? (float)result * -1 : (float)result;
       }
    }
    

    Unchecked

    This fails for larger strings, but is ok for smaller ones

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private unsafe float ParseMyFloat(string value)
    {
       if (value[0] == 'N' && value == "NaN") return float.NaN;
       if (value[0] == 'I' && value == "Infinity") return float.PositiveInfinity;
       if (value[0] == '-' && value[1] == 'I' && value == "-Infinity") return float.NegativeInfinity;
    
       fixed (char* ptr = value)
       {
          var point = 0;
          double result = 0, dec = 0;
    
          char* c, start = ptr, length = ptr + value.Length;
    
          if (*ptr == '-') start++;   
    
          for (c = start; c < length && *c != '.'; c++)
             result = result * 10 + *c - 48;
    
          if (*c == '.')
          {
             point = (int)(length - 1 - c);
             for (c++; c < length; c++)
                dec = dec * 10 + *c - 48;
          }
    
          // MyPow is just a massive switch statement
          if (dec > 0)
             result += dec / MyPow(point);
    
          return *ptr == '-' ? (float)result * -1 : (float)result;
       }
    }
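
    The answer does not show MyPow; it is only described as a massive switch statement. A guess at its general shape is sketched below, assuming it returns 10 raised to the number of fractional digits. The class name, the number of cases, and the Math.Pow fallback are all assumptions made for illustration.

```csharp
using System;

public static class PowerTable
{
    // Hypothetical stand-in for MyPow: returns 10^exponent from a switch
    // for small exponents and falls back to Math.Pow for larger ones.
    public static double MyPow(int exponent)
    {
        switch (exponent)
        {
            case 0: return 1d;
            case 1: return 1e1;
            case 2: return 1e2;
            case 3: return 1e3;
            case 4: return 1e4;
            case 5: return 1e5;
            case 6: return 1e6;
            case 7: return 1e7;
            case 8: return 1e8;
            case 9: return 1e9;
            case 10: return 1e10;
            default: return Math.Pow(10, exponent);
        }
    }
}
```

    A plausible reason this variant fails on longer inputs is that `dec` accumulates every fractional digit as one large double before the division, and a double holds only about 15 to 17 significant decimal digits, so precision is presumably lost well before 38 digits.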
    

    Unchecked 2

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private unsafe float ParseMyFloat(string value)
    {
    
       if (value[0] == 'N' && value == "NaN") return float.NaN;
       if (value[0] == 'I' && value == "Infinity") return float.PositiveInfinity;
       if (value[0] == '-' && value[1] == 'I' && value == "-Infinity") return float.NegativeInfinity;
    
    
       fixed (char* ptr = value)
       {
          double result = 0, dec = 0;
    
          char* c, start = ptr, length = ptr + value.Length;
    
          if (*ptr == '-') start++;
    
          for (c = start; c < length && *c != '.'; c++)
             result = result * 10 + *c - 48;     
    
          // this division seems unsafe for a double,
          // however I have tested it with every float and it works
          if (*c == '.')
             for (var d = length - 1; d > c; d--)
                dec = (dec + (*d - 48)) / 10;
    
          result += dec;
    
          return *ptr == '-' ? (float)result * -1 : (float)result;
       }
    }
    

    float.Parse

    float.Parse(t, CultureInfo.InvariantCulture)
    

    Original Answer

    Assuming you don't need a TryParse method, I managed to use pointers and custom parsing to achieve what I think you want.

    The benchmark uses a list of 1,000,000 random floats and runs each version 100 times; all versions use the same data.

    Test Framework : .NET Framework 4.7.1
    
    Scale : 1000000
    Name             |        Time |     Delta |  Deviation |       Cycles
    ----------------------------------------------------------------------
    Mine Unchecked2  |   45.585 ms |  1.283 ms |       1.70 |  155,051,452
    Mine Unchecked   |   46.388 ms |  1.812 ms |       1.17 |  157,751,710
    Mine Safe        |   46.694 ms |  2.651 ms |       1.07 |  158,697,413
    float.Parse      |  173.229 ms |  4.795 ms |       5.41 |  589,297,449
    Evk              |  287.931 ms |  7.447 ms |      11.96 |  979,598,364
    

    Chopped for brevity

    Note: neither of these versions can deal with extended format, NaN, +Infinity, or -Infinity. However, that wouldn't be hard to implement with little overhead.

    I have checked this pretty well, though I must admit I haven't written any unit tests, so use at your own risk.

    Disclaimer: I think Evk's StartsWith version could probably be optimized further; however, it will still be (at best) slightly slower than float.Parse.
