What is the standardized way to calculate floats with integers?

日久生厌 2021-01-24 12:14

Do any of you know how this will be calculated in C?

uint8_t samplerate = 200;
uint8_t Result;
Result = 0.5 * samplerate;

Now, the problem is t

3 Answers
  • 2021-01-24 12:48

    Yes, there is a standard. The operands of the expression are automatically converted to a common type by the usual arithmetic conversions. Here one operand is a double, so the other is converted to double as well, and your expression is evaluated as follows:

    (0.5: double) * (200: uint8_t) => (0.5: double) * (200.0: double) == (100.0: double)
    uint8_t Result = (100.0: double) => (100: uint8_t) // implicit conversion on assignment, because Result is of type uint8_t

    Converting (200: uint8_t) to (200.0: double) doesn't lose information, since double can represent every uint8_t value exactly. Converting the product back to uint8_t discards any fractional part.

  • 2021-01-24 12:58

    Yes, of course this is controlled by the standard, there is no uncertainty here.

    Basically the integer will be converted to double (since the type of 0.5 is double, not float) and the computation will happen there; then the result will be converted back to uint8_t, discarding any fractional part. The compiler will typically warn about the possible loss of precision. If it does not, enable more warnings (for example, -Wconversion in GCC and Clang).

  • 2021-01-24 13:05

    When numeric values of various types are combined in an expression, they are subject to the usual arithmetic conversions, a set of rules that dictate which operand should be converted and to what type.

    These conversions are spelled out in section 6.3.1.8 of the C standard:

    Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:

    • First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
    • Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
    • Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
    • Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
      • If both operands have the same type, then no further conversion is needed.
      • Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
      • Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
      • Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
      • Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

    Note in particular the second rule of the quoted list (the one for double), which is what applies in your case.

    The floating-point constant 0.5 has type double, so the value of the other operand is converted to type double, and the result of the multiplication operator * has type double. This result is then assigned to a variable of type uint8_t, so the double value is converted to that type for the assignment.

    So in this case Result will have the value 100.
