Do I need to explicitly handle negative numbers or zero when summing squared digits?

前端 未结 9 1210
無奈伤痛
無奈伤痛 2021-01-30 01:51

I recently had a test in my class. One of the problems was the following:

Given a number n, write a function in C/C++ that returns the su

相关标签:
9条回答
  • 2021-01-30 02:21

    NOTE: AS I was writing this answer, you did clarify that you are using C. The majority of my answer is about C++. However, since your title still has C++ and the question is still tagged C++, I have chosen to answer anyway in case this is still useful to other people, especially since most of the answers I've seen till now are mostly unsatisfactory.

    In modern C++ (Note: I don't really know where C stands on this), your professor seems to be wrong on both counts.

    First is this part right here:

    if (n == 0) {
            return 0;
    }
    

    In C++, this is basically the same thing as:

    if (!n) {
            return 0;
    }
    

    That means your while is equivalent to something like this:

    while(n != 0) {
        // some implementation
    }
    

    That means since you are merely exiting in your if when the while wouldn't execute anyway, there really isn't a reason to put this if here, since what you are doing after the loop and in the if are equivalent anyway. Although I should say that is for some reason these were different, you'd need to have this if.

    So really, this if statement isn't particularly useful unless I'm mistaken.

    The second part is where things get hairy:

    if (n < 0) {
        n = n * (-1);
    }  
    

    The heart of the issue is is what the output of the modulus of a negative number outputs.

    In modern C++, this seems to be mostly well defined:

    The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined. For integral operands the / operator yields the algebraic quotient with any fractional part discarded; if the quotient a/b is representable in the type of the result, (a/b)*b + a%b is equal to a.

    And later:

    If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.

    As the poster of the quoted answer correctly points out, the important part of this equation right here:

    (a/b)*b + a%b

    Taking an example of your case, you'd get something like this:

    -13/ 10 = -1 (integer truncation)
    -1 * 10 = -10
    -13 - (-10) = -13 + 10 = -3 
    

    The only catch is that last line:

    If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.

    That means that in a case like this, only the sign seems to be implementation-defined. That shouldn't be a problem in your case because, because you are squaring this value anyway.

    That said, keep in mind that this doesn't necessarily apply to earlier versions of C++, or C99. If that is what your professor is using, that could be why.


    EDIT: Nope, I'm wrong. This seems to be the case for C99 or later as well:

    C99 requires that when a/b is representable:

    (a/b) * b + a%b shall equal a

    And another place:

    When integers are divided and the division is inexact, if both operands are positive the result of the / operator is the largest integer less than the algebraic quotient and the result of the % operator is positive. If either operand is negative, whether the result of the / operator is the largest integer less than the algebraic quotient or the smallest integer greater than the algebraic quotient is implementation-defined, as is the sign of the result of the % operator. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.

    Does either ANSI C or ISO C specify what -5 % 10 should be?

    So, yeah. Even in C99, this doesn't seem to affect you. The equation is the same.

    0 讨论(0)
  • 2021-01-30 02:22

    The problem statement is confusing, but the numerical example clarifies the meaning of the sum of the digits of the number squared. Here is an improved version:

    Write a function in the common subset of C and C++ that takes an integer n in the range [-107, 107] and returns the sum of the squares of the digits of its representation in base 10. Example: if n is 123, your function should return 14 (12 + 22 + 32 = 14).

    The function that you wrote is fine except for 2 details:

    • The argument should have type long to accommodate for all values in the specified range as type long is guaranteed by the C Standard to have at least 31 value bits, hence a range sufficient to represent all values in [-107, 107]. (Note that type int is sufficient for the return type, whose maximum value is 568.)
    • The behavior of % for negative operands is non-intuitive and its specification varied between the C99 Standard and previous editions. You should document why your approach is valid even for negative inputs.

    Here is a modified version:

    int sum_of_digits_squared(long n) {
        int s = 0;
    
        while (n != 0) {
            /* Since integer division is defined to truncate toward 0 in C99 and C++98 and later,
               the remainder of this division is positive for positive `n`
               and negative for negative `n`, and its absolute value is the last digit
               of the representation of `n` in base 10.
               Squaring this value yields the expected result for both positive and negative `c`.
               dividing `n` by 10 effectively drops the last digit in both cases.
               The loop will not be entered for `n == 0`, producing the correct result `s = 0`.
             */
            int c = n % 10;
            s += c * c;
            n /= 10;
        }
        return s;
    }
    

    The teacher's answer has multiple flaws:

    • type int may have an insufficient range of values.
    • there is no need to special case the value 0.
    • negating negative values is unnecessary and may have undefined behavior for n = INT_MIN.

    Given the extra constraints in the problem statement (C99 and range of values for n), only the first flaw is an issue. The extra code still produces the correct answers.

    You should get a good mark in this test, but the explanation is required in a written test to show your understanding of the issues for negative n, otherwise the teacher may assume that you were unaware and just got lucky. In an oral exam, you would have gotten a question and your answer would have nailed it.

    0 讨论(0)
  • 2021-01-30 02:26

    Your code is perfectly fine

    You are absolutely correct and your teacher is wrong. There is absolutely no reason at all to add that extra complexity, since it does not affect the result at all. It even introduces a bug. (See below)

    First, the separate check if n is zero is obviously completely unnecessary and this is very easy to realize. To be honest, I actually question your teachers competence if he has objections about this. But I guess everybody can have a brain fart from time to time. However, I DO think that while(n) should be changed to while(n != 0) because it adds a little bit extra clarity without even costing an extra line. It's a minor thing though.

    The second one is a bit more understandable, but he is still wrong.

    This is what the C11 standard 6.5.5.p6 says:

    If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is undefined.

    The footnote says this:

    This is often called "truncation toward zero".

    Truncation toward zero means that the absolute value for a/b is equal to the absolute value for (-a)/b for all a and b, which in turn means that your code is perfectly fine.

    Modulo is easy math, but may be counterintuitive

    However, your teacher does have a point that you should be careful, because the fact that you're squaring the result is actually crucial here. Calculating a%b according to above definition is easy math, but it might go against your intuition. For multiplication and division, the result is positive if the operands have equal sign. But when it comes to modulo, the result has the same sign as the first operand. The second operand does not affect the sign at all. For instance, 7%3==1 but (-7)%(-3)==(-1).

    Here is a snippet demonstrating it:

    $ cat > main.c 
    #include <stdio.h>
    
    void f(int a, int b) 
    {
        printf("a: %2d b: %2d a/b: %2d a\%b: %2d (a%b)^2: %2d (a/b)*b+a%b==a: %5s\n",
               a, b ,a/b, a%b, (a%b)*(a%b), (a/b)*b+a%b == a ? "true" : "false");
    }
    
    int main(void)
    {
        int a=7, b=3;
        f(a,b);
        f(-a,b);
        f(a,-b);
        f(-a,-b);
    }
    
    $ gcc main.c -Wall -Wextra -pedantic -std=c99
    
    $ ./a.out
    a:  7 b:  3 a/b:  2 a%b:  1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
    a: -7 b:  3 a/b: -2 a%b: -1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
    a:  7 b: -3 a/b: -2 a%b:  1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
    a: -7 b: -3 a/b:  2 a%b: -1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
    

    So, ironically, your teacher proved his point by being wrong.

    Your teacher's code is flawed

    Yes, it actually is. If the input is INT_MIN AND the architecture is two's complement AND the bit pattern where the sign bit is 1 and all value bits are 0 is NOT a trap value (using two's complement without trap values is very common) then your teacher's code will yield undefined behavior on the line n = n * (-1). Your code is - if ever so slightly - better than his. And considering introducing a small bug by making the code unnecessary complex and gaining absolutely zero value, I'd say that your code is MUCH better.

    In other words, in compilations where INT_MIN = -32768 (even though the resulting function cannot receive an input that is < -32768 or > 32767), the valid input of -32768 causes undefined behavior, because the result of -(-32768i16) cannot be expressed as a 16-bit integer. (Actually, -32768 probably would not cause an incorrect result, because -(-32768i16) usually evaluates to -32768i16, and your program handles negative numbers correctly.) (SHRT_MIN could be -32768 or -32767, depending on the compiler.)

    But your teacher explicitly stated that n can be in the range [-10^7; 10^7]. A 16-bit integer is too small; you would have to use [at least] a 32-bit integer. Using int might seem to make his code safe, except that int is not necessarily a 32-bit integer. If you compile for a 16-bit architecture, both of your code snippets are flawed. But your code is still much better because this scenario reintroduces the bug with INT_MIN mentioned above with his version. To avoid this, you can write long instead of int, which is a 32-bit integer on either architecture. A long is guaranteed to be able to hold any value in the range [-2147483647; 2147483647]. C11 Standard 5.2.4.2.1 LONG_MIN is often -2147483648 but the maximum (yes, maximum, it's a negative number) allowed value for LONG_MIN is -2147483647.

    What changes would I make to your code?

    Your code is fine as it is, so these are not really complaints. It's more like that if I really, really need to say anything about your code, there are some small things that could make it just a tiny bit clearer.

    • The names of the variables could be a little bit better, but it is a short function that is easy to understand, so it's not a big deal.
    • You could change the condition from n to n!=0. Semantically, it's 100% equivalent, but it makes it a little bit clearer.
    • Move declaration of c (which I renamed to digit) to inside the while loop since it's only used there.
    • Change argument type to long to ensure it can handle the whole input set.
    int sum_of_digits_squared(long n) 
    {
        long sum = 0;
    
        while (n != 0) {
            int digit = n % 10;
            sum += (digit * digit);
            n /= 10;
        }
    
        return sum;
    }
    

    Actually, this can be a little bit misleading because - as mentioned above - the variable digit can get a negative value, but a digit is in itself never either positive or negative. There are a few ways around this, but this is REALLY nitpicking, and I would not care for such small details. Especially the separate function for last digit is taking it too far. Ironically, this is one of the things that your teachers code actually solves.

    • Change sum += (digit * digit) to sum += ((n%10)*(n%10)) and skip the variable digit completely.
    • Change the sign of digit if negative. But I would strongly advice against making the code more complex just to make a variable name make sense. That's a VERY strong code smell.
    • Create a separate function that extracts the last digit. int last_digit(long n) { int digit=n%10; if (digit>=0) return digit; else return -digit; } This is useful if you want to use that function somewhere else.
    • Just name it c as you originally do. That variable name does not give any useful information, but on the other hand, it's not misleading either.

    But to be honest, at this point you should move on to more important work. :)

    0 讨论(0)
  • 2021-01-30 02:28

    I wouldn't argue about whether the original or the modern definition of '%' is better but anyone who writes two return statement into such a short function shouldn't teach C programming at all. Extra return is a goto statement and we don't use goto in C. Furthermore the code without the zero check would have the same result, extra return made it harder to read.

    0 讨论(0)
  • 2021-01-30 02:32

    As others have pointed out, the special treatment for n==0 is nonsense, since for every serious C programmer it is obvious that "while(n)" does the job.

    The behaviour for n<0 is not that obvious, that's why I would prefer to see those 2 lines of code:

    if (n < 0) 
        n = -n;
    

    or at least a comment:

    // don't worry, works for n < 0 as well
    

    Honestly, at what time did you start considering that n might be negative? When writing the code or when reading your teacher's remarks?

    0 讨论(0)
  • 2021-01-30 02:37

    Summarizing a discussion that's been percolating in the comments:

    • There is no good reason to test in advance for n == 0. The while(n) test will handle that case perfectly.
    • It's likely your teacher is still used to earlier times, when the result of % with negative operands was differently defined. On some old systems (including, notably, early Unix on a PDP-11, where Dennis Ritchie originally developed C), the result of a % b was always in the range [0 .. b-1], meaning that -123 % 10 was 7. On such a system, the test in advance for n < 0 would be necessary.

    But the second bullet applies only to earlier times. In the current versions of both the C and C++ standards, integer division is defined to truncate towards 0, so it turns out that n % 10 is guaranteed to give you the (possibly negative) last digit of n even when n is negative.

    So the answer to the question "What is the meaning of while(n)?" is "Exactly the same as while(n != 0)", and the answer to "Will this code work properly for negative as well as positive n?" is "Yes, under any modern, Standards-conforming compiler." The answer to the question "Then why did the instructor mark it down?" is probably that they're not aware of a significant language redefinition that happened to C in 1999 and to C++ in 2010 or so.

    0 讨论(0)
提交回复
热议问题