Should reading negative into unsigned fail via std::cin (gcc, clang disagree)?


For example,

#include <iostream>

int main() {
  unsigned n{};
  std::cin >> n;
  // Print the value read and whether the stream is still in a good state.
  std::cout << n << ' ' << (bool)std::cin << '\n';
}


                      
3 answers

        
                         				            
            
           
            
                              
                
              
              
                
慢半拍i, 2020-12-30 21:06
              
            
            
                                                                       
I think that both are wrong in C++17¹ and that the expected output should be:
4294967295 0

The latest versions of both compilers return the correct value, but I think ios_base::failbit should also be set. I also think the standard is unclear about what "the field to be converted" means, which may account for the current behavior.
The standard says, in [facet.num.get.virtuals#3.3]:

The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:

- For a signed integer value, the function strtoll.
- For an unsigned integer value, the function strtoull.
- For a floating-point value, the function strtold.



So we fall back to std::strtoull, which must return² ULLONG_MAX and not set errno in this case (which is what both compilers do).
But in the same block (emphasis is mine):

The numeric value to be stored can be one of:

- zero, if the conversion function does not convert the entire field.
- the most positive (or negative) representable value, if the field to be converted to a signed integer type represents a value too large positive (or negative) to be represented in val.
- the most positive representable value, if the field to be converted to an unsigned integer type represents a value that cannot be represented in val.
- the converted value, otherwise.


The resultant numeric value is stored in val. If the conversion function does not convert the entire field, or if the field represents a value outside the range of representable values, ios_base::failbit is assigned to err.

Notice that all of this talks about the "field to be converted", not the actual value returned by std::strtoull. The field here is the widened character sequence '-', '1'.
Since the field represents a value (-1) that cannot be represented in an unsigned, the stored value should be UINT_MAX and failbit should be set on std::cin.
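
To see what a given standard library actually does, the extraction can be run against a string instead of std::cin. The following is a minimal sketch; ReadResult and read_unsigned are hypothetical names introduced here for illustration, and no particular result for "-1" is assumed, since that is exactly where implementations have differed.

```cpp
#include <sstream>
#include <string>

// Value stored by operator>> and the stream state afterwards.
struct ReadResult {
    unsigned value;
    bool ok;  // same meaning as (bool)std::cin in the question
};

// Hypothetical helper: runs the same extraction as the question,
// but on an in-memory string so the input is reproducible.
ReadResult read_unsigned(const std::string& text) {
    std::istringstream in(text);
    unsigned n{};
    in >> n;
    return {n, static_cast<bool>(in)};
}
```

Calling read_unsigned("-1") and printing both members reproduces the experiment from the question for whichever library the program is built against.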

¹ clang was actually right prior to C++17, because the third bullet in the above quote was:

- the most negative representable value or zero for an unsigned integer type, if the field represents a value too large negative to be represented in val. ios_base::failbit is assigned to err.

² std::strtoull returns ULLONG_MAX because (thanks @NathanOliver), per C/7.22.1.4.5:

If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1.
[...]
If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).
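
The std::strtoull behavior described in footnote ² can be checked directly. This is a small sketch; StrtoullResult and convert are hypothetical names used only for this illustration.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Outcome of one std::strtoull call: the returned value, the errno it
// left behind, and whether the whole input was consumed.
struct StrtoullResult {
    unsigned long long value;
    int error;
    bool consumed_all;
};

// Hypothetical helper wrapping std::strtoull in base 10.
StrtoullResult convert(const char* text) {
    char* end = nullptr;
    errno = 0;  // strtoull only sets errno on failure, so clear it first
    unsigned long long value = std::strtoull(text, &end, 10);
    return {value, errno, end != text && *end == '\0'};
}
```

For "-1" the result is ULLONG_MAX with errno left at 0, because the parsed 1 is negated in the return type. For an input that is genuinely out of range, such as "99999999999999999999999", the value is also ULLONG_MAX but errno is ERANGE, so the two cases can only be told apart through errno.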

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
无人共我, 2020-12-30 21:10
              
            
            
                                                                       
The question is about differences between the library implementations libc++ and libstdc++, and not so much about differences between the compilers (clang, gcc).

cppreference clears these inconsistencies up pretty well:


  The result of converting a negative number string into an unsigned
  integer was specified to produce zero until C++17, although some
  implementations followed the protocol of std::strtoull which negates
  in the target type, giving ULLONG_MAX for "-1", and so produce the
  largest value of the target type instead. As of C++17, strictly
  following std::strtoull is the correct behavior.


In summary:

- The largest value of the target type (4294967295 here, i.e. UINT_MAX rather than ULLONG_MAX) is correct going forward, since C++17; both libraries now do this.
- Previously it should have been 0 under a strict reading of the standard (which is what libc++ did).
- Some implementations (notably libstdc++) followed the std::strtoull protocol instead, which is now considered the correct behavior.
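
"Negates in the target type" is why the result depends on the width of the variable being read into. The sketch below illustrates that arithmetic only; negate_in_type is a hypothetical helper, not something either library exposes.

```cpp
#include <climits>
#include <type_traits>

// Hypothetical helper: computes 0 - magnitude in the unsigned type T.
// Unsigned arithmetic wraps modulo 2^N, so this is well defined.
template <typename T>
constexpr T negate_in_type(T magnitude) {
    static_assert(std::is_unsigned<T>::value,
                  "T must be an unsigned integer type");
    return static_cast<T>(T{0} - magnitude);
}
```

With this, negate_in_type<unsigned>(1u) is UINT_MAX while negate_in_type<unsigned long long>(1ull) is ULLONG_MAX, which matches the difference between what ends up in an unsigned variable and what std::strtoull itself returns.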




The failbit, and why it was set, might be a more interesting question (at least from the language-lawyer perspective). As of version 7, libc++ (clang) does the same as libstdc++. This suggests that it chose to adopt the C++17 behavior in all modes (even though, by the letter of the standard, the value should be zero before C++17), but so far I've been unable to find a changelog entry or documentation for this change.

The interesting block of text reads (assuming pre-C++17):


  If the conversion function results in a negative value too large to
  fit in the type of v, the most negative representable value is stored
  in v, or zero for unsigned integer types.


According to this, the value is specified to be 0. Additionally, nowhere is it indicated that this should set the failbit.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
囚心锁ツ, 2020-12-30 21:14
              
            
            
                                                                       
The intended semantics of your std::cin >> n statement are those of std::num_get::get(), which is apparently what this operation calls. The semantics of that function have changed, specifically with respect to whether 0 is stored, in C++11 and then again in C++17.

I'm not entirely sure, but I believe these changes may account for the different behavior you're seeing.
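
To look at std::num_get::get() in isolation, it can be called directly on the stream's locale facet, bypassing operator>>. This is only a sketch; NumGetResult and parse_with_num_get are hypothetical names, and the result for "-1" is deliberately not assumed.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Value written by num_get::get and whether it reported failbit in err.
struct NumGetResult {
    unsigned value;
    bool failed;
};

// Hypothetical helper: calls the num_get facet directly on the text.
NumGetResult parse_with_num_get(const std::string& text) {
    std::istringstream in(text);
    const auto& facet = std::use_facet<std::num_get<char>>(in.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;
    unsigned value = 0;
    facet.get(std::istreambuf_iterator<char>(in),
              std::istreambuf_iterator<char>(),
              in, err, value);
    return {value, (err & std::ios_base::failbit) != 0};
}
```

The err argument is where the facet reports failbit, so inspecting it shows whether the library treats "-1" as a conversion failure independently of what the stream's operator>> does with that state afterwards.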
                                                                        
                                                        
            
            
              
                