Why can't Python's raw string literals end with a single backslash?


Technically, any odd number of backslashes, as described in the documentation.

>>> r'\'
  File "", line 1
    r'\'
       ^
SyntaxError


                      
12 Answers

小蘑菇
2020-11-22 07:57

The common misconception about Python's raw strings is that a backslash (within a raw string) is just a regular character like any other. It is NOT. The key to understanding this is the following passage from Python's tutorial:


  When an 'r' or 'R' prefix is present, a character following a
  backslash is included in the string without change, and all
  backslashes are left in the string


So any character following a backslash is part of the raw string. Once the parser enters a raw string (a non-Unicode one) and encounters a backslash, it knows there are 2 characters (a backslash and the character following it).

This way:


  r'abc\d' comprises a, b, c, \, d
  
  r'abc\'d' comprises a, b, c, \, ', d
  
  r'abc\'' comprises a, b, c, \, '


and:


  r'abc\' comprises a, b, c, \, ' but there is no terminating quote now.


The last case shows that, according to the documentation, the parser cannot find a closing quote, because the last quote you see above is part of the string. In other words, a backslash cannot be the last character here, as it will 'devour' the string's closing quote.
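
The breakdown above can be checked with a short sketch. It assumes Python 3 and uses ast.literal_eval to parse the source text of each literal. The helper name chars_of_literal is made up for this illustration, and the backslash is written as chr(92) so that the example never needs a literal ending in a backslash.

```python
import ast

BS = chr(92)  # a single backslash character


def chars_of_literal(source):
    """Parse `source` as a Python literal and return its characters as a list.

    Returns None when the text is not a complete literal.
    (Hypothetical helper, written only for this illustration.)
    """
    try:
        return list(ast.literal_eval(source))
    except SyntaxError:
        return None


# Source text of the four literals discussed above.
literals = [
    "r'abc" + BS + "d'",   # backslash followed by d
    "r'abc" + BS + "'d'",  # backslash followed by a quote, then d
    "r'abc" + BS + "''",   # backslash followed by a quote, then the closing quote
    "r'abc" + BS + "'",    # backslash followed by a quote, nothing left to close
]

for source in literals:
    print(source, "->", chars_of_literal(source))
```

The first three literals yield their characters with the backslash intact. The last one yields None, because the parser reports a SyntaxError for the unterminated string.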
                                                                        
                                                        
            
            
              
                
忘了有多久
2020-11-22 07:59

The reason is explained in the part of that section which I highlighted in bold:


  String quotes can be escaped with a
  backslash, but the backslash remains
  in the string; for example, r"\"" is a
  valid string literal consisting of two
  characters: a backslash and a double
  quote; r"\" is not a valid string
  literal (even a raw string cannot end
  in an odd number of backslashes).
  Specifically, a raw string cannot end
  in a single backslash (since the
  backslash would escape the following
  quote character). Note also that a
  single backslash followed by a newline
  is interpreted as those two characters
  as part of the string, not as a line
  continuation.


So raw strings are not 100% raw; there is still some rudimentary backslash processing.
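
A small sketch of the odd/even rule quoted above, assuming Python 3. The helper raw_literal_parses is a name invented here; it builds the source text of a raw literal that ends in n backslashes and asks ast.literal_eval whether it parses.

```python
import ast

BS = chr(92)  # a single backslash character


def raw_literal_parses(n_backslashes):
    """Return True if a raw literal made of n backslashes is valid Python.

    (Hypothetical helper, written only for this illustration.)
    """
    source = 'r"' + BS * n_backslashes + '"'
    try:
        ast.literal_eval(source)
    except SyntaxError:
        return False
    return True


for n in range(1, 5):
    print(n, "backslash(es) before the closing quote:", raw_literal_parses(n))

# The documented two-character example: a backslash and a double quote.
two_chars = r"\""
print(len(two_chars), two_chars == BS + '"')
```

Odd counts (1 and 3) fail to parse, even counts (2 and 4) succeed, and r"\"" has length 2.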
                                                                        
                                                        
            
            
              
                
失恋的感觉
2020-11-22 07:59

Another user who has since deleted their answer (not sure if they'd like to be credited) suggested that the Python language designers may be able to simplify the parser design by using the same parsing rules and expanding escaped characters to raw form as an afterthought (if the literal was marked as raw).

I thought it was an interesting idea and am including it as community wiki for posterity.
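
As a rough illustration of that idea, here is a toy sketch. It is not CPython's actual tokenizer. The names scan_body and toy_value and the tiny escape table are invented for this example. It assumes one shared scanning step that always lets a backslash consume the next character, followed by a second step that expands escapes only when the literal is not raw.

```python
BS = chr(92)  # a single backslash character

# Deliberately tiny escape table; real Python supports many more escapes.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", BS: BS, "'": "'", '"': '"'}


def scan_body(text, quote="'"):
    """Shared step: `text` starts right after the opening quote.

    Returns the body up to the closing quote, or None if none is found.
    A backslash always swallows the next character, raw or not.
    """
    i = 0
    while i < len(text):
        if text[i] == BS:
            i += 2  # skip the backslash and whatever follows it
        elif text[i] == quote:
            return text[:i]
        else:
            i += 1
    return None


def toy_value(body, raw):
    """Second step: keep the body unchanged if raw, else expand escapes."""
    if raw:
        return body
    out = []
    i = 0
    while i < len(body):
        if body[i] == BS and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(SIMPLE_ESCAPES.get(nxt, BS + nxt))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


print(scan_body("abc" + BS + "'"))  # no closing quote is found
print(repr(toy_value("a" + BS + "n", raw=True)))
print(repr(toy_value("a" + BS + "n", raw=False)))
```

Under this design the rule for finding the end of a literal is identical for raw and ordinary strings, which is why a trailing backslash swallows the closing quote in both.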
                                                                        
                                                        
            
            
              
                
死守一世寂寞
2020-11-22 08:04

  Despite its role, even a raw string cannot end in a single
  backslash, because the backslash escapes the following quote
  character—you still must escape the surrounding quote character to
  embed it in the string.  That is, r"...\" is not a valid string
  literal—a raw string cannot end in an odd number of backslashes.

  If you need to end a raw string with a single backslash, you can use
  two and slice off the second.
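
A minimal sketch of the "use two and slice off the second" workaround, assuming Python 3. The path is an arbitrary example chosen for illustration.

```python
BS = chr(92)  # a single backslash character

# Two backslashes are legal at the end of a raw string; both stay in the value.
doubled = r"C:\temp\\"

# Slice off the second backslash to leave exactly one at the end.
single = doubled[:-1]

print(len(doubled), len(single))
print(single.endswith(BS), single.endswith(BS * 2))
```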

                                                                        
                                                        
            
            
              
                
故里飘歌
2020-11-22 08:05

Another trick is to use chr(92) as it evaluates to "\". 

I recently had to clean a string of backslashes and the following did the trick:

CleanString = DirtyString.replace(chr(92), '')


I realize that this does not take care of the "why" but the thread attracts many people looking for a solution to an immediate problem.
                                                                        
                                                        
            
            
              
                
隐瞒了意图╮
2020-11-22 08:07

Some tips:

1) If you need to manipulate backslashes in a path, the standard Python module os.path is your friend. For example:


  os.path.normpath('c:/folder1/')


2) If you want to build strings containing backslashes BUT without a backslash at the END of the string, a raw string is your friend (use the 'r' prefix before your string literal). For example:

r'\one \two \three'


3) If you need to prefix a string in a variable X with a backslash, you can do this:

X = 'dummy'
bs = r'\ '  # don't forget the space after the backslash or you will get an EOL error
X2 = bs[0] + X  # X2 now contains \dummy


4) If you need to create a string with a backslash at the end, combine tips 2 and 3:

voice_name = 'upper'
lilypond_display = r'\DisplayLilyMusic \ '  # don't forget the space at the end
lilypond_statement = lilypond_display[:-1] + voice_name


Now lilypond_statement contains "\DisplayLilyMusic \upper".
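
The tips above can be tried together in one short sketch, assuming Python 3. It imports ntpath (the Windows flavour of os.path) so the path calls behave the same on any platform; that substitution is an assumption made here, not part of the original tips. It also checks that the trailing-space trick gives the same result as concatenating an ordinary escaped backslash.

```python
import ntpath  # what os.path is on Windows; importable on any platform

BS = chr(92)  # a single backslash character

# Tip 1: let the path module deal with separators.
normalized = ntpath.normpath("c:/folder1/")
with_trailing_sep = ntpath.join("c:" + BS + "folder1", "")
print(normalized)
print(with_trailing_sep)

# Tips 3 and 4: the trailing-space trick versus an escaped backslash.
voice_name = "upper"
via_trick = r"\DisplayLilyMusic \ "[:-1] + voice_name
via_escape = r"\DisplayLilyMusic " + "\\" + voice_name
print(via_trick == via_escape)
```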

Long live Python! :)

n3on
                                                                        
                                                        
            
            
              
                