regex match main domain name


I need to be able to identify the main domain name of any subdomain.

Examples:

For all of these I need to match only example.co / example.com


                      
3 answers

闹比i
2021-01-19 05:20

If you want an absolutely correct matcher, regular expressions are not the way to go.

Why?


Because both of these are valid registrable domains (a name plus a TLD): goo.gl, t.co.
Because neither of these is (they are only public suffixes under a TLD): com.au, co.uk.


Any regex that you might create that would properly handle all of the above cases would simply amount to listing out the valid TLDs, which would defeat the purpose of using regular expressions in the first place.

Instead, create or obtain a list of the current TLDs and public suffixes (the Public Suffix List is one such list), find the longest one that the hostname ends with, then add the segment immediately before it.
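
The list-based lookup described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `KNOWN_SUFFIXES` is a tiny hand-picked subset rather than a real list, and `registrable_domain` is a hypothetical helper name. A real list such as the Public Suffix List also has wildcard and exception rules, which this sketch ignores.

```python
# Hypothetical, tiny subset of suffixes for illustration only;
# a real list (e.g. the Public Suffix List) is far larger.
KNOWN_SUFFIXES = {"com", "co", "gl", "uk", "au", "co.uk", "com.au"}


def registrable_domain(hostname, suffixes=KNOWN_SUFFIXES):
    """Return the suffix plus one label to its left, or None if not found."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest,
    # so that "co.uk" is preferred over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                # The whole hostname is itself a suffix (e.g. "co.uk").
                return None
            return ".".join(labels[i - 1:])
    return None


for host in ["www.example.co.uk", "a.b.example.com", "t.co", "goo.gl", "com.au"]:
    print(host, "->", registrable_domain(host))
```

With this subset, `t.co` and `goo.gl` come back unchanged, while `com.au` yields `None` because it is only a suffix, which is the distinction a fixed-length regex cannot make.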
                                                                        
                                                        
            
            
              
                
攒了一身酷
2021-01-19 05:29

This will match:

([0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3}\.[0-9A-Za-z]{2,3}|[0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3})$


as long as:


there are no extra spaces at the end of each line
all domain codes used are short, two or three characters long. It will not work with longer domain codes such as .info.


Basically, it matches either of these two:


a word of two or more characters, a dot, a word of two or three characters, a dot, a word of two or three characters, end of line
a word of two or more characters, a dot, a word of two or three characters, end of line
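
To see how the two alternatives behave, here is a minimal Python sketch that applies the pattern above unchanged. The hostnames are made-up inputs chosen for illustration, not the answerer's own test data, and `main_domain` is a hypothetical helper name.

```python
import re

# The pattern from the answer, unchanged, split over two lines for readability.
PATTERN = re.compile(
    r"([0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3}\.[0-9A-Za-z]{2,3}"
    r"|[0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3})$"
)


def main_domain(hostname):
    """Return the part of hostname matched by PATTERN, or None."""
    match = PATTERN.search(hostname)
    return match.group(1) if match else None


print(main_domain("sub.example.com"))    # example.com
print(main_domain("a.b.example.co.uk"))  # example.co.uk
print(main_domain("example.info"))       # None (four-letter code)
print(main_domain("www.abc.com"))        # www.abc.com (short label misread)
```

The last case shows a further limitation beyond the two listed conditions: when the label before the TLD is only two or three characters long, the first alternative treats it as part of a two-part suffix and keeps the subdomain.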


Short version (slightly more permissive, because \w also matches underscores):

(\w{2,}\.\w{2,3}\.\w{2,3}|\w{2,}\.\w{2,3})$


If you want it to match only whole lines, add ^ at the beginning.

This is how I tested it:


                                                                        
                                                        
            
            
              
                
忘了有多久
2021-01-19 05:29

This might be of some use. It extracts the host part of a URL in dot notation.
After that, it is a simple matter of splitting it on the dots.

    [^/:"]+\.[^/:"]+
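
A rough Python sketch of this extract-then-split idea follows. It assumes the intent is to capture the whole dotted hostname, so the pattern here uses `+` quantifiers and an escaped dot; that reading is an assumption, and `host_labels` is a hypothetical helper name.

```python
import re

# Assumed reading of the pattern: one or more characters that are not
# '/', ':' or '"', a literal dot, then one or more such characters again.
HOST_RE = re.compile(r'[^/:"]+\.[^/:"]+')


def host_labels(url):
    """Return the dot-separated labels of the first hostname-like run in url."""
    match = HOST_RE.search(url)
    return match.group(0).split(".") if match else []


print(host_labels("https://www.example.com/path"))
print(host_labels("http://sub.example.co:8080/"))
```

Splitting only yields the labels; deciding which of them form the main domain still needs a suffix list, as the first answer points out.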
                                                                        
                                                        
            
            
              
                