Regex in C#: extracting a URL from an <a> tag


梦毁少年i 2021-01-25 18:51

I am trying to extract the URL from an <a> tag; however, instead of getting https://website.com/-id1, I am getting the tag's link text. Here is my code:

string text=\"


      
      
        
2 Answers

挽巷 (original poster) 2021-01-25 19:14
              

            
            
                        
Regular expressions can be used with HTML only in very specific, simple cases. For example, if the text contains only a single tag, you can use "href\\s*=\\s*\"(?<url>.*?)\"" to extract the URL, e.g.:

var url = Regex.Match(text, "href\\s*=\\s*\"(?<url>.*?)\"").Groups["url"].Value;


This pattern returns:

https://website.com/-id1


This regex doesn't do anything fancy. It looks for href followed by =, allowing optional whitespace around the =, and then captures everything between the first double quote and the next one in a non-greedy manner (.*?). The captured text is available in the named group url.
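Since the input string in the question is cut off, here is a minimal, self-contained sketch of the pattern in use. The sample input, the class name HrefDemo, and the method name TryExtractFirstHref are assumptions made for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // Tries to read the first double-quoted href value using the named group "url".
    // Returns false (and an empty string) when the pattern does not match.
    public static bool TryExtractFirstHref(string html, out string url)
    {
        Match match = Regex.Match(html, "href\\s*=\\s*\"(?<url>.*?)\"");
        url = match.Success ? match.Groups["url"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        // Hypothetical input: the original question's string is not shown in full.
        string text = "<a href=\"https://website.com/-id1\">link text</a>";

        if (TryExtractFirstHref(text, out string url))
        {
            Console.WriteLine(url);
        }
        else
        {
            Console.WriteLine("No href found.");
        }
    }
}
```

Checking match.Success before reading the group makes the "no link present" case explicit rather than silently returning an empty string.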

Anything fancier gets complex very quickly. For example, supporting both single and double quotes requires special handling to avoid starting on a single quote and ending on a double quote. The string could also contain multiple <a> tags that use both types of quotes.
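One possible way to handle both quote styles, shown only as a sketch, is to capture the opening quote in its own group and require the same character to close the value via a backreference. The group name quote, the class QuotedHrefDemo, the method ExtractAllHrefs, and the second sample URL are all assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedHrefDemo
{
    // (?<quote>["']) remembers which quote opened the value;
    // \k<quote> then demands the same character to close it.
    private static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*(?<quote>[\"'])(?<url>.*?)\\k<quote>");

    // Collects every quoted href value found in the input, in document order.
    public static List<string> ExtractAllHrefs(string html)
    {
        var urls = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            urls.Add(match.Groups["url"].Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Hypothetical input mixing single- and double-quoted attributes.
        string text = "<a href='https://website.com/-id1'>first</a> " +
                      "<a href=\"https://website.com/-id2\">second</a>";

        foreach (string url in ExtractAllHrefs(text))
        {
            Console.WriteLine(url);
        }
    }
}
```

This still ignores unquoted attribute values, href text inside comments or scripts, and upper-case attribute names, which is part of why a real parser is the safer choice for anything beyond simple input.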


For complex parsing, it would be better to use a library like AngleSharp or HtmlAgilityPack.
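As a rough sketch of the parser-based approach, the following uses HtmlAgilityPack and assumes its NuGet package is installed. The class name AnchorParser, the method ExtractHrefs, and the sample input are hypothetical; the input string in the question is not shown in full.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is referenced

public static class AnchorParser
{
    // Parses the HTML and returns the href value of every <a> element that has one.
    public static List<string> ExtractHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var urls = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return urls;
        }

        foreach (HtmlNode anchor in anchors)
        {
            urls.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return urls;
    }

    public static void Main()
    {
        // Hypothetical input for illustration only.
        string text = "<a href='https://website.com/-id1'>link text</a>";

        foreach (string url in ExtractHrefs(text))
        {
            Console.WriteLine(url);
        }
    }
}
```

Because the parser reads attributes itself, the quote style and the number of tags no longer need special handling in your own code.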
    
             
                                                        
            

            
              
                