Regular expression to find all “src” attributes of HTML “img” elements in a given folder only, in PHP


I have a string, inside of that I have an image:

\"balbalba


                      
2 Answers

猫巷女王i
2021-01-29 08:17

You could do this with an HTML parser plus a simple regex that checks whether the src attribute starts with the required directory:

$string = '<img src="img/programacao/51.jpg" style="width:200px;" /><p>balbalba</p><img src="img/programacao/46.jpg" style="width:200px;" /><p>balbalba</p><img src="/img/finalCinerio.jpg"><p>balbalba</p><img src="img/topo.jpg" />';

// Parse the HTML so the attributes are read by a real parser.
$doc = new DOMDocument();
$doc->loadHTML($string);

// Walk every <img> element and print only the src values under img/programacao/.
$images = $doc->getElementsByTagName('img');
foreach ($images as $image) {
    $src = $image->getAttribute('src');
    if (preg_match('~^img/programacao/~', $src)) {
        echo $src . "\n";
    }
}


Output:

img/programacao/51.jpg
img/programacao/46.jpg
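
The parser approach above could be wrapped in a reusable function. This is a minimal sketch, not part of the original answer: the name imgSrcsInFolder is hypothetical, the libxml error handling is an assumption added to keep sloppy HTML fragments from emitting warnings, and a plain prefix comparison stands in for the regex.

```php
<?php
// Hypothetical helper: collect the src of every <img> whose path starts with $prefix.
function imgSrcsInFolder(string $html, string $prefix): array
{
    // Assumption: silence libxml warnings for imperfect HTML, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($doc->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');
        // Fixed directory prefix, so a string comparison is enough here.
        if (strncmp($src, $prefix, strlen($prefix)) === 0) {
            $found[] = $src;
        }
    }
    return $found;
}

$html = '<img src="img/programacao/51.jpg" style="width:200px;" /><p>balbalba</p>'
      . '<img src="img/programacao/46.jpg" /><img src="/img/finalCinerio.jpg">'
      . '<img src="img/topo.jpg" />';

$result = imgSrcsInFolder($html, 'img/programacao/');
print_r($result);
```

Returning an array instead of echoing makes it easier to reuse the result, for example to rewrite or validate the paths afterwards.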

                                                                        
                                                        
            
            
              
                
天涯浪人
2021-01-29 08:26

Easy peasy

 '/src=\"(?P<src>img\/programacao\/[^\"]+)\"/'


You don't really need to match the img tag itself unless the markup also contains a lot of other elements with a src attribute, such as iframe or script tags. You can add it, but it makes reliable matching much, much harder, because there is no guarantee where in the tag the src attribute will appear.
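
As a rough illustration of how the pattern above might be applied, here is a sketch using preg_match_all on a sample string. The sample markup and variable names are assumptions made for the example.

```php
<?php
// Sample markup (assumed for illustration): two images in the target folder, one outside it.
$string = '<img src="img/programacao/51.jpg" style="width:200px;" /><p>balbalba</p>'
        . '<img src="/img/finalCinerio.jpg"><img src="img/programacao/46.jpg" />';

$pattern = '/src="(?P<src>img\/programacao\/[^"]+)"/';

// preg_match_all returns the number of full matches and fills $matches.
$count = preg_match_all($pattern, $string, $matches);

// $matches['src'] holds the named group of every match, in document order.
foreach ($matches['src'] as $src) {
    echo $src . "\n";
}
```

Because the pattern never looks at the tag name, it would also pick up a matching src on any other element in the string.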

Regex101

Most of this is pretty simple: literal matches.


[^\"]+ = not a quote, one or more times: matches any sequence of characters that contains no double quote. I prefer this to the un-greedy match-anything .*?, mostly for readability.
?P<src> = named capture group ( ... ): the match is returned under the string key src.


I love named capture groups, although they are not as useful here with a single match. Their main purpose is readability, and they let you change your code later, for example adding another capture group without worrying about the match numbers changing on you.
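
To make the point about changing code later concrete, here is a small sketch in which a second group is added in front of src. The extra quote group and the variable names are assumptions for the example. The numeric index of the path shifts from 1 to 2, while the src key keeps working.

```php
<?php
$tag = '<img src="img/programacao/51.jpg" style="width:200px;" />';

// Original pattern: the path is group 1 and also available as 'src'.
preg_match('/src="(?P<src>img\/programacao\/[^"]+)"/', $tag, $m1);

// Extended pattern: a quote group is added first, so the path becomes group 2.
// Code that reads $m['src'] is unaffected; code that reads $m[1] would break.
preg_match('/src=(?P<quote>["\'])(?P<src>img\/programacao\/[^"\']+)\k<quote>/', $tag, $m2);

echo $m1[1] . "\n";      // path, by number, in the original pattern
echo $m2[1] . "\n";      // now the quote character, not the path
echo $m2['src'] . "\n";  // path, by name, in the extended pattern
```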

If you want to get really fancy 

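\<img.*?(?<!src=)src=(?P<quote>\"|\')(?P<src>img\/programacao\/(?:(?!\k<quote>).)+)\k<quote>



(?<!src=) negative lookbehind: asserts that the text immediately before this src= is not another src=; the .*? (non-greedy) before it skips any attributes that precede src
(?:(?!\k<quote>).)+ one or more characters that are not the opening quote; a back reference does not work inside a character class such as [^...], so a negative lookahead is used instead
\k<quote> back reference to the quote capture group; basically means the closing quote style (' vs ") must match the opening one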


Although to be honest it's probably overkill.
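
For completeness, here is a sketch of a quote-aware pattern run against both quote styles. It is an adaptation rather than the exact expression above: it uses a negative lookahead for the path characters, since a back reference is not interpreted as one inside a character class, and it drops the lookbehind for brevity. The sample markup is assumed.

```php
<?php
// Sample markup (assumed): single-quoted src after another attribute,
// a double-quoted src, and an image outside the target folder.
$html = '<img style="width:200px;" src=\'img/programacao/photo.jpg\' />'
      . '<img src="img/programacao/46.jpg" />'
      . '<img src="img/topo.jpg" />';

// A nowdoc keeps the pattern free of PHP string escaping; ~ is the delimiter,
// so the slashes in the path need no backslashes.
$pattern = <<<'REGEX'
~<img.*?src=(?P<quote>["'])(?P<src>img/programacao/(?:(?!\k<quote>).)+)\k<quote>~
REGEX;

preg_match_all($pattern, $html, $matches);

// Each src is captured together with the quote style that wrapped it.
foreach ($matches['src'] as $i => $src) {
    echo $matches['quote'][$i] . ' ' . $src . "\n";
}
```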

fancy demo

You could also use preg_match_all for this, depending on how you read the file. If you are reading it line by line, use preg_match.
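
The two reading styles could look roughly like this. It is a sketch under assumptions: the file content is simulated with an in-memory string, and explode stands in for a real fgets loop.

```php
<?php
$pattern = '/src="(?P<src>img\/programacao\/[^"]+)"/';

// Simulated file content (assumed for illustration), one tag per line.
$file = "<img src=\"img/programacao/51.jpg\" />\n"
      . "<p>balbalba</p>\n"
      . "<img src=\"img/topo.jpg\" />\n"
      . "<img src=\"img/programacao/46.jpg\" />\n";

// Whole content at once (e.g. from file_get_contents): preg_match_all.
preg_match_all($pattern, $file, $all);
$fromWhole = $all['src'];

// Line by line (e.g. from fgets in a loop): preg_match on each line.
// Note: preg_match stops at the first match, so at most one src per line is found.
$fromLines = [];
foreach (explode("\n", $file) as $line) {
    if (preg_match($pattern, $line, $m)) {
        $fromLines[] = $m['src'];
    }
}

print_r($fromWhole);
print_r($fromLines);
```

If a single line can contain several images, preg_match_all on each line would be the safer choice.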
                                                                        
                                                        
            
            
              
                