php preg_replace for property inside html tags

[亡魂溺海] submitted on 2020-02-03 01:55:49

Question


How can I replace the src value of a <script> tag inside a string like the one in this example? (More generally, I need to do this for any attribute inside any tag.)

$data = <<<EOD
<script language="javascript" src= "../tests/ajax-navigation.js"></script>
...
<img src="../404.jpg" alt="404">
...
EOD;

I used this function in PHP:

class Search{
 public static function replaceProperty($data, $start, $end, $property, $alias, $limit = -1){
   //get blocks formed as: $start $property = "..." $end or $start $property = '...' $end
   $pattern = "!(".$start."){1}(.*?)".$property."\s*=\s*[\"\'](.*?)[\"\'](.*?)(".$end."){1}!s";
   $data = \preg_replace($pattern, "{$start}\${2}{$property}=\"{$alias}\"\${4}{$end}", $data, $limit);
   return $data;
 }
}

which I called like this:

 $data = Search::replaceProperty($data, "<script", ">", "src", $alias);

What is really strange is that both the <script> and the <img> tags get changed! Of course, I can call it like this:

 $data = Search::replaceProperty($data, "<script", "</script>", "src", $alias);

but this doesn't answer the general case!

To clarify some points about the regex:

i. The actual string to search is:

$data = <<<EOD
<script language="javascript" src= "../tests/ajax-navigation.js"></script>
...
<script language="javascript" type="text/javascript">
...
<img src="../404.jpg" alt="404">
...
EOD;

ii. The regex $pattern = "!(".$start."){1}(.*?)".$property."\s*=\s*[\"\'](.*?)[\"\'](.*?)(".$end."){1}!s"; or, in its simplest form with just 3 subpatterns, $pattern = "%".$start."(.*?)".$property."\s*=\s*[\"\'](.*?)[\"\'](.*?)".$end."%s"; matches the first <script> as expected. But it then starts a second match at the second <script>, which has no src, and runs on until the > of the first <img>, changing whatever src property it finds in between!

iii. Deleting the s modifier at the end of the pattern, which gives $pattern = "%".$start."(.*?)".$property."\s*=\s*[\"\'](.*?)[\"\'](.*?)".$end."%"; makes it behave as expected, but it then fails when a tag is broken across lines:

<script language="javascript" src= "../tests/ajax-navigation.js"
></script>

iv. And, of course, my intention is to replace, not delete, the value of the src property.

I hope these points clarify my question.
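
Points ii and iii can be reproduced with a short script. This is a minimal sketch rather than a drop-in fix: it hard-codes the <script tag and the src attribute, and "new.js" is a placeholder value that does not come from the question. With the s modifier, . also matches newlines, so the lazy (.*?) is free to run past the > of a tag that has no src and stop at the next src it meets. Replacing .*? with [^>]* keeps each match inside a single tag, and because a negated character class matches newlines anyway, tags broken across lines still match without s.

```php
<?php
// Sample input taken from point i of the question.
$data = <<<EOD
<script language="javascript" src= "../tests/ajax-navigation.js"></script>
...
<script language="javascript" type="text/javascript">
...
<img src="../404.jpg" alt="404">
...
EOD;

// Original approach: with the s modifier, the lazy (.*?) may cross a ">".
$loose = '%<script(.*?)src\s*=\s*["\'](.*?)["\'](.*?)>%s';

// Bounded approach: [^>]* can never leave the tag it started in.
// Group 2 remembers the opening quote so that \2 requires the same closing quote.
$bounded = '%<script\b([^>]*?)\bsrc\s*=\s*(["\'])(.*?)\2([^>]*)>%';

$looseCount = preg_match_all($loose, $data);
$boundedCount = preg_match_all($bounded, $data);

// "new.js" is a placeholder replacement value (an assumption for this sketch).
$fixed = preg_replace($bounded, '<script$1src="new.js"$4>', $data);

echo $fixed, "\n";
```

On the sample data the loose pattern matches twice (the second match runs into the <img>), while the bounded pattern matches only the first <script>. An attribute value that itself contains > would still defeat the bounded pattern.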


Answer 1:


Change this line:

 public static function replaceProperty($data, $start, $end, $property, $alias, $limit = -1){

To this:

 public static function replaceProperty($data, $start, $end, $property, $alias='', $limit = -1){

This adds a default value of '' to the $alias parameter.

I also had to remove the backslash in front of preg_replace. The leading backslash is a fully qualified call to the global function, which only parses on PHP 5.3 and later.




Answer 2:


Here's some code I used to find every occurrence of a certain element with preg_match_all. I've found that preg_match_all is better suited to this than preg_match, which stops at the first match.

$arr = array();
preg_match_all("%[<]script.*?[>](.*?)[<][\/]script[>]%",$f, $arr, PREG_OFFSET_CAPTURE);
var_dump($arr);

Or with preg_replace:

$a = preg_replace("%[<]H3.*?[>].*?[<][\/]H3[>]%", "", $a);

Try preg_match_all with the syntax I used, writing < and > inside the pattern as character classes (for example [<]$start) instead of passing the < to the function. Also make the match case-insensitive, either by adding the i modifier after the closing % delimiter or by converting all the data with strtolower first. If this works, you should be able to figure out the rest yourself.
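
The two suggestions above (collect every match with preg_match_all and ignore case) can be sketched as follows. The sample string in $f is hypothetical, since the answer does not show its input, and the pattern is a simplified variant of the one above rather than the answerer's exact expression.

```php
<?php
// Hypothetical sample input; the original answer does not show what $f holds.
$f = "<SCRIPT>var a = 1;</SCRIPT>\n<p>text</p>\n"
   . "<script type=\"text/javascript\">var b = 2;</script>";

// i: case-insensitive matching; s: let . span newlines inside a script body.
$pattern = '%<script\b[^>]*>(.*?)</script>%is';

$arr = array();
$count = preg_match_all($pattern, $f, $arr, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each entry is array(matched text, byte offset).
foreach ($arr[1] as $body) {
    echo $body[1], ': ', $body[0], "\n";
}
```

The i modifier avoids the strtolower route, which would also lower-case the script bodies and attribute values being extracted.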




Answer 3:


As I said, I'll use DOMDocument(), but here is an answer with regex:

class Search{

public function __construct(){}

public static function replaceProperty($data, $tag, $property, $alias, $limit = -1){
   //get blocks formed as: <$tag...$property=["|']...["|']...[/>|>]
   $pattern = '%<\s*'.$tag.'(\s+(\w+)(\s*\=\s*(\'|"|)(.*?)\\4\s*)?)*\s*(\/>|>)%s';
   $result = \preg_match_all($pattern, $data, $matches, PREG_PATTERN_ORDER);
   if(!empty($result)){
      $search = array();
      $replace = array();
      //found them at index = 0!
      foreach($matches[0] as $i=>$found){
         if(($limit >= 0) && ($i >= $limit))
            break;
         if(isset($matches[2]) && isset($matches[5]) && $matches[2][$i] == $property){
            $search[] = $found;
            $replace[] = \str_replace($matches[5][$i], $alias, $found);
         }
      }
      $data = \str_replace($search, $replace, $data);
   }
   return $data;
}
}

and called like this:

$data = Search::replaceProperty($data, "script", "src", $alias);
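
One caveat with the function above: the attribute group in the pattern is repeated, and a repeated capture group keeps only its last iteration, so $matches[2] and $matches[5] describe only the last attribute of each tag. A src that is followed by another attribute is therefore not replaced. A possible variant, sketched here with preg_replace_callback, matches the tag first and then the attribute inside it. The function name replaceAttribute is hypothetical, and escaping $alias with htmlspecialchars is an added assumption, not part of the original answer.

```php
<?php
// Hypothetical helper, not from the original answer.
function replaceAttribute($data, $tag, $property, $alias, $limit = -1)
{
    // Match one opening tag; [^>]* keeps the match inside that tag.
    $tagPattern = '%<\s*' . preg_quote($tag, '%') . '\b[^>]*>%i';
    // Match the attribute with a quoted value; \1 is the opening quote.
    // The lookbehind requires whitespace before the name (so data-src is skipped).
    $attrPattern = '%(?<=\s)' . preg_quote($property, '%') . '\s*=\s*(["\']).*?\1%is';

    return preg_replace_callback(
        $tagPattern,
        function ($m) use ($attrPattern, $property, $alias) {
            // Assumption: the new value is HTML-escaped and double-quoted.
            $new = $property . '="' . htmlspecialchars($alias, ENT_QUOTES) . '"';
            // A callback is used so "$" or "\" in $alias is never read as a backreference.
            return preg_replace_callback($attrPattern, function () use ($new) {
                return $new;
            }, $m[0], 1);
        },
        $data,
        $limit // counts matched tags, as in the original function
    );
}

// Usage with a tag broken across lines and an attribute that is not last.
$sample = "<script language=\"javascript\" src= \"../tests/ajax-navigation.js\"\n></script>\n"
        . "<script src='a.js' defer></script>\n"
        . "<img src=\"../404.jpg\" alt=\"404\">";
echo replaceAttribute($sample, 'script', 'src', 'new.js'), "\n";
```

This is still regex-based, so an attribute value containing > or text that merely looks like an attribute inside another value can fool it.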

I used Emanuele Del Grande's answer from this post, which might be a reproduction of posts like this one!
Thanks.
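
For comparison, the DOMDocument route mentioned at the top of this answer might look like the sketch below. It assumes the DOM extension is enabled; the function name replaceAttributeDom is hypothetical, and the original answer does not show its DOMDocument code.

```php
<?php
// Hypothetical helper showing the DOMDocument route; requires the DOM extension.
function replaceAttributeDom($html, $tag, $property, $alias)
{
    $doc = new DOMDocument();
    // Collect parser warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName($tag) as $node) {
        if ($node->hasAttribute($property)) {
            $node->setAttribute($property, $alias);
        }
    }
    // saveHTML() serialises the whole parsed document, so the result is
    // normalised markup (doctype and wrapper elements may be added),
    // not a byte-for-byte copy of the input.
    return $doc->saveHTML();
}

$sample = "<script language=\"javascript\" src= \"../tests/ajax-navigation.js\"\n></script>\n"
        . "<script language=\"javascript\" type=\"text/javascript\"></script>\n"
        . "<img src=\"../404.jpg\" alt=\"404\">";
$result = replaceAttributeDom($sample, 'script', 'src', 'new.js');
echo $result;
```

The parser handles quoting, line breaks, and attribute order, at the cost of re-serialising the markup, which matters if the surrounding text must stay untouched.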



Source: https://stackoverflow.com/questions/18004904/php-preg-replace-for-property-inside-html-tags
