How to get regex to match multiple script tags?


I'm trying to return the contents of any <script> tags in a body of text. I'm currently using the following expression, but it only captures the contents of the first tag and ign


6 Answers

温柔的废话

2020-12-13 16:31
Don't use regular expressions for parsing HTML. HTML is not a regular language. Use the power of the DOM. This is much easier, because it is the right tool.
```
var scripts = document.getElementsByTagName('script');
```

耶瑟儿～

2020-12-13 16:39

The "problem" here is how exec works. Each call matches only one occurrence, but when the regex has the g flag it stores the position just past that match in the regex's lastIndex property, and the next call resumes from there. To get all matches, keep applying the regex to the string until it fails to match (this is a pretty common way to do it):

```
var scripttext = ' <script type="text/javascript">\nalert(\'1\');\n</script>\n\n<div>Test</div>\n\n<script type="text/javascript">\nalert(\'2\');\n</script>';

var re = /<script\b[^>]*>([\s\S]*?)<\/script>/gm;

var match;
while ((match = re.exec(scripttext)) !== null) {
  // full match is in match[0], whereas captured groups are in match[1], match[2], etc.
  console.log(match[1]);
}
```
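To make the lastIndex behaviour visible, here is a small sketch that records lastIndex after every successful exec call. The sample string and the names `sampleText`, `scriptRe`, and `steps` are made up for illustration; only exec and lastIndex come from the answer above.

```javascript
// Hypothetical sample input, chosen only to keep the offsets easy to follow.
const sampleText = '<script>a()</script><p>x</p><script>b()</script>';
const scriptRe = /<script\b[^>]*>([\s\S]*?)<\/script>/g;

const steps = [];
let m;
while ((m = scriptRe.exec(sampleText)) !== null) {
  // With the g flag set, lastIndex now points just past the current match,
  // so the next exec call resumes searching from there.
  steps.push({ content: m[1], lastIndex: scriptRe.lastIndex });
}

// Once exec fails to find another match it resets lastIndex to 0.
const lastIndexAfterLoop = scriptRe.lastIndex;

console.log(steps, lastIndexAfterLoop);
```

Note that this loop relies on the g flag: without it, exec does not advance lastIndex, every call finds the first tag again, and the while loop never terminates.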


轻奢々

2020-12-13 16:40
Try using the global flag:
```
document.body.innerHTML.match(/<script.*?>([\s\S]*?)<\/script>/gmi)
```
Edit: added the multiline and case-insensitive flags (for obvious reasons).
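One thing worth knowing about this approach: with the g flag, match returns the whole matched tags and drops the capture groups. The sketch below contrasts that with String.prototype.matchAll (ES2020), which keeps them. It runs against a plain string (`html` is a made-up stand-in for document.body.innerHTML) so it also works outside a browser.

```javascript
// Hypothetical input standing in for document.body.innerHTML.
const html =
  '<script>one()</script><div>Test</div>' +
  '<SCRIPT type="text/javascript">two()</SCRIPT>';

const tagRe = /<script.*?>([\s\S]*?)<\/script>/gi;

// match() with the g flag: an array of full matches, no capture groups.
const wholeTags = html.match(tagRe);

// matchAll() yields one match object per occurrence; index 1 is the first group.
const contents = Array.from(html.matchAll(tagRe), (m) => m[1]);

console.log(wholeTags);
console.log(contents);
```

So if you only need the tag bodies, either strip the tags from the match results afterwards or use matchAll (or an exec loop) to read group 1 directly.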

北荒

2020-12-13 16:43

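Try this:

```
var scripts = document.getElementsByTagName('script');
for (var i = 0; i < scripts.length; i++) {
  var x = scripts[i];
  if (x && x.innerHTML) {
    // Example pattern: any http://....com URL inside the script body
    var yourRegex = /http:\/\/.*\.com/g;
    var matches = yourRegex.exec(x.innerHTML);
    if (matches) {
      // your code
    }
  }
}
```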


臣服心动

2020-12-13 16:49

In .NET there's a submatch method and in PHP there's preg_match_all, either of which should solve your problem. JavaScript historically had no such method (String.prototype.matchAll only arrived with ES2020), but you can write one yourself.

You can test it at http://www.pagecolumn.com/tool/regtest.htm

Selecting the "$1elements" method there will return what you want.
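As a rough idea of what writing such a method yourself could look like, here is a minimal sketch. The function name `collectGroup` and its parameters are hypothetical, not part of any library, and it is only loosely analogous to PHP's preg_match_all.

```javascript
// Hypothetical helper: returns the given capture group from every match.
function collectGroup(pattern, text, groupIndex = 1) {
  // Work on a copy with the g flag forced on, so exec advances through
  // the text and the caller's regex (and its lastIndex) is left untouched.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);

  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push(m[groupIndex]);
    // A zero-length match does not move lastIndex; bump it to avoid looping forever.
    if (m[0] === '') re.lastIndex++;
  }
  return results;
}

const found = collectGroup(
  /<script\b[^>]*>([\s\S]*?)<\/script>/,
  '<script>x=1</script> text <script src="a.js"></script>'
);
console.log(found);
```

An empty string in the result means the tag matched but had no inline content, as with the second script tag in this sample.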

自闭症患者

2020-12-13 16:54
The first group contains the content of the tags.

Edit: Don't you have to surround the regex statement with quotes? Like:
```
re = "/<script\b[^>]*>([\s\S]*?)<\/script>/gm";
```