Youtube complete Java Regex

无人共我 2021-02-04 11:32

I need to parse several pages to get all of their YouTube IDs.

I found many regular expressions on the web, but the Java ones are not complete (they either give me gar

2 answers
  •  庸人自扰
    2021-02-04 12:10

    First of all, you need to insert an extra backslash \ for each backslash in the original regex. Otherwise Java reads each backslash as the start of an escape sequence in the string literal, which is not what you want here.

    https?:\\/\\/(?:[0-9A-Z-]+\\.)?(?:youtu\\.be\\/|youtube\\.com\\S*[^\\w\\-\\s])([\\w\\-]{11})(?=[^\\w\\-]|$)(?![?=&+%\\w]*(?:['\"][^<>]*>|<\\/a>))[?=&+%\\w]*
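To illustrate the doubling rule, here is a minimal sketch. The class name EscapeDemo, the constant WORD_CLASS, and the helper isThreeWordChars are made up for this example and are not part of the original regex.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // In Java source, "\\w{3}" is the five-character string \w{3}.
    // The regex engine then reads \w as "word character".
    static final String WORD_CLASS = "\\w{3}";

    // Hypothetical helper: true if the whole input is exactly three word characters.
    static boolean isThreeWordChars(String s) {
        return Pattern.matches(WORD_CLASS, s);
    }

    public static void main(String[] args) {
        System.out.println(WORD_CLASS);              // prints \w{3}
        System.out.println(WORD_CLASS.length());     // prints 5
        System.out.println(isThreeWordChars("a_1")); // prints true
        System.out.println(isThreeWordChars("a-1")); // prints false
    }
}
```

Writing "\w{3}" with a single backslash would not compile, because \w is not a valid escape sequence in a Java string literal.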
    

    Next, when you compile your pattern, you need to add the CASE_INSENSITIVE flag, because the character class [0-9A-Z-] lists only uppercase letters. Here's an example:

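    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String pattern = "https?:\\/\\/(?:[0-9A-Z-]+\\.)?(?:youtu\\.be\\/|youtube\\.com\\S*[^\\w\\-\\s])([\\w\\-]{11})(?=[^\\w\\-]|$)(?![?=&+%\\w]*(?:['\"][^<>]*>|<\\/a>))[?=&+%\\w]*";

    // CASE_INSENSITIVE lets [0-9A-Z-] also match lowercase subdomains such as "www".
    Pattern compiledPattern = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
    Matcher matcher = compiledPattern.matcher(link); // link: the text to scan
    while (matcher.find()) {
        // group() is the whole matched URL; group(1) is the 11-character video ID.
        System.out.println(matcher.group());
    }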
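Since the question asks for the IDs rather than the full URLs, a self-contained variant might collect capture group 1 instead. This is only a sketch: the class YoutubeIdExtractor, the method extractIds, and the sample text are assumptions made for illustration, and the pattern is the one above split across lines for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YoutubeIdExtractor {
    // Same pattern as above, compiled once and reused.
    private static final Pattern YOUTUBE_URL = Pattern.compile(
        "https?:\\/\\/(?:[0-9A-Z-]+\\.)?(?:youtu\\.be\\/|youtube\\.com\\S*[^\\w\\-\\s])"
            + "([\\w\\-]{11})(?=[^\\w\\-]|$)"
            + "(?![?=&+%\\w]*(?:['\"][^<>]*>|<\\/a>))[?=&+%\\w]*",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns capture group 1 (the video ID) of every match.
    static List<String> extractIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = YOUTUBE_URL.matcher(text);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample input invented for this example.
        String sample = "Watch https://www.youtube.com/watch?v=dQw4w9WgXcQ "
            + "and http://youtu.be/9bZkp7q19f0 today";
        System.out.println(extractIds(sample)); // prints [dQw4w9WgXcQ, 9bZkp7q19f0]
    }
}
```

Compiling the pattern once into a static field avoids recompiling it for every page you scan.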
    
