regex match main domain name

陌清茗 2021-01-19 04:38

I need to be able to identify the main domain name for any given subdomain.

Examples:

For all of these I need to match only example.co / example.com

3 answers
  • 2021-01-19 05:20

    If you want an absolutely correct matcher, regular expressions are not the way to go.

    Why?

    • Because both of these are valid domains + TLDs: goo.gl, t.co.

    • Because neither of these is (they are only suffixes that domains are registered under, effectively TLDs): com.au, co.uk.

    Any regex that you might create that would properly handle all of the above cases would simply amount to listing out the valid TLDs, which would defeat the purpose of using regular expressions in the first place.

    Instead, create or obtain a list of the current TLDs (including multi-part ones such as co.uk), find the longest one that the hostname ends with, then prepend the segment immediately before it.
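A minimal sketch of that list-based lookup, in Python. The suffix set here is a tiny hand-written sample and the function name `registrable_domain` is hypothetical; a real implementation would load a maintained list (the Public Suffix List is one such list) and would also need to handle its wildcard and exception rules, which this sketch ignores.

```python
# Hypothetical, deliberately tiny sample of suffixes for illustration only.
# A real list would be loaded from a maintained source.
KNOWN_SUFFIXES = {"com", "co", "gl", "au", "uk", "com.au", "co.uk"}


def registrable_domain(hostname):
    """Return the known suffix plus the one label before it, or None."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest,
    # so that "co.uk" is found before "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in KNOWN_SUFFIXES:
            if i == 0:
                # The whole hostname is itself a suffix (e.g. "co.uk").
                return None
            return ".".join(labels[i - 1:])
    return None  # no known suffix found


print(registrable_domain("www.example.co.uk"))  # example.co.uk
print(registrable_domain("a.b.example.com"))    # example.com
print(registrable_domain("goo.gl"))             # goo.gl
print(registrable_domain("co.uk"))              # None
```

Checking the longest suffix first is what keeps `example.co.uk` from being cut down to `co.uk`.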

  • 2021-01-19 05:29

    This will match:

    ([0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3}\.[0-9A-Za-z]{2,3}|[0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3})$
    

    as long as:

    1. there are no extra spaces at the end of each line
    2. all domain codes used are short, two or three letters long. It will not work with longer domain codes such as .info.

    Basically, it matches either of these two:

    1. a word of two or more letters : dot : a word of two or three letters : dot : a word of two or three letters : end of line
    2. a word of two or more letters : dot : a word of two or three letters : end of line

    Short version (slightly looser, since \w also matches the underscore):

    (\w{2,}\.\w{2,3}\.\w{2,3}|\w{2,}\.\w{2,3})$
    

    If you want it to match only whole lines, add ^ at the beginning.
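To see where the long form of the pattern holds and where it breaks, here is a small Python check. The helper name `match_domain` and the sample hostnames are illustrative assumptions; the pattern itself is copied unchanged.

```python
import re

# The long-form pattern from this answer, split over two lines for readability.
PATTERN = re.compile(
    r"([0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3}\.[0-9A-Za-z]{2,3}"
    r"|[0-9A-Za-z]{2,}\.[0-9A-Za-z]{2,3})$"
)


def match_domain(hostname):
    """Return the text captured by the pattern, or None if nothing matches."""
    m = PATTERN.search(hostname)
    return m.group(1) if m else None


print(match_domain("www.example.com"))     # example.com
print(match_domain("shop.example.co.uk"))  # example.co.uk
print(match_domain("www.example.info"))    # None (four-letter code)
print(match_domain("www.goo.gl"))          # www.goo.gl (subdomain is kept)
```

The last case shows the limit of a length-based rule: when the main label is only two or three characters long, the pattern cannot tell it apart from a second-level code such as `co` and keeps the subdomain.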

    This is how I tested it:

    [screenshot of the test]

  • 2021-01-19 05:29

    This might be of use. It pulls out the part written in dot notation, after which it is a simple matter of splitting it on the dots:

    [^/:"].[^/:"]
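A rough sketch of the "extract, then split" idea in Python. Note that it swaps in the standard library's `urllib.parse.urlsplit` to isolate the host instead of the regex above; the function name `host_labels` is hypothetical.

```python
from urllib.parse import urlsplit


def host_labels(url):
    """Return the host part of a URL split into its dot-separated labels."""
    # urlsplit only recognises the host when "//" precedes it,
    # so add one for bare inputs such as "www.example.com/path".
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host.split(".") if host else []


print(host_labels("https://www.example.co.uk/path?q=1"))
# ['www', 'example', 'co', 'uk']
print(host_labels("www.example.com:8080/x"))
# ['www', 'example', 'com']
```

Splitting alone does not say how many trailing labels belong to the suffix, so the resulting list would still need to be compared against a list of known TLDs.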
