Question
I need to match a host name, but without the TLD:
example.com =~ /regex/ => example
sub.example.com =~ /regex/ => sub.example
sub.sub.example.com =~ /regex/ => sub.sub.example
Any help with the regex? Thanks.
Answer 1:
Assuming your string is correctly formatted and doesn't include things like a protocol (e.g. http://), you need all characters up to, but not including, the final .tld.
The pattern below is the simplest way to get that. The trick with regular expressions is not to overcomplicate things:
.*(?=\.\w+)
This says: match as many characters as possible, as long as they are followed by a period and one or more word characters (for example, .xxx). Because .* is greedy and the lookahead is not part of the match, the result is everything before the last period.
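As an illustration, here is a minimal Python sketch of the lookahead pattern applied to the host names from the question. The helper name strip_tld is hypothetical, and the sketch assumes the input is a bare host name with no protocol or path.

```python
import re

# Greedy .* followed by a lookahead for ".xxx"; the lookahead text is not
# included in the match itself.
LOOKAHEAD_PATTERN = re.compile(r".*(?=\.\w+)")

def strip_tld(host):
    """Return host without its last dot-separated label (hypothetical helper)."""
    match = LOOKAHEAD_PATTERN.match(host)
    # If there is no ".label" suffix, there is nothing to strip.
    return match.group(0) if match else host

for host in ("example.com", "sub.example.com", "sub.sub.example.com"):
    print(host, "=>", strip_tld(host))
```

A name with no period at all, such as localhost, comes back unchanged in this sketch.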
If you don't have lookahead, it would probably be easiest to use:
(\w+\.)+
which gives you everything up to and including the final '.'; then just trim that trailing '.'.
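A rough Python sketch of this no-lookahead variant, assuming a bare host name as input; the function name strip_tld_no_lookahead is made up for the example.

```python
import re

def strip_tld_no_lookahead(host):
    """Match the leading "label." groups, then drop the trailing dot (hypothetical helper)."""
    match = re.match(r"(\w+\.)+", host)
    if match is None:
        return host  # no "label." prefix, so nothing to strip
    # group(0) ends with the final '.', which is trimmed here.
    return match.group(0).rstrip(".")
```

Note that \w does not match a hyphen, so a host such as my-host.example.com would need a class like [\w-]+ in place of \w+.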
Answer 2:
Try this
/.+(?=\.\w+$)/
Without support for lookahead (?=), it would be
/(.+)\.\w+$/
and then take the content of the first capture group.
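The capture-group form might look like this in Python. It is only a sketch; strip_tld_group is a hypothetical name, and the input is assumed to be a bare host name.

```python
import re

def strip_tld_group(host):
    """Capture everything before the final ".label" anchored at the end (hypothetical helper)."""
    match = re.search(r"(.+)\.\w+$", host)
    # Group 1 holds the part before the last period; fall back to the input if nothing matches.
    return match.group(1) if match else host
```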
Answer 3:
You could just strip off the tld:
s/\.[^\.]*$//;
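The substitution above is written in Perl-style syntax. A comparable Python sketch uses re.sub; the function name strip_tld_sub is hypothetical.

```python
import re

def strip_tld_sub(host):
    """Delete the last period and everything after it (hypothetical helper)."""
    # Same idea as the substitution above: match ".label" at the end of the
    # string and replace it with nothing.
    return re.sub(r"\.[^.]*$", "", host)
```

Inside a character class the period needs no escaping, so [^.] here is equivalent to [^\.] in the original.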
Answer 4:
(?<Domain>.*)\.(?<TLD>.*?)$
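For anyone trying the named-group version in Python: the re module spells named groups (?P<name>...) rather than (?<name>...). The sketch below assumes that syntax; split_host and the lowercase group names are choices made for this example only.

```python
import re

# Python's re module uses (?P<name>...) for named groups.
NAMED_PATTERN = re.compile(r"(?P<domain>.*)\.(?P<tld>.*?)$")

def split_host(host):
    """Split host into (everything before the last period, last label). Hypothetical helper."""
    match = NAMED_PATTERN.match(host)
    if match is None:
        return host, ""  # no period found, so there is no TLD part
    return match.group("domain"), match.group("tld")
```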
Answer 5:
(.*)\.
This isn't really specific to TLDs; it just captures everything before the last period in a line (in group 1). If you want to be strict about valid TLDs, it will have to be written differently.
Answer 6:
I'm not clear on how you want the match to work, but with the usual extended regex syntax you should be able to match a two- or three-letter TLD with [a-zA-Z]{2,3} (longer TLDs such as .info need a larger upper bound).
So if you're trying to get the whole name other than the TLD, something like
(.+)\.[a-zA-Z]{2,3}$
should be close. The name is then in the first capture group.
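A small Python sketch of the length-restricted idea, using (.+) for the captured name. The function name strip_short_tld is hypothetical, and returning None on no match is a choice made for this example.

```python
import re

# Capture the name, then require a 2-3 letter alphabetic label at the end.
SHORT_TLD_PATTERN = re.compile(r"(.+)\.[a-zA-Z]{2,3}$")

def strip_short_tld(host):
    """Return the part before a 2-3 letter TLD, or None if there is none (hypothetical helper)."""
    match = SHORT_TLD_PATTERN.search(host)
    return match.group(1) if match else None
```

With this restriction, a host ending in a four-letter label such as example.info does not match at all, which is the trade-off of being strict about TLD length.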
Source: https://stackoverflow.com/questions/836536/regex-match-a-hostname-not-including-the-tld