Regex match a hostname — not including the TLD

冷暖自知, submitted on 2019-12-22 10:18:23

Question


I need to match a hostname, but I don't want the TLD:

example.com =~ /regex/ => example

sub.example.com =~ /regex/ => sub.example

sub.sub.example.com =~ /regex/ => sub.sub.example

Any help with the regex? Thanks.


Answer 1:


Assuming your string is well formed and doesn't include things like a protocol (e.g. http://), you need all characters up to, but not including, the final .tld.

This is the simplest way to do it. The trick with regular expressions is not to overcomplicate things:

.*(?=\.\w+)

This says: give me all the characters that are followed by (for example) .xxx. Because .* is greedy, it returns everything before the last period.

If you don't have lookahead, it would probably be easiest to use:

(\w+\.)+

which will give you everything up to and including the final '.'. Then just trim the trailing '.'.
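As a rough illustration, both patterns from this answer can be tried with Python's re module. The function names host_with_lookahead and host_by_trimming are made up for this sketch, and it assumes the input is a bare hostname with no protocol or port.

```python
import re

# Lookahead version: greedy .* backtracks until it is followed by "." + word chars.
LOOKAHEAD = re.compile(r".*(?=\.\w+)")

# No-lookahead version: one or more "label." groups, so the match ends with a dot.
LABELS = re.compile(r"(\w+\.)+")

def host_with_lookahead(hostname):
    """Return the part before the final .tld, or None if there is no dot."""
    m = LOOKAHEAD.match(hostname)
    return m.group(0) if m else None

def host_by_trimming(hostname):
    """Match 'label.' groups from the start, then trim the trailing dot."""
    m = LABELS.match(hostname)
    return m.group(0).rstrip(".") if m else None

if __name__ == "__main__":
    for name in ("example.com", "sub.example.com", "sub.sub.example.com"):
        print(name, host_with_lookahead(name), host_by_trimming(name))
```

Note that \w does not match '-', so for a hostname like my-site.com the second function in this sketch returns None, while the lookahead version returns my-site.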




Answer 2:


Try this:

/.+(?=\.\w+$)/

Without support for the (?=...) lookahead, it would be:

/(.+)\.\w+$/

and then take the content of the first group.
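A minimal Python sketch of the capture-group variant, assuming a bare hostname as input. The helper name strip_tld is hypothetical. Because the pattern is anchored with $, trailing text such as a port number makes the match fail instead of producing a partial result.

```python
import re

# Group 1 is everything before the last "." that is followed only by word chars.
ANCHORED = re.compile(r"(.+)\.\w+$")

def strip_tld(hostname):
    """Return the hostname without its final label, or None if it does not match."""
    m = ANCHORED.match(hostname)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(strip_tld("sub.sub.example.com"))
    print(strip_tld("example.com:8080"))
```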




Answer 3:


You could just strip off the tld:

s/\.[^\.]*$//;
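The substitution above is written in Perl syntax. An approximate Python equivalent uses re.sub, shown here as a sketch; the function name drop_tld is invented for illustration.

```python
import re

def drop_tld(hostname):
    """Delete the last "." and everything after it; dot-free input is unchanged."""
    return re.sub(r"\.[^.]*$", "", hostname)

if __name__ == "__main__":
    print(drop_tld("sub.example.com"))
    print(drop_tld("localhost"))
```

Unlike the match-based approaches, a substitution never fails: input without a dot simply comes back unchanged.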



Answer 4:


(?<Domain>.*)\.(?<TLD>.*?)$
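Python's re module spells named groups as (?P<name>...) rather than (?<name>...), so a sketch of the same idea in Python might look like this. The lowercase group names and the split_host helper are choices made for this example only.

```python
import re

# Greedy "domain" takes everything up to the last dot; lazy "tld" takes the rest.
PATTERN = re.compile(r"(?P<domain>.*)\.(?P<tld>.*?)$")

def split_host(hostname):
    """Return (domain, tld), or None when the input contains no dot."""
    m = PATTERN.match(hostname)
    if m is None:
        return None
    return m.group("domain"), m.group("tld")

if __name__ == "__main__":
    print(split_host("sub.example.com"))
```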



Answer 5:


(.*)\.

This isn't really specific to TLDs; the first group just gives you everything before the last period in a line. If you want to be strict about valid TLDs, it will have to be written differently.
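One possible way to be strict, sketched in Python, is to build the pattern from an allow-list. The KNOWN_TLDS tuple below is a tiny hypothetical sample for illustration, not a real or complete list, and strip_known_tld is an invented name.

```python
import re

# Hypothetical allow-list for illustration only; the real set of TLDs is far larger.
KNOWN_TLDS = ("com", "org", "net")

STRICT = re.compile(
    r"(.*)\.(?:" + "|".join(re.escape(t) for t in KNOWN_TLDS) + r")$",
    re.IGNORECASE,
)

def strip_known_tld(hostname):
    """Return the part before the TLD only when the TLD is in KNOWN_TLDS."""
    m = STRICT.match(hostname)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(strip_known_tld("sub.example.com"))
    print(strip_known_tld("example.xyz"))
```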




Answer 6:


I'm not clear on how you want the match to work, but with the usual extended regex syntax you should be able to match two- or three-letter TLDs with [a-zA-Z]{2,3} (longer TLDs such as .info need a larger upper bound). So if you're trying to get the whole name other than the TLD, something like

(.*)\.[a-zA-Z]{2,3}$

should be close.
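To see what a length-limited TLD pattern accepts and rejects, here is a small Python sketch. It assumes the host part is captured with (.*) in extended syntax, and the name strip_short_tld is hypothetical.

```python
import re

# The TLD must be exactly two or three ASCII letters at the end of the string.
SHORT_TLD = re.compile(r"(.*)\.[a-zA-Z]{2,3}$")

def strip_short_tld(hostname):
    """Return the part before a 2-3 letter TLD, or None if the TLD does not fit."""
    m = SHORT_TLD.match(hostname)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(strip_short_tld("sub.example.com"))
    print(strip_short_tld("example.info"))
```

In this sketch example.info is rejected because its TLD has four letters, and example.co.uk yields example.co because only the final label is treated as the TLD.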



Source: https://stackoverflow.com/questions/836536/regex-match-a-hostname-not-including-the-tld
