public-suffix-list

Return root domain from url in R

一曲冷凌霜 submitted on 2020-01-14 15:01:19
Question: Given website addresses, e.g.

http://www.example.com/page1/#
https://subdomain.example2.co.uk/asdf?retrieve=2

how do I return the root domain in R, e.g.

example.com
example2.co.uk

For my purposes I would define the root domain to have the structure example_name.public_suffix, where example_name excludes "www" and public_suffix is on the list here: https://publicsuffix.org/list/effective_tld_names.dat

Is this still the best regex-based solution: https://stackoverflow.com/a/8498629/2109289

What
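To make the definition above concrete (root domain = example_name.public_suffix), here is a minimal base R sketch. It is not the linked Stack Overflow answer and not a complete Public Suffix List implementation. The function name `root_domain` and the three-entry `suffixes` vector are hypothetical stand-ins; in practice the suffixes would be read from the list at publicsuffix.org. The sketch ignores the list's wildcard (`*.`) and exception (`!`) rules, and it does not handle userinfo (`user@host`) or IPv6 hosts.

```r
# Hypothetical, tiny stand-in for the full Public Suffix List.
suffixes <- c("com", "uk", "co.uk")

# Return the label immediately left of the longest matching public suffix,
# joined with that suffix; NA if no suffix matches.
root_domain <- function(url, suffixes) {
  # Drop the scheme (e.g. "http://"), then everything from the first
  # "/", "?", "#" or ":" onward (path, query, fragment, port).
  host <- sub("^[a-zA-Z][a-zA-Z0-9+.-]*://", "", url)
  host <- tolower(sub("[/?#:].*$", "", host))

  labels <- strsplit(host, ".", fixed = TRUE)[[1]]
  n <- length(labels)
  if (n < 2) return(NA_character_)

  # i = 2..n: candidate suffixes from longest to shortest, so "co.uk"
  # is tried before "uk".
  for (i in seq_len(n - 1) + 1) {
    candidate <- paste(labels[i:n], collapse = ".")
    if (candidate %in% suffixes) {
      return(paste(labels[(i - 1):n], collapse = "."))
    }
  }
  NA_character_
}

root_domain("http://www.example.com/page1/#", suffixes)
# "example.com"
root_domain("https://subdomain.example2.co.uk/asdf?retrieve=2", suffixes)
# "example2.co.uk"
```

Because the longest suffix is tried first and only one label is kept to its left, "www" and any other subdomain labels fall away without needing a special case.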