Get specific subdomain from URL in foo.bar.car.com

清酒与你

Given a URL as follows:

foo.bar.car.com.au

I need to extract foo.bar.

I came across the following code:

pr


                      
7 answers

终归单人心 (2020-11-30 10:30)

In addition to the NuGet package Nager.PublicSuffix mentioned in another answer, there is also the NuGet package Louw.PublicSuffix. According to its GitHub project page, it is a .NET Core library that parses the Public Suffix List and is based on the Nager.PublicSuffix project, with the following changes:


Ported to .NET Core Library.
Fixed library so it passes ALL the comprehensive tests.
Refactored classes to split functionality into smaller focused classes.
Made classes immutable. Thus DomainParser can be used as a singleton and is thread safe.
Added WebTldRuleProvider and FileTldRuleProvider.
Added functionality to know whether a rule is an ICANN or a private domain rule.
Uses the async programming model.


The page also states that many of the above changes were submitted back to the original Nager.PublicSuffix project.
                                                                        
                                                        
            
            
              
                
没有蜡笔的小新 (2020-11-30 10:35)

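I would recommend using a regular expression. The following snippet extracts the first two labels of the host name, which is what you are looking for. Note that it is purely positional: it always takes the first two labels, wherever the registered domain actually starts.

using System.Text.RegularExpressions;

string input = "foo.bar.car.com.au";
// Match exactly two leading labels; [^.]+ also accepts hyphens, which \w would not.
var match = Regex.Match(input, @"^[^.]+\.[^.]+");
var output = match.Value; // "foo.bar"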

                                                                        
                                                        
            
            
              
                
生来不讨喜 (2020-11-30 10:37)

OK, first: are you specifically looking at 'com.au', or are these general Internet domain names? If it's the latter, the name alone gives no way to tell how much of it is a "site" or "zone" and how much is an individual "host" or other record within that zone.

If you need to figure that out for an arbitrary domain name, grab the list of public suffixes from the Mozilla Public Suffix project (http://publicsuffix.org) and use their algorithm to find the public suffix (what people loosely call the TLD, e.g. com.au) in your domain name. The label immediately before that suffix is the registered name, and the portion you want is everything to the left of it.
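
The lookup described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the full Public Suffix algorithm: the names SuffixSketch and SampleSuffixes and the five hard-coded entries are made up for the example, and the wildcard and exception rules of the real list are ignored.

```csharp
using System;
using System.Collections.Generic;

public static class SuffixSketch
{
    // Hypothetical hand-picked sample. The real list at publicsuffix.org is far longer
    // and also contains wildcard and exception rules, which this sketch does not handle.
    private static readonly HashSet<string> SampleSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "com", "au", "com.au", "uk", "co.uk"
        };

    // Returns the labels to the left of the registered name, or null if there are none
    // or if no suffix from the sample set matches.
    public static string GetSubdomain(string host)
    {
        string[] labels = host.Split('.');

        // Scan from the left so that the first hit is the longest matching suffix.
        for (int i = 0; i < labels.Length; i++)
        {
            string candidate = string.Join(".", labels, i, labels.Length - i);
            if (!SampleSuffixes.Contains(candidate))
                continue;

            // labels[i - 1] is the registered name; everything before it is the subdomain.
            return i >= 2 ? string.Join(".", labels, 0, i - 1) : null;
        }

        return null; // no known suffix
    }
}
```

With this sample set, SuffixSketch.GetSubdomain("foo.bar.car.com.au") yields foo.bar, while "car.com.au" yields null because nothing sits to the left of the registered name. In practice the suffix set would be loaded from the published list rather than hard-coded.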
                                                                        
                                                        
            
            
              
                
离开以前 (2020-11-30 10:42)

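Given your requirement (you want the first two levels, not including 'www.'), I'd approach it something like this:

using System;

// Returns the first two labels of the host (skipping a leading "www"), or null if they are not available.
private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;

        var nodes = host.Split('.');

        // Skip a leading "www" label so it is not counted as one of the two levels.
        int startNode = 0;
        if (nodes[0] == "www") startNode = 1;

        // Guard against hosts with fewer than two labels left (e.g. "localhost").
        if (nodes.Length < startNode + 2) return null;

        return string.Format("{0}.{1}", nodes[startNode], nodes[startNode + 1]);
    }

    // IP addresses and other non-DNS hosts have no subdomain.
    return null;
}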

                                                                        
                                                        
            
            
              
                
抹茶落季 (2020-11-30 10:45)

I faced a similar problem and, based on the preceding answers, wrote this extension method. Most importantly, it takes a parameter that defines the "root" domain, i.e. whatever the consumer of the method considers to be the root. In the OP's case, the call would be:

// Uri has no implicit conversion from string, and an absolute URI needs a scheme.
Uri uri = new Uri("http://foo.bar.car.com.au");
uri.DnsSafeHost.GetSubdomain("car.com.au"); // returns foo.bar
uri.DnsSafeHost.GetSubdomain(); // returns foo.bar.car


Here's the extension method (like any extension method, it must be declared inside a static class):

/// <summary>Gets the subdomain portion of a host name, given a known "root" domain</summary>
public static string GetSubdomain(this string url, string domain = null)
{
  var subdomain = url;
  if(subdomain != null)
  {
    if(domain == null)
    {
      // Since we were not provided with a known domain, assume that second-to-last period divides the subdomain from the domain.
      var nodes = url.Split('.');
      var lastNodeIndex = nodes.Length - 1;
      if(lastNodeIndex > 0)
        domain = nodes[lastNodeIndex-1] + "." + nodes[lastNodeIndex];
      else
        return null; // A single label (e.g. "localhost") has no subdomain.
    }

    // Verify that what we think is the domain is truly the ending of the hostname... otherwise we're hooped.
    if (!subdomain.EndsWith(domain, StringComparison.OrdinalIgnoreCase))
      throw new ArgumentException("Site was not loaded from the expected domain");

    // Cut the domain off the end only; Replace() would also remove any earlier occurrence of the same text.
    // This should leave us with the subdomain and a trailing dot IF there is a subdomain.
    subdomain = subdomain.Substring(0, subdomain.Length - domain.Length);
    // Check if we have anything left.  If we don't, there was no subdomain, the request was directly to the root domain:
    if (string.IsNullOrWhiteSpace(subdomain))
      return null;

    // Quash any trailing periods
    subdomain = subdomain.TrimEnd(new[] {'.'});
  }

  return subdomain;
}

                                                                        
                                                        
            
            
              
                
别那么骄傲 (2020-11-30 10:49)

You can use the NuGet package Nager.PublicSuffix. It uses Mozilla's Public Suffix List to split the domain.

PM> Install-Package Nager.PublicSuffix


Example

 var domainParser = new DomainParser();
 var data = await domainParser.LoadDataAsync();
 var tldRules = domainParser.ParseRules(data);
 domainParser.AddRules(tldRules);

 var domainName = domainParser.Get("sub.test.co.uk");
 //domainName.Domain = "test";
 //domainName.Hostname = "sub.test.co.uk";
 //domainName.RegistrableDomain = "test.co.uk";
 //domainName.SubDomain = "sub";
 //domainName.TLD = "co.uk";

                                                                        
                                                        
            
            
              
                