How to recognize Facebook User-Agent

隐瞒了意图╮ 2020-11-27 15:35

When one of my pages is shared on Facebook, I want to display something different. The problem is that I would rather not use the og: elements; I want to recognize the Facebook user agent instead.

What is

11 answers
  • 2020-11-27 15:38

    The short solution is to check the user agent for a pattern, so that ordinary users are not sent the extra markup on every request.

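    <?php
        // Facebook-optimized markup; $buffer is assumed to be defined earlier.
        $agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
        // stripos() is case-insensitive, so "FacebookExternalHit" matches too.
        if (stripos($agent, 'facebookexternalhit') !== false) {
            $buffer .= '<link rel="image_src" href="images/site_thumbnail.png" />';
        }
    ?>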
    
  • 2020-11-27 15:39

    The Facebook crawler sends one of these User-Agent strings:

    FacebookExternalHit/1.1
    FacebookExternalHit/1.0
    

    or

    facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
    facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
    

    Note that the version numbers might change. So use a regular expression to find the crawler name and then display your content.

    Update:

    You can use this code in PHP to check for Facebook User Agent

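    $agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    // Match only the crawler name at the start, so any version number passes.
    if (preg_match('/^FacebookExternalHit\//i', $agent)) {
        print "Facebook User-Agent";
        // process here for Facebook
    }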
    

    Here is ASP.NET code. You can use this function to check if the userAgent is Facebook's useragent.

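    public static bool IsFacebook(string userAgent)
    {
        // Case-insensitive substring check; a missing header is not Facebook.
        return userAgent != null
            && userAgent.IndexOf("facebookexternalhit", System.StringComparison.OrdinalIgnoreCase) >= 0;
    }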
    

    Note:

    Why would you need to do that? When you share a link to your site on Facebook, Facebook crawls the page and parses it to get the thumbnail, the title and some content to display, but the post still links back to your site.

    Also, I think this amounts to cloaking, i.e. showing different data to users and to crawlers. Cloaking is not considered good practice, and many search engines and sites take note of it.

    Update: Facebook also added a new user agent as of May 28th, 2014:

    Facebot
    

    You can read more about the facebook crawler on https://developers.facebook.com/docs/sharing/webmasters/crawler
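
Since both facebookexternalhit and Facebot are mentioned above, here is a minimal sketch of one check that covers the two names. The helper name isFacebookCrawler() is made up for illustration and is not part of any Facebook API. The sketch assumes that matching the crawler name anywhere in the string is acceptable.

```php
<?php
// Sketch: returns true when the User-Agent names one of Facebook's crawlers.
// isFacebookCrawler() is a hypothetical helper, not a Facebook API.
function isFacebookCrawler(?string $userAgent): bool
{
    if ($userAgent === null || $userAgent === '') {
        return false;
    }
    // Match the crawler name only, so a version bump (1.1 -> 1.2) still matches.
    return preg_match('/facebookexternalhit|facebot/i', $userAgent) === 1;
}

// Usage: fall back to an empty string when the header is absent.
$agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (isFacebookCrawler($agent)) {
    // serve the Facebook-specific markup here
}
```

Anyone can send this header, so a match only means the client claims to be Facebook.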

  • 2020-11-27 15:42

    First, you should not use in_array: it needs the full user-agent string rather than a substring, so it breaks as soon as the string changes (for example, a version 1.2 from Facebook would not match if you follow the currently preferred answer). Iterating through an array is also slower than matching a single regex pattern.

    You will no doubt want to look for more bots later, so the example below uses two bot names separated by the pipe symbol (|) in one pattern. The /i at the end makes it case-insensitive.

    Also, do not use $_SERVER['HTTP_USER_AGENT'] directly; filter it first in case someone has put something nasty in there.

    $pattern = '/(FacebookExternalHit|GoogleBot)/i';
    // Cast to string so a missing header becomes '' instead of null/false.
    $agent = (string) filter_input(INPUT_SERVER, 'HTTP_USER_AGENT', FILTER_SANITIZE_ENCODED);
    if (preg_match($pattern, $agent)) {
        echo "found one of the patterns";
    }
    

    A bit safer and faster code.
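
If the list of bots grows, the pipe-separated pattern can be built from an array instead of being typed by hand. The following is only a sketch of that idea. The names $botNames and matchBot() are invented for this example, and the list of bots is an assumption to adjust.

```php
<?php
// Sketch: build one case-insensitive pattern from a list of crawler names.
// $botNames and matchBot() are illustrative names, not part of any library.
$botNames = ['FacebookExternalHit', 'Facebot', 'GoogleBot'];

function matchBot(string $userAgent, array $botNames): ?string
{
    // preg_quote() escapes regex metacharacters in each name; '/' is the delimiter.
    $quoted = array_map(function ($name) {
        return preg_quote($name, '/');
    }, $botNames);
    $pattern = '/(' . implode('|', $quoted) . ')/i';

    // Return the matched name as it appears in the User-Agent, or null.
    return preg_match($pattern, $userAgent, $m) === 1 ? $m[1] : null;
}
```

Calling matchBot($agent, $botNames) returns the matched text, which also tells you which bot it was.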

  • 2020-11-27 15:48

    Another generic approach in PHP

    $agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    $agent = trim($agent);
    $agent = strtolower($agent);
    if (
        strpos($agent, 'facebookexternalhit/1.1') === 0
        || strpos($agent, 'facebookexternalhit/1.0') === 0
    ) {
        // probably facebook
    } else {
        // probably not facebook
    }
    
  • 2020-11-27 15:50

    Facebook User-Agents are:

    FacebookExternalHit/1.1
    FacebookExternalHit/1.0
    facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
    facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
    facebookexternalhit/1.0 (+https://www.facebook.com/externalhit_uatext.php)
    facebookexternalhit/1.1 (+https://www.facebook.com/externalhit_uatext.php)
    

    I'm using the code below to detect FB User-Agent in PHP and it works as intended:

    $agent = $_SERVER['HTTP_USER_AGENT'];
    if(stristr($agent, 'FacebookExternalHit')){
        //Facebook User-Agent
    }else{
        //Other User-Agent
    }
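
The strings listed above differ only in the version number and the URL suffix. If the version matters to you, it can be captured with a regex. This is a rough sketch, and facebookCrawlerVersion() is a hypothetical helper name. It assumes the version is made of digits and dots directly after "facebookexternalhit/".

```php
<?php
// Sketch: pull the version out of a facebookexternalhit User-Agent.
// facebookCrawlerVersion() is a hypothetical helper name.
function facebookCrawlerVersion(string $userAgent): ?string
{
    // Capture the digits-and-dots that follow "facebookexternalhit/".
    if (preg_match('/^facebookexternalhit\/(\d+(?:\.\d+)*)/i', trim($userAgent), $m) === 1) {
        return $m[1];
    }
    // Not a facebookexternalhit User-Agent, or no version present.
    return null;
}
```

A null result means either a different client or a format this pattern does not expect.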
    
  • 2020-11-27 15:51

    If you want to block the Facebook bot from accessing your website (assuming you're using Apache), add this to your .htaccess file:

    <Limit GET POST>
    BrowserMatchNoCase "Feedfetcher-Google" feedfetcher
    BrowserMatchNoCase "facebookexternalhit" facebook
    order deny,allow
    deny from env=feedfetcher
    deny from env=facebook
    </Limit>
    

    This also blocks Google's Feedfetcher, which can likewise be used for cheap DDoS attacks.
