Scraping Real Time Visitors from Google Analytics

梦谈多话 2020-12-05 02:44

I have a lot of sites and want to build a dashboard that shows the number of real-time visitors for each of them on a single page. (Would anyone else want this?) Right now the o

4 Answers
  • 2020-12-05 03:25

    In Google Chrome, I can see the data in the Network panel.

    The request endpoint is https://www.google.com/analytics/realtime/bind

    It seems the connection stays open for about 2.5 minutes, and during that time it keeps receiving more and more data.

    After about 2.5 minutes the connection is closed and a new one is opened.

    In the Network panel you can only see the data for connections that have been terminated, so leave it open for 5 minutes or so and the data will start to appear.

    I hope that can give you a place to start.

  • 2020-12-05 03:26

    Having Google in the loop seems redundant. I suggest serving a common element on demand from the dashboard server and including it by absolute URL on every page you want to monitor for a given site. The script that outputs the element can read the IP address of the requesting browser; these addresses can be logged to a database and filtered for uniqueness, giving a real-time head count.

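    <?php
    // IP address of the browser requesting the tracking image.
    $user_ip = $_SERVER['REMOTE_ADDR'];
    // Some MySQL to insert $user_ip into the database table for website XXX goes here.

    // Serve the tracking image and forbid caching so every page view reaches this script.
    $file = 'tracking_image.gif';
    $type = 'image/gif';
    header('Content-Type: ' . $type);
    header('Content-Length: ' . filesize($file));
    header('Cache-Control: no-store');
    readfile($file);
    ?>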
    

    Addendum: A database can also add a timestamp to every row it stores. This can be used to filter the results further and report the number of visitors in the last hour or minute.
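
    The timestamp filtering can be sketched as follows. This is a minimal illustration in Python rather than PHP/MySQL; the VisitLog class and its method names are hypothetical, and an in-memory list stands in for the database table.

```python
import time
from collections import defaultdict


class VisitLog:
    """In-memory stand-in for the tracking table (hypothetical helper)."""

    def __init__(self):
        # site name -> list of (timestamp, ip) rows, in insertion order
        self._rows = defaultdict(list)

    def record(self, site, ip, timestamp=None):
        """Store one hit, as the tracking-image script would on each request."""
        if timestamp is None:
            timestamp = time.time()
        self._rows[site].append((timestamp, ip))

    def unique_visitors(self, site, window_seconds, now=None):
        """Count distinct IPs seen for `site` in the last `window_seconds`."""
        if now is None:
            now = time.time()
        cutoff = now - window_seconds
        return len({ip for ts, ip in self._rows[site] if ts >= cutoff})


log = VisitLog()
log.record("example.com", "203.0.113.5", timestamp=1000)
log.record("example.com", "203.0.113.5", timestamp=1030)  # same visitor again
log.record("example.com", "198.51.100.7", timestamp=1050)
log.record("example.com", "192.0.2.9", timestamp=100)     # too old to count

print(log.unique_visitors("example.com", window_seconds=60, now=1060))  # 2
```

    Counting by IP address treats several visitors behind one shared address as a single visitor, so the result is a lower bound.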

    Client-side JavaScript with Ajax, for fine tuning (or overkill): the onblur and onfocus JavaScript event handlers can be used to tell whether the page is visible, and the result can be passed back to the dashboard server via Ajax. http://www.thefutureoftheweb.com/demo/2007-05-16-detect-browser-window-focus/

    When a visitor closes a page, this can also be detected with the JavaScript onunload handler in the body tag, and Ajax can be used to send data back to the server one last time before the browser finally closes the page.
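
    On the server side, the focus, blur, and unload notifications could be folded into a set of visitors who currently have the page visible. A rough Python sketch follows; the event names and the visitor_id value are assumptions about what the page's JavaScript would send, not part of any existing API.

```python
class VisibleVisitors:
    """Tracks which visitors currently have the page focused (hypothetical)."""

    def __init__(self):
        self._visible = set()

    def handle_event(self, visitor_id, event):
        """Apply one event reported by the page via Ajax."""
        if event == "focus":
            self._visible.add(visitor_id)
        elif event in ("blur", "unload"):
            # discard() does not raise if the visitor was never marked visible
            self._visible.discard(visitor_id)
        else:
            raise ValueError(f"unknown event: {event!r}")

    def count(self):
        """Number of visitors whose page is currently visible."""
        return len(self._visible)


tracker = VisibleVisitors()
for visitor_id, event in [
    ("a", "focus"),
    ("b", "focus"),
    ("a", "blur"),    # visitor a switched tabs
    ("c", "focus"),
    ("b", "unload"),  # visitor b closed the page
]:
    tracker.handle_event(visitor_id, event)

print(tracker.count())  # 1
```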

    You may also wish to collect some information about the visitor, as Google Analytics does. The page https://panopticlick.eff.org/ has a lot of JavaScript that can be examined and adapted.

  • 2020-12-05 03:42

    For this purpose, Google has launched a new Real Time API. With it you can retrieve the number of real-time online visitors, as well as several other Google Analytics values, using the following dimensions and metrics: https://developers.google.com/analytics/devguides/reporting/realtime/dimsmets/

    It is quite similar to the Google Analytics API. To start developing with it, see https://developers.google.com/analytics/devguides/reporting/realtime/v3/devguide
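
    A request to that API might look roughly like the Python sketch below. The endpoint, the ids/metrics/dimensions parameter names, and the rt:activeUsers metric are written from memory of the v3 guide and should be checked against it; realtime_url and fetch_realtime are hypothetical helper names, and obtaining the OAuth 2.0 access token is left out.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as recalled from the v3 developer guide; verify before relying on it.
REALTIME_ENDPOINT = "https://www.googleapis.com/analytics/v3/data/realtime"


def realtime_url(profile_id, metrics="rt:activeUsers", dimensions=None):
    """Build the request URL for one view (profile) ID."""
    params = {"ids": f"ga:{profile_id}", "metrics": metrics}
    if dimensions:
        params["dimensions"] = dimensions
    return REALTIME_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_realtime(profile_id, access_token, **kwargs):
    """Send the request with a bearer token and return the decoded JSON."""
    request = urllib.request.Request(
        realtime_url(profile_id, **kwargs),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Only builds the URL; no network request is made here.
print(realtime_url("12345678"))
```

    For a multi-site dashboard, fetch_realtime would be called once per profile ID and the results collected on one page.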

  • 2020-12-05 03:44

    I needed/wanted realtime data for personal use so I reverse-engineered their system a little bit.

    Instead of binding to /bind I get data from /getData (no pun intended).

    At /getData the minimum request is apparently: https://www.google.com/analytics/realtime/realtime/getData?pageId&key={{propertyID}}&q=t:0|:1

    Here's a short explanation of the possible query parameters and syntax. Please remember that these are all guesses and that I don't know all of them:

    Query Syntax: pageId&key=propertyID&q=dataType:dimensions|:page|:limit:filters

    Values:

    pageId: Required, but seems to be used only for internal analytics.
    
    propertyID: a{{accountID}}w{{webPropertyID}}p{{profileID}}, as specified at the Documentation link below. You can also find this in the URL of all analytics pages in the UI.
    
    
    dataType:
        t: Current data
        ot: Overtime/Past
        c: Unknown, returns only a "count" value
    
    
    dimensions (| separated or alone), most values are only applicable for t:
        1:  Country
        2:  City
        3:  Location code?
        4:  Latitude
        5:  Longitude
        6:  Traffic source type (Social, Referral, etc.)
        7:  Source
        8:  ?? Returns (not set)
        9:  Another location code? longer.
        10: Page URL
        11: Visitor Type (new/returning)
        12: ?? Returns (not set)
        13: ?? Returns (not set)
        14: Medium
        15: ?? Returns "1"
    
    page:
        At first this seems to work for pagination but after further analysis it looks like it's also used to specify which of the 6 pages (Overview, Locations, Traffic Sources, Content, Events and Conversions) to return data for.
    
        For some reason 0 returns an impossibly high metrictotal
    
    limit: Result limit per page, maximum of 50
    
    filters:
        Syntax is as specified at the Documentation 2 link below, except that OR is specified using | instead of a comma. Example: 6==CUSTOM;1==United%20States
    


    You can also combine multiple queries in one request by separating them with commas (e.g. q=t:1|2|:1|:10,t:6|:1|:10).

    Following the above "documentation", if you wanted to request the page URL and city of the top 10 active visitors with a traffic source type of CUSTOM located in the US, you would use this URL: https://www.google.com/analytics/realtime/realtime/getData?key={{propertyID}}&pageId&q=t:10|2|:1|:10:6==CUSTOM;1==United%20States
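
    The guessed syntax can be captured in a small URL builder. This Python sketch only assembles strings according to the notes above; property_id, build_query, and build_url are hypothetical helper names, and since /getData is an undocumented endpoint, nothing here is guaranteed to match what the server accepts.

```python
import urllib.parse

BASE_URL = "https://www.google.com/analytics/realtime/realtime/getData"


def property_id(account_id, web_property_id, profile_id):
    """Format the key as a{{accountID}}w{{webPropertyID}}p{{profileID}}."""
    return f"a{account_id}w{web_property_id}p{profile_id}"


def build_query(data_type, dimensions, page, limit=None, filters=None):
    """Build one q clause: dataType:dimensions|:page|:limit:filters."""
    clause = f"{data_type}:{'|'.join(str(d) for d in dimensions)}|:{page}"
    if limit is not None:
        clause += f"|:{limit}"
    if filters:
        # Percent-encode spaces and the like, but keep the filter operators.
        clause += ":" + urllib.parse.quote(filters, safe="=;|")
    return clause


def build_url(key, queries):
    """Join several q clauses with commas into one request URL."""
    return f"{BASE_URL}?pageId&key={key}&q={','.join(queries)}"


# Page URL (10) and city (2), page 1, top 10, CUSTOM traffic from the US.
query = build_query("t", [10, 2], page=1, limit=10,
                    filters="6==CUSTOM;1==United States")
print(query)
print(build_url(property_id(1, 2, 3), [query]))
```

    The placeholder IDs 1, 2, and 3 stand in for real account, web property, and profile IDs.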


    Documentation

    Documentation 2


    I hope that my answer is readable and (although it's a little late) sufficiently answers your question and helps others in the future.
