How to calculate Session and Session duration in Firebase Analytics raw data?

前端 未结 3 1319
抹茶落季
抹茶落季 2020-11-29 08:48

How to calculate Session Duration in Firebase analytics raw data which is linked to BigQuery?

I have used the following blog to calculate the users

相关标签:
3条回答
  • 2020-11-29 09:21

    First you need to define a session - in the following query I'm going to break a session whenever a user is inactive for more than 20 minutes.

    Now, to find all sessions with SQL you can use a trick described at https://blog.modeanalytics.com/finding-user-sessions-sql/.

    The following query finds all sessions and their lengths:

    #standardSQL
    
    SELECT app_instance_id, sess_id, MIN(min_time) sess_start, MAX(max_time) sess_end, COUNT(*) records, MAX(sess_id) OVER(PARTITION BY app_instance_id) total_sessions,
       (ROUND((MAX(max_time)-MIN(min_time))/(1000*1000),1)) sess_length_seconds
    FROM (
      SELECT *, SUM(session_start) OVER(PARTITION BY app_instance_id ORDER BY min_time) sess_id
      FROM (
        SELECT *, IF(
                    previous IS null 
                    OR (min_time-previous)>(20*60*1000*1000),  # sessions broken by this inactivity 
                    1, 0) session_start 
                    #https://blog.modeanalytics.com/finding-user-sessions-sql/
        FROM (
          SELECT *, LAG(max_time, 1) OVER(PARTITION BY app_instance_id ORDER BY max_time) previous
          FROM (
            SELECT user_dim.app_info.app_instance_id
              , (SELECT MIN(timestamp_micros) FROM UNNEST(event_dim)) min_time
              , (SELECT MAX(timestamp_micros) FROM UNNEST(event_dim)) max_time
            FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160601`
          )
        )
      )
    )
    GROUP BY 1, 2
    ORDER BY 1, 2
    

    0 讨论(0)
  • 2020-11-29 09:29

    As you know, Google has changed the schema of BigQuery firebase databases: https://support.google.com/analytics/answer/7029846

    Thanks to @Felipe answer, the new format will be changed as follow:

    SELECT SUM(total_sessions) AS Total_Sessions, AVG(sess_length_seconds) AS Average_Session_Duration
    FROM (
      SELECT user_pseudo_id, sess_id, MIN(min_time) sess_start, MAX(max_time) sess_end, COUNT(*) records, 
        MAX(sess_id) OVER(PARTITION BY user_pseudo_id) total_sessions,
        (ROUND((MAX(max_time)-MIN(min_time))/(1000*1000),1)) sess_length_seconds
      FROM (
        SELECT *, SUM(session_start) OVER(PARTITION BY user_pseudo_id ORDER BY min_time) sess_id
        FROM (
          SELECT *, IF(previous IS null OR (min_time-previous) > (20*60*1000*1000), 1, 0) session_start 
          FROM (
            SELECT *, LAG(max_time, 1) OVER(PARTITION BY user_pseudo_id ORDER BY max_time) previous
            FROM (SELECT user_pseudo_id, MIN(event_timestamp) AS min_time, MAX(event_timestamp) AS max_time
              FROM `dataset_name.table_name` GROUP BY user_pseudo_id)
          )
        )
      )
      GROUP BY 1, 2
      ORDER BY 1, 2
    )
    

    Note: change dataset_name and table_name based on your project info

    Sample result:

    0 讨论(0)
  • 2020-11-29 09:31

    With the new schema of Firebase in BigQuery, I found that the answer by @Maziar did not work for me, but I am not sure why. Instead I have used the following to calculate it, where a session is defined as a user engaging with your app for a minimum of 10 seconds and where the session stops if a user doesn't engage with the app for 30 minutes. It provides total number of sessions and the session length in minutes, and it is based on this query: https://modeanalytics.com/modeanalytics/reports/5e7d902f82de/queries/2cf4af47dba4

    SELECT COUNT(*) AS sessions,
           AVG(length) AS average_session_length
      FROM (
      
    SELECT global_session_id,
           (MAX(event_timestamp) - MIN(event_timestamp))/(60 * 1000 * 1000) AS length
      FROM (
    SELECT user_pseudo_id,
           event_timestamp,
           SUM(is_new_session) OVER (ORDER BY user_pseudo_id, event_timestamp) AS global_session_id,
           SUM(is_new_session) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS user_session_id
      FROM (
           SELECT *,
                  CASE WHEN event_timestamp - last_event >= (30*60*1000*1000) 
                         OR last_event IS NULL 
                       THEN 1 ELSE 0 END AS is_new_session
             FROM (
                  SELECT user_pseudo_id,
                         event_timestamp,
                         LAG(event_timestamp,1) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS last_event
                    FROM `dataset.events_2019*`
                  ) last
           ) final
           ) session
     GROUP BY 1
           
           ) agg
    WHERE length >= (10/60)

    0 讨论(0)
提交回复
热议问题