Question
Using GA BigQuery data, I am trying to calculate the total pageviews across 3 dimensions: date, device category, and a custom dimension (called "type" here).
The desired output lists the total pageviews for each combination of date, device, and type.
I used the following query to get this result. I need to unnest the "type" dimension because it is a custom dimension.
#standardsql
SELECT date, device, cd6_type, SUM(pvs) AS pageviews
FROM (
  SELECT
    date,
    fullvisitorID,
    visitID,
    totals.pageviews AS pvs,
    device.deviceCategory AS device,
    MAX(IF(hcd.index = 6, hcd.value, NULL)) AS cd6_type
  FROM `ga360-173318.62903073.ga_sessions_*` AS t,
    UNNEST (t.hits) AS h,
    UNNEST (h.customDimensions) AS hcd
  WHERE _table_suffix BETWEEN (SELECT FORMAT_DATE('%Y%m%d', '2019-07-08'))
    AND (SELECT FORMAT_DATE('%Y%m%d', '2019-07-08'))
    AND h.type = "PAGE"
  GROUP BY
    date,
    fullVisitorID,
    visitID,
    totals.pageviews,
    device
)
GROUP BY date, device, cd6_type
The problem is that my results do not match what appears in GA; the query returns lower counts. In GA, the figures are:
- 180,812 mobile, Type A pageviews (compared to 149,149 in GBQ)
- 30,949 tablet, Type A pageviews (compared to 16,863 in GBQ)
I'm not sure why they don't match across the 2 systems, and am wondering how others calculate total pageviews across dimensions.
Answer 1:
You're cross joining with customDimensions, so you're not counting pages but custom dimensions on pages. Just don't do this cross join; you don't need it if you get your custom dimension using a subquery.
#standardsql
SELECT
  date,
  device.deviceCategory AS device,
  (SELECT hcd.value FROM h.customdimensions AS hcd WHERE hcd.index = 6) AS cd6_type,
  COUNT(1) AS pageviews
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*` AS t,
  UNNEST(t.hits) AS h
WHERE _table_suffix BETWEEN '20170801' AND '20170801'
  AND h.type = "PAGE"
GROUP BY date, device, cd6_type
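To see how the cross join changes the row count, here is a minimal sketch on made-up inline data. The `hits` CTE, its `hit_id` column, and all values are hypothetical stand-ins for unnested PAGE hits, not taken from the GA export. It keeps one row per hit via the scalar subquery and shows, next to it, how many rows each hit would contribute if it were comma-joined with its custom dimensions.

```sql
#standardsql
-- Hypothetical inline data standing in for unnested PAGE hits.
WITH hits AS (
  SELECT 1 AS hit_id,
         [STRUCT(6 AS index, 'Type A' AS value)] AS customDimensions
  UNION ALL
  SELECT 2,
         [STRUCT(6 AS index, 'Type A' AS value),
          STRUCT(7 AS index, 'x' AS value),
          STRUCT(8 AS index, 'y' AS value)]
  UNION ALL
  -- A hit that carries no custom dimensions at all.
  SELECT 3, ARRAY<STRUCT<index INT64, value STRING>>[]
)
SELECT
  hit_id,
  -- Scalar subquery: at most one value per hit, NULL when index 6 is absent.
  (SELECT hcd.value
   FROM UNNEST(h.customDimensions) AS hcd
   WHERE hcd.index = 6) AS cd6_type,
  -- Number of rows this hit would yield under ", UNNEST(h.customDimensions)".
  (SELECT COUNT(*) FROM UNNEST(h.customDimensions)) AS rows_if_cross_joined
FROM hits AS h
ORDER BY hit_id
```

In this sketch the third hit would contribute zero rows under the cross join, so it would drop out of any count, while the second hit would contribute three rows. With the scalar subquery, each hit stays exactly one row, so COUNT(1) counts pageviews.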
Source: https://stackoverflow.com/questions/56992924/ga-bigquery-calculating-pageviews-with-a-custom-dimension