SQL Monitoring for Tracking Events (埋点)

Submitted by 懵懂的女人 on 2019-12-26 00:47:43

================================================================================

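Background: the SQL monitor is hooked into dqc, where the date comes from a function built into the cloud platform. When debugging locally in ODPS, that platform function is not available, so the query has to compute "current date minus one day" itself, in yyyymmdd format, e.g. 20191213.

The DATE_FORMAT(NOW(),'%Y-%m-%d') function in MySQL

Previous day's date: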

DATE_FORMAT(adddate(now(),-1),'%Y%m%d')
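
As a cross-check for the value this expression should produce, the same "yesterday as yyyymmdd" string can be computed outside the database. This is only a minimal Python sketch; the helper name `yesterday_yyyymmdd` is made up for illustration and is not part of MySQL or ODPS.

```python
from datetime import date, timedelta


def yesterday_yyyymmdd(today=None):
    """Return the day before `today` formatted as yyyymmdd (hypothetical helper)."""
    if today is None:
        today = date.today()
    return (today - timedelta(days=1)).strftime("%Y%m%d")


# A fixed input keeps the result reproducible.
print(yesterday_yyyymmdd(date(2019, 12, 13)))  # 20191212
```

Passing the date explicitly makes it easy to compare against a known partition value while debugging.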

 

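1. Format:
DATE_FORMAT(date,format) displays date or time data in different styles.
1.1 Parameters: date is a valid date;
format specifies the output format of the date/time.
2. Reference:
DATE_FORMAT(NOW(),'%Y-%m-%d') converts the format:

SELECT DATE_FORMAT(NOW(),'%Y-%m-%d') AS 'date'

The output looks like 2019-12-12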

 

If you need the date in the form 20191212 instead:

mysql> select date_format(current_timestamp, '%Y%m%d');
+------------------------------------------+
| date_format(current_timestamp, '%Y%m%d') |
+------------------------------------------+
| 20191224                                 |
+------------------------------------------+

 

Frustratingly, ODPS does not support these MySQL functions, so I tried:

select DATEADD(GETDATE(), -1, 'dd') from database.table_name limit 1;

select to_char(DATEADD(GETDATE(), -1, 'dd'),'yyyymmdd') from database.table_name limit 1;

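But the second SQL did not work. What I wanted was a value like 20191212, while the first statement returns a value like 20191212 15:11:20, and the second one came back as yy1212.

Cause (as observed in my environment): the single quotes '' were the problem; they should be double quotes "". Switching to the statement below fixed it.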

select to_char(dateadd(GETDATE(),-1,'dd'),"yyyymmdd") from database.table_name limit 1;

A second way to write it:

select  replace(split_part(dateadd(from_unixtime(unix_timestamp()),-1,'dd')," ",1),"-","") from database.table_name
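
To make the chain of string functions in this second form easier to follow, here is a rough Python sketch that mirrors each step. It assumes the shifted datetime is rendered as text in the form `yyyy-mm-dd hh:mi:ss` before `split_part` is applied; the function name `ds_from_datetime_string` is hypothetical.

```python
from datetime import datetime, timedelta


def ds_from_datetime_string(now):
    """Mirror dateadd -> split_part -> replace from the SQL above (illustrative only)."""
    shifted = now - timedelta(days=1)                # dateadd(now, -1, 'dd')
    as_text = shifted.strftime("%Y-%m-%d %H:%M:%S")  # assumed textual form of the datetime
    date_part = as_text.split(" ")[0]                # split_part(text, " ", 1)
    return date_part.replace("-", "")                # replace(text, "-", "")


print(ds_from_datetime_string(datetime(2019, 12, 13, 15, 11, 20)))  # 20191212
```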

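Reflection: MySQL gets this done with a single function, while ODPS needs several, which is slower and more tedious. Even so, you should not rely only on the platform's built-in parameter, ds='${bizdate}':

1. Your technical skills will waste away.
2. You will not be able to find a job when you go out to interview.
3. Do not depend on point-and-click configuration: once you leave the platform there is nothing left to learn from or improve on, so keep thinking for yourself.

Hard-coding the date for local debugging is not wise, because the table keeps a week of data. For usability and maintainability it is better to write it as a function. Do not cut corners; take the long view.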

 

 

The final SQL is shown below.

===========================================================================

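Background:
Tracking in this business line is complex, with 130+ regex-validated tracking cases. The UT platform often drops its connection and every regression run is costly. There is no online monitoring, so by the time BI notices a problem at T+1 it is already too late and the overall dashboard data has been affected. Since QA knows best which fields apply in each search tracking scenario, we built tracking monitoring on ODPS. S1 test approach:
1. Search tracking has 21 exposure cases and 64 click cases.
Among them, one exposure case covers one or more cards, and one click case covers multiple cards (114 cases).
// Count how many times K appears in the value of a MySQL field. For example, with the value "K3-22,K8-9,K10-2 view details, second-level page", count the occurrences of K; each K is one case.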

SELECT widget_name, sum(LENGTH(widget_name) - LENGTH( REPLACE(widget_name,'K','')))

from table_name where event_id=2101 and pd_emp_id in ("11","12","13","14","15" ,"16") ORDER BY gmt_modified DESC
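
The LENGTH minus LENGTH(REPLACE(...)) trick in this query counts occurrences by measuring how many characters disappear when the marker is removed. A small Python sketch of the same idea, using the example value from the comment above (the function name `count_marker` is made up, and a non-empty marker is assumed):

```python
def count_marker(value, marker="K"):
    """Count occurrences of `marker` via the length difference after removing it."""
    removed = len(value) - len(value.replace(marker, ""))
    return removed // len(marker)  # divide so multi-character markers also work


sample = "K3-22,K8-9,K10-2"
print(count_marker(sample))  # 3
```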

 

1.1 Fields validated for exposure cases: soku_test_ab, engine, item_log, aaid, k, track_info, source_from, search_from

The specific rules are as follows, for example:
soku_test_ab under track_info: regex [a-z]{1}
engine under track_info: regex \S*
item_log under track_info: regex \S*
k under track_info: regex \S*
aaid under track_info: regex [0-9a-z]{32}
source_from under track_info: regex (home|discover|vip)
search_from under track_info: regex ^[1-9]\d?$|^1[01]\d$|^10$
track_info: must not be empty
spm: regex a2h0c.8166622.PhoneXXPictureTab\d*.channeltab\d*(;|,)*
scm: regex 20140669.search.rgroupset.filter\S*

1.2 Fields validated for click cases: soku_test_ab, engine, item_log, aaid, k, track_info, spm, scm

Result page:
soku_test_ab under track_info: regex [a-z]{1}
engine under track_info: regex \S*
item_log under track_info: regex \S*
source_from under track_info: regex (home|discover|vip)
search_from under track_info: regex ^[1-9]\d?$|^1[01]\d$|^10$
track_info: must not be empty
spm: regex a2h0c.8166622.PhoneXXProgramSeries_\d*.poster_\d*
scm: regex 20140669.search.\S*.\S*

Default page:
soku_test_ab under track_info: regex [a-z]{1}
aaid under track_info: regex [0-9a-zA-Z]{32}
k under track_info: regex \S*
track_info: regex \S*
spm: regex a2h0c.8166619.PhoneXXOperate.clearbutton
scm: regex 20140669.search.searcharea.clearbutton
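
To illustrate how rules of this kind can be applied to one log record, here is a minimal Python sketch using the exposure rules for the fields inside track_info. Whether the regex platform matches the whole value or only part of it is not stated in this post, so full-string matching is an assumption here; the sample record and the function name `failed_fields` are made up.

```python
import json
import re

# Patterns taken from the exposure rules; full-string matching is an assumption.
RULES = {
    "soku_test_ab": r"[a-z]{1}",
    "engine": r"\S*",
    "item_log": r"\S*",
    "k": r"\S*",
    "aaid": r"[0-9a-z]{32}",
    "source_from": r"(home|discover|vip)",
}


def failed_fields(track_info_json):
    """Return the names of fields that are missing or do not match their rule."""
    data = json.loads(track_info_json)
    failed = []
    for field, pattern in RULES.items():
        value = data.get(field)
        if value is None or re.fullmatch(pattern, str(value)) is None:
            failed.append(field)
    return failed


# Made-up sample record, for illustration only.
sample = json.dumps({
    "soku_test_ab": "a",
    "engine": "e1",
    "item_log": "x",
    "k": "kw",
    "aaid": "0123456789abcdef0123456789abcdef",
    "source_from": "tv",
})
print(failed_fields(sample))  # ['source_from']
```

Note that a pattern such as \S* also matches an empty string, so on its own it only rejects values that contain whitespace; emptiness has to be checked separately.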

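  1. Progress: exposure tracking needed aaid and k added, progress 100%; exposure tracking added soku_test_ab, engine, item_log, 100% complete

Click tracking added the soku_test_ab, engine, item_log fields, 100% complete
85 cases in total, 128 fields

  1. Division of work:
    Exposure tracking
    Click tracking

  2. Test method:
    Before the first gray release of each iteration, run coverage tests for all click and exposure tracking
    In the regex platform, filter by business line, tick the cases to generate a test plan, run it manually, and generate a test report (passed, failed, and not-run cases)
    QA sends out the test report once testing is finished

  3. Platforms involved in tracking
    Tracking log platform: captures real-time tracking logs
    XX: location of the regex tracking cases
    FBI monitoring platform: follow the daily tracking monitoring report
    Tracking data monitoring platform
    OneData alerting platform: all click and exposure cases are hooked in, but regex-based alerts are not supported yet, pending work by RD

  4. Hooking tracking into regex cases
    Tracking monitoring and alerting

  5. Survey of tracking-related data tables

  • Table of all tracking events. Exposures are raw, unsplit logs, and this is an offline table. The real-time table lives in 特斯拉 (Tesla) and is risky to query, with over a hundred billion rows per day
  • Tracking log table with a 15-minute delay
  • Tracking hourly table
  • Underlying T+1 log table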

 

 

select alldata.allcnt, faildata.failcnt, round(faildata.failcnt*100.0/alldata.allcnt,4) as failratio 

from(

    ( select count(*) as failcnt,'trackInfo' as question 

     from ( 

            SELECT a.* ,

            get_json_object(a.track_info,'$.show_q') as show_q ,

            get_json_object(a.track_info,'$.search_q') as search_q

             from

            (

                select *

                FROM xx.xx 

                WHERE ds=20191216 and site='xx' and (device='android'   ) 

                and (original_spm='a2h0c.8166622.home.default' or 

                original_spm='a2h0c.8166619.xx.default') 

                and app_version>'8.0.0'

            )a  

        )b

        WHERE 

            b.req_id is null or trim(b.req_id)=''    

            or b.show_q is null or trim(b.show_q)=''  

            or b.search_q is null or trim(b.search_q)=''  

            or b.recext is NULL or trim(b.recext)=''     

            or b.aaid is NULL  or trim(b.aaid)=''    

            or b.alginfo is null or trim(b.alginfo)=''  

    )faildata

    LEFT JOIN (    

        select count(*) as allcnt, 'trackInfo' as question

        FROM xx.xx 

        WHERE ds=20191210 and site='xx' and (device='android'  )

        and (original_spm='a2h0c.8166622.home.default' or 

                original_spm='a2h0c.8166619.xx.default' )

    )alldata

    on alldata.question=faildata.question

);

 

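To compute the ratio between a conditional count and an unconditional count, switch to sum(case when <condition> then 1 else 0 end) and count(1), then divide the two columns.

That is:

SELECT
    SUM(CASE WHEN parent_id = 0 THEN 1 ELSE 0 END) AS matched_cnt,
    COUNT(1) AS total_cnt,
    SUM(CASE WHEN parent_id = 0 THEN 1 ELSE 0 END) / COUNT(1) AS matched_ratio
FROM table_name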
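
The conditional-aggregation pattern can be tried locally without ODPS. The sketch below runs it in SQLite through Python's `sqlite3` module; SQLite is only a stand-in dialect here, and the table `logs` and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (aaid TEXT)")
conn.executemany(
    "INSERT INTO logs (aaid) VALUES (?)",
    [("a1",), (None,), ("",), ("a4",)],
)

# One pass over the table: conditional count, total count, and their ratio.
fail_cnt, all_cnt, fail_ratio = conn.execute(
    """
    SELECT
        SUM(CASE WHEN aaid IS NULL OR TRIM(aaid) = '' THEN 1 ELSE 0 END),
        COUNT(1),
        ROUND(SUM(CASE WHEN aaid IS NULL OR TRIM(aaid) = '' THEN 1 ELSE 0 END) * 1.0 / COUNT(1), 4)
    FROM logs
    """
).fetchone()
print(fail_cnt, all_cnt, fail_ratio)  # 2 4 0.5
```

Compared with joining a failure-count subquery to a total-count subquery, this reads the table once and keeps both counts under the same filter.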

Revision of 2019-12-18

select spm, count(distinct case when aaid is null then aaid else null end) over () , count(k)  ,
group by spm

sum(expo) over (partition by a.dim) expo_all

Or:

a b c d

select
    *, a / total as rate
from (
    select count(if(error<>'-1',1,null)) as a
    , count(1) as total
    , count(if(error='a',1,null)) as a1
    -- equivalent: select sum(if(error<>'-1',1,0)), sum(1),
    from (
        select *,
        -- NULL must be tested with "is null"; "= null" is never true
        case when a is null then 'a'
            when b is null then 'b'
            else '-1'
        end as error
        from table_name
    ) t
) r
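
The labelling idea in this query (tag each row with the first field that is missing, then count per label) can be checked on a few rows. The following is a sketch in SQLite via Python's `sqlite3`, not ODPS; the table `t` and its rows are invented for the example. It also shows why NULL has to be tested with IS NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [("x", "y"), (None, "y"), ("x", None), ("x", "y")],
)

row = conn.execute(
    """
    SELECT
        SUM(CASE WHEN error <> '-1' THEN 1 ELSE 0 END) AS failed,
        COUNT(1) AS total,
        SUM(CASE WHEN error = 'a' THEN 1 ELSE 0 END) AS failed_a,
        SUM(CASE WHEN error = 'b' THEN 1 ELSE 0 END) AS failed_b
    FROM (
        SELECT CASE WHEN a IS NULL THEN 'a'
                    WHEN b IS NULL THEN 'b'
                    ELSE '-1'
               END AS error
        FROM t
    ) labelled
    """
).fetchone()
print(row)  # (2, 4, 1, 1)

# A comparison written as "= NULL" is never true, so it flags no rows at all.
never = conn.execute("SELECT COUNT(1) FROM t WHERE a = NULL").fetchone()[0]
print(never)  # 0
```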

 

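The SQL before optimization:

Design:

MySQL also supports putting conditions inside aggregates, but normally only data analysts do that, or it is used for BI statistics. Write everyday SQL this way and the DBA will be after you, because it burns a lot of MySQL CPU and is generally slow to compute. If an online service runs this kind of SQL, an API that returns within a few seconds already counts as fast, and a handful of concurrent requests is enough to drag the database down.

Business: the SQL computes (rows where A or B or C or D is empty) / total_log and feeds that into monitoring and alerting. When an alert fires, there is no way to tell which field went wrong. It could be A, B, or C, because what is computed is the combined error rate. You then have to paste the SQL into ODPS and modify it field by field to find out which one is empty.

The SQL after optimization:

Compute the error rate separately for A empty, B empty, C empty, and D empty. Because the monitor can only watch a single column, the per-field rates are summed into one output value that is fed into the monitoring system.

This validates 10 fields in one scenario, with counts and ratios. If a field is missing, troubleshooting only requires pasting the monitoring SQL into ODPS and removing the outer sum to see the failure rate of A, B, C, and D individually, which pinpoints the empty field.

There is no need to configure multiple rules or to split by client platform. This cuts redundant copy-and-paste and rule entry, keeps things lean, and makes failures obvious.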
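
The "per-field rate plus one summed value" design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field list is a subset of the monitored fields, the sample records are made up, the function names are hypothetical, and the input is assumed to be non-empty.

```python
import json

FIELDS = ["soku_test_ab", "engine", "aaid"]  # subset of the monitored fields, for brevity


def is_blank(value):
    """Treat missing values and whitespace-only strings as empty."""
    return value is None or str(value).strip() == ""


def field_fail_rates(track_infos):
    """Return ({field: fail_rate}, total_failrate) for a non-empty list of JSON strings."""
    total = len(track_infos)
    fails = {field: 0 for field in FIELDS}
    for raw in track_infos:
        data = json.loads(raw) if raw else {}
        for field in FIELDS:
            if is_blank(data.get(field)):
                fails[field] += 1
    rates = {field: round(fails[field] / total, 4) for field in FIELDS}
    return rates, round(sum(rates.values()), 4)


# Made-up records: one clean, one with empty engine, two with missing or blank aaid.
records = [
    '{"soku_test_ab": "a", "engine": "e", "aaid": "x"}',
    '{"soku_test_ab": "a", "engine": "", "aaid": "x"}',
    '{"soku_test_ab": "a", "engine": "e"}',
    '{"soku_test_ab": "a", "engine": "e", "aaid": " "}',
]
rates, total_failrate = field_fail_rates(records)
print(rates)           # {'soku_test_ab': 0.0, 'engine': 0.25, 'aaid': 0.5}
print(total_failrate)  # 0.75
```

The single summed value is what an alert would watch; when it fires, the per-field dictionary shows which field is responsible.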

 

--odps sql

--********************************************************************--

--author:东方

--create time:2019-12-24 16:13:23

--********************************************************************--

-- The "as" keywords in the SQL below are optional

-- // General requirement 柏拉图 (Plato): validation of keys that must exist

-- public static String[] mustExistKeyOuter = {"spm", "scm"};

-- public static String[] mustExistKeyInner = {"soku_test_ab", "engine", "item_log", "aaid", "k", "source_from", "search_from"};

 

-- set odps.sql.type.system.odps2=true;

-- select date_format(CURRENT_TIMESTAMP(),"%Y%M%D") from dual;

 

select

-- device, app_version, total_failrate below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- device,app_version,

round(spm_rate +scm_rate +track_info_rate+ soku_test_ab_rate+ engine_rate+ item_log_rate+ aaid_rate+ k_rate+ source_from_rate+ search_from_rate ,4)

as total_failrate FROM (select

 

sum( case when spm is null or trim(spm)='' then 1 else 0 end) as spm_null, sum(1) as log_total,

round(sum( case when spm is null or trim(spm)='' then 1 else 0 end) /sum(1) ,4) as spm_rate,

 

sum( case when scm is null or trim(scm)='' then 1 else 0 end) as scm_null,

round(sum( case when scm is null or trim(scm)='' then 1 else 0 end) /sum(1),4) as scm_rate,

 

sum( case when track_info is null or trim(track_info)='' then 1 else 0 end) as track_info_null,

round(sum( case when track_info is null or trim(track_info)='' then 1 else 0 end) /sum(1),4) as track_info_rate,

 

sum(case when get_json_object(track_info,'$.soku_test_ab') is null or trim(get_json_object(track_info,'$.soku_test_ab'))='' then 1 else 0 end) as soku_test_ab_null,

round(sum( case when get_json_object(track_info,'$.soku_test_ab') is null or trim(get_json_object(track_info,'$.soku_test_ab'))='' then 1 else 0 end) /sum(1),4) as soku_test_ab_rate,

 

sum( case when get_json_object(track_info,'$.engine') is null or trim(get_json_object(track_info,'$.engine'))='' then 1 else 0 end) as engine_null,

round(sum( case when get_json_object(track_info,'$.engine') is null or trim(get_json_object(track_info,'$.engine'))='' then 1 else 0 end) /sum(1),4) as engine_rate,

 

sum( case when get_json_object(track_info,'$.item_log') is null or trim(get_json_object(track_info,'$.item_log'))='' then 1 else 0 end) as item_log_null,

round(sum( case when get_json_object(track_info,'$.item_log') is null or trim(get_json_object(track_info,'$.item_log'))='' then 1 else 0 end) /sum(1),4) as item_log_rate,

sum( case when get_json_object(track_info,'$.aaid') is null or trim(get_json_object(track_info,'$.aaid'))='' then 1 else 0 end) as aaid_null,

round(sum( case when get_json_object(track_info,'$.aaid') is null or trim(get_json_object(track_info,'$.aaid'))='' then 1 else 0 end) /sum(1),4) as aaid_rate,

sum( case when get_json_object(track_info,'$.k') is null or trim(get_json_object(track_info,'$.k'))='' then 1 else 0 end) as k_null,

round(sum( case when get_json_object(track_info,'$.k') is null or trim(get_json_object(track_info,'$.k'))='' then 1 else 0 end) /sum(1),4) as k_rate,

sum( case when get_json_object(track_info,'$.source_from') is null or trim(get_json_object(track_info,'$.source_from'))='' then 1 else 0 end) as source_from_null,

round(sum( case when get_json_object(track_info,'$.source_from') is null or trim(get_json_object(track_info,'$.source_from'))='' then 1 else 0 end) /sum(1),4) as source_from_rate,

sum( case when get_json_object(track_info,'$.search_from') is null or trim(get_json_object(track_info,'$.search_from'))='' then 1 else 0 end) as search_from_null,

round(sum( case when get_json_object(track_info,'$.search_from') is null or trim(get_json_object(track_info,'$.search_from'))='' then 1 else 0 end) /sum(1),4) as search_from_rate

-- device, app_version below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- ,device,app_version

from database.table_name

WHERE ds=to_char(dateadd(GETDATE(),-1,'dd'),"yyyymmdd")

-- The "and hh in(18,19)" below is for local debugging only; comment it out when hooking into dqc so that the full 24 hours of online data is monitored

and hh in(18,19)

-- WHERE ds=replace(split_part(dateadd(from_unixtime(unix_timestamp()),-1,'dd')," ",1),"-","")

and site='xx' and (device='android' or device='iphone' )

and app_version >= '8.3.0'

-- and (spm like '%a2h0c.8166622.rdirect%' OR spm like '%a2h0c.8166622.rmovie%')

-- and (original_spm like '%a2h0c.8166622.PhoneSokuTab%' OR original_spm like '%a2h0c.8166622.PhoneSokuCast%' OR original_spm like '%a2h0c.8166622.PhoneSokuPromote%')

 

and (original_spm like '%a2h0c.8166622.phonesokutab%' OR original_spm like '%a2h0c.8166622.phonesokucast%' OR original_spm like '%a2h0c.8166622.phonesokupromote%')

-- and (original_scm like '%20140669.search.filter%' OR original_scm like '%20140669.search.person%' OR original_scm like '%20140669.search.circle%')

-- and spm like '%a2h0c.8166622.rdirect%' and (original_spm REGEXP (.*a2h0c\.8166622\.(channeltab|portrait).*)

-- GROUP BY device,app_version below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- GROUP BY device,app_version

)

;

 

Result: failure-rate statistics for each field

 

--odps sql

--********************************************************************--

--author:姝昕

--create time:2019-12-24 16:13:23

--********************************************************************--

-- The "as" keywords in the SQL below are optional

-- // General requirement 柏拉图 (Plato): validation of keys that must exist

-- public static String[] mustExistKeyOuter = {"spm", "scm"};

-- public static String[] mustExistKeyInner = {"soku_test_ab", "engine", "item_log", "aaid", "k", "source_from", "search_from"};

 

-- set odps.sql.type.system.odps2=true;

-- select date_format(CURRENT_TIMESTAMP(),"%Y%M%D") from dual;

 

select

-- device, app_version, total_failrate below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- device,app_version,

round(spm_rate +scm_rate +track_info_rate+ soku_test_ab_rate+ engine_rate+ item_log_rate+ aaid_rate+ k_rate+ source_from_rate+ search_from_rate ,4)

as total_failrate FROM (select

 

sum( case when spm is null or trim(spm)='' then 1 else 0 end) as spm_null, sum(1) as log_total,

round(sum( case when spm is null or trim(spm)='' then 1 else 0 end) /sum(1) ,4) as spm_rate,

 

sum( case when scm is null or trim(scm)='' then 1 else 0 end) as scm_null,

round(sum( case when scm is null or trim(scm)='' then 1 else 0 end) /sum(1),4) as scm_rate,

 

sum( case when track_info is null or trim(track_info)='' then 1 else 0 end) as track_info_null,

round(sum( case when track_info is null or trim(track_info)='' then 1 else 0 end) /sum(1),4) as track_info_rate,

 

sum(case when get_json_object(track_info,'$.soku_test_ab') is null or trim(get_json_object(track_info,'$.soku_test_ab'))='' then 1 else 0 end) as soku_test_ab_null,

round(sum( case when get_json_object(track_info,'$.soku_test_ab') is null or trim(get_json_object(track_info,'$.soku_test_ab'))='' then 1 else 0 end) /sum(1),4) as soku_test_ab_rate,

 

sum( case when get_json_object(track_info,'$.engine') is null or trim(get_json_object(track_info,'$.engine'))='' then 1 else 0 end) as engine_null,

round(sum( case when get_json_object(track_info,'$.engine') is null or trim(get_json_object(track_info,'$.engine'))='' then 1 else 0 end) /sum(1),4) as engine_rate,

 

sum( case when get_json_object(track_info,'$.item_log') is null or trim(get_json_object(track_info,'$.item_log'))='' then 1 else 0 end) as item_log_null,

round(sum( case when get_json_object(track_info,'$.item_log') is null or trim(get_json_object(track_info,'$.item_log'))='' then 1 else 0 end) /sum(1),4) as item_log_rate,

sum( case when get_json_object(track_info,'$.aaid') is null or trim(get_json_object(track_info,'$.aaid'))='' then 1 else 0 end) as aaid_null,

round(sum( case when get_json_object(track_info,'$.aaid') is null or trim(get_json_object(track_info,'$.aaid'))='' then 1 else 0 end) /sum(1),4) as aaid_rate,

sum( case when get_json_object(track_info,'$.k') is null or trim(get_json_object(track_info,'$.k'))='' then 1 else 0 end) as k_null,

round(sum( case when get_json_object(track_info,'$.k') is null or trim(get_json_object(track_info,'$.k'))='' then 1 else 0 end) /sum(1),4) as k_rate,

sum( case when get_json_object(track_info,'$.source_from') is null or trim(get_json_object(track_info,'$.source_from'))='' then 1 else 0 end) as source_from_null,

round(sum( case when get_json_object(track_info,'$.source_from') is null or trim(get_json_object(track_info,'$.source_from'))='' then 1 else 0 end) /sum(1),4) as source_from_rate,

sum( case when get_json_object(track_info,'$.search_from') is null or trim(get_json_object(track_info,'$.search_from'))='' then 1 else 0 end) as search_from_null,

round(sum( case when get_json_object(track_info,'$.search_from') is null or trim(get_json_object(track_info,'$.search_from'))='' then 1 else 0 end) /sum(1),4) as search_from_rate

-- device, app_version below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- ,device,app_version

from ytalgo_common.dwd_soku_wlapp_clk_h

WHERE ds=to_char(dateadd(GETDATE(),-1,'dd'),"yyyymmdd")

-- The "and hh in(18,19)" below is for local debugging only; comment it out when hooking into dqc so that the full 24 hours of online data is monitored

and hh in(18,19)

-- WHERE ds=replace(split_part(dateadd(from_unixtime(unix_timestamp()),-1,'dd')," ",1),"-","")

and site='youku' and (device='android' or device='iphone' )

and app_version >= '8.3.0'

-- and (spm like '%a2h0c.8166622.rdirect%' OR spm like '%a2h0c.8166622.rmovie%')

-- and (original_spm like '%a2h0c.8166622.PhoneSokuTab%' OR original_spm like '%a2h0c.8166622.PhoneSokuCast%' OR original_spm like '%a2h0c.8166622.PhoneSokuPromote%')

 

and (original_spm like '%a2h0c.8166622.phonesokutab%' OR original_spm like '%a2h0c.8166622.phonesokucast%' OR original_spm like '%a2h0c.8166622.phonesokupromote%')

-- and (original_scm like '%20140669.search.filter%' OR original_scm like '%20140669.search.person%' OR original_scm like '%20140669.search.circle%')

-- and spm like '%a2h0c.8166622.rdirect%' and (original_spm REGEXP (.*a2h0c\.8166622\.(channeltab|portrait).*)

-- GROUP BY device,app_version below: uncomment when running locally to inspect detailed data; comment out when hooking into dqc

-- GROUP BY device,app_version

)

;

 

Result: the summed failure rate across the 10 fields
