How to calculate prevalence using SQL code

伪装坚强ぢ 2021-01-28 04:36

I am trying to calculate prevalence in SQL and am stuck writing the code. I want the calculation to be automated.

I have checked that I have a sample size of 1453477 and numbe

3 answers
  • 2021-01-28 04:47

    I am pretty sure that the logic that you want is something like this:

    select avg( (condition_id = 12345)::int )
    from disease;
    

    Your version doesn't have the sample size, because you are filtering out people without the condition.
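
    A minimal sketch of how this averaging works, assuming PostgreSQL. The four rows are made up for illustration, and the CTE named disease only stands in for the real table. The comparison yields a boolean per row, the cast turns it into 0 or 1, and the average of those values is the share of rows with the condition:

    ```sql
    -- Hypothetical sample data: four people, one of whom has condition 12345.
    with disease(person_id, condition_id) as (
        values (1, 12345), (2, 67890), (3, 67890), (4, 11111)
    )
    select
        count(*)                                 as sample_size,  -- 4: no WHERE clause, so every row counts
        sum((condition_id = 12345)::int)         as cases,        -- 1: rows where the comparison is true
        avg((condition_id = 12345)::int)         as prevalence    -- 1 of 4 rows, i.e. 0.25
    from disease;
    ```

    Adding where condition_id = 12345 to this query would leave only the matching row, so the average would become 1 and the sample size would be lost.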

    If you have duplicate people in the data, then this is a little more complicated. One method is:

    select (count(distinct person_id) filter (where condition_id = 12345)::numeric /
            count(distinct person_id)
           )
    from disease;
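
    To see the effect of duplicates, here is a small sketch, again assuming PostgreSQL and invented rows in which person 1 appears twice with the condition and person 3 appears twice without it. Counting distinct person_id values keeps each person from being counted more than once:

    ```sql
    -- Hypothetical sample data with repeated people (six rows, four distinct people).
    with disease(person_id, condition_id) as (
        values (1, 12345), (1, 12345), (2, 67890),
               (3, 67890), (3, 11111), (4, 11111)
    )
    select
        count(distinct person_id) filter (where condition_id = 12345) as cases,   -- 1: only person 1
        count(distinct person_id)                                     as people,  -- 4 distinct people
        (count(distinct person_id) filter (where condition_id = 12345))::numeric
            / count(distinct person_id)                               as prevalence  -- 1/4 = 0.25
    from disease;
    ```

    A plain row-based average over the same data would give 2 of 6 rows instead, because the duplicate rows for person 1 would each count.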
    
  • 2021-01-28 05:08

    In your current query you count the number of rows in the disease table, once using the column condition_id, once using the column person_id. But the number of rows is the same - this is why you get 1 as a result.
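
    The effect can be reproduced with a small sketch (PostgreSQL assumed; the rows are invented, and the query only has the general shape described above, not necessarily the exact one from the question). count(column) counts the rows where that column is not null, so both counts are equal whenever neither column contains nulls:

    ```sql
    -- Hypothetical sample data: four rows, no nulls in either column.
    with disease(person_id, condition_id) as (
        values (1, 12345), (1, 12345), (2, 67890), (3, 12345)
    )
    select
        count(condition_id)                    as n_condition,  -- 4 non-null values
        count(person_id)                       as n_person,     -- 4 non-null values
        count(condition_id) / count(person_id) as ratio         -- 4 / 4 = 1
    from disease;
    ```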

    I think you need to find the number of different values for these columns. This can be done using count distinct:

    select (COUNT(DISTINCT condition_id)/COUNT(DISTINCT person_id)) as prevalence
    from disease
    where condition_id=12345;
    
  • 2021-01-28 05:12

    You can cast using either of two options:

    count(...)/count(...)::numeric(6,4) or

    count(...)/count(...)::decimal

    The important point is to apply the cast to the numerator or the denominator (in this case the denominator). Do not apply it to the result of the division, as in

    (count(...)/count(...))::numeric(6,4), which performs the integer division first and so still yields an integer value.
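
    The difference in cast placement can be checked with a short sketch, assuming PostgreSQL and made-up rows. It uses an unconstrained numeric rather than numeric(6,4), since numeric(6,4) only holds values below 100 and a large count would not fit into it:

    ```sql
    -- Hypothetical sample data: four people, one of whom has condition 12345.
    with disease(person_id, condition_id) as (
        values (1, 12345), (2, 67890), (3, 67890), (4, 11111)
    )
    select
        -- bigint / bigint: integer division, 1 / 4 truncates to 0
        count(*) filter (where condition_id = 12345) / count(*)            as int_division,
        -- cast applied after the division: the value is already 0
        (count(*) filter (where condition_id = 12345) / count(*))::numeric as cast_too_late,
        -- cast applied to the denominator first: numeric division, 0.25
        count(*) filter (where condition_id = 12345) / count(*)::numeric   as cast_first
    from disease;
    ```

    The :: operator binds more tightly than /, which is why count(*)::numeric casts only the denominator and not the quotient.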
