How to calculate prevalence using SQL code

伪装坚强ぢ

I am trying to calculate prevalence in SQL and am stuck writing the code. I want the calculation to be automated.

I have checked that I have a sample size of 1453477 and numbe

3 answers

情歌与酒

2021-01-28 04:47
I am pretty sure that the logic that you want is something like this:
```
select avg( (condition_id = 12345)::int )
from disease;
```
Your version doesn't have the sample size, because you are filtering out people without the condition.
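
To see what the `avg` over a boolean cast does, here is a minimal sketch for PostgreSQL. The four rows in the `values` list are made-up toy data, not the asker's table; only the table and column names (`disease`, `person_id`, `condition_id`) come from the query above.

```sql
-- Hypothetical toy data: four rows, one of which has condition 12345.
with disease(person_id, condition_id) as (
    values (1, 12345), (2, 67890), (3, 67890), (4, 11111)
)
-- The comparison yields true/false, the cast turns it into 1/0,
-- and avg() divides the number of matches by the number of rows.
select avg((condition_id = 12345)::int) as prevalence
from disease;
-- 1 of 4 rows matches, so prevalence is 0.25
```

Because every row stays in the average, the denominator is the full sample size rather than only the rows with the condition.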

If you have duplicate people in the data, then this is a little more complicated. One method is:
```
-- distinct people with the condition, cast to numeric to avoid integer division
select (count(distinct person_id) filter (where condition_id = 12345)::numeric /
        count(distinct person_id)
       ) as prevalence
from disease;
```
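
A small sketch of why duplicates matter, again on made-up toy data (person 1 appears twice with the same condition). It assumes PostgreSQL 9.4 or later, since that is where the aggregate `filter` clause is available.

```sql
-- Hypothetical toy data: person 1 is recorded twice with condition 12345.
with disease(person_id, condition_id) as (
    values (1, 12345), (1, 12345), (2, 67890), (3, 67890), (4, 11111)
)
select
    -- row-based: 2 matching rows out of 5 rows = 0.4
    avg((condition_id = 12345)::int) as row_based,
    -- person-based: 1 distinct person with the condition out of 4 people = 0.25
    count(distinct person_id) filter (where condition_id = 12345)::numeric
        / count(distinct person_id) as person_based
from disease;
```

The row-based figure overstates prevalence here because the duplicated person is counted twice in the numerator.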
囚心锁ツ

2021-01-28 05:08
In your current query you count the number of rows in the disease table, once using the column condition_id, once using the column person_id. But the number of rows is the same - this is why you get 1 as a result.
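
As an illustration only: the original query is not shown here, so the statement below is an assumed reconstruction of its shape, run against made-up toy data. It shows that `count(column)` counts non-null rows, so both counts are equal once the `where` clause has filtered the table.

```sql
-- Hypothetical toy data; every row that survives the filter has both columns set.
with disease(person_id, condition_id) as (
    values (1, 12345), (2, 12345), (2, 12345), (3, 67890)
)
select count(condition_id) as n_condition,   -- 3 rows remain after the filter
       count(person_id)    as n_person,      -- the same 3 rows
       count(condition_id) / count(person_id) as ratio  -- 3 / 3 = 1
from disease
where condition_id = 12345;
```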

I think you need to find the number of distinct values in these columns. This can be done using `COUNT(DISTINCT ...)`:
```
select (COUNT(DISTINCT condition_id)/COUNT(DISTINCT person_id)) as prevalence
from disease
where condition_id=12345;
```
无人共我

2021-01-28 05:12

You can cast with either

`count(...)/count(...)::numeric(6,4)` or

`count(...)/count(...)::decimal`

as two options.

The important point is to apply the cast to the numerator or the denominator (in this case the denominator). Do not apply it to the whole division, as in

`(count(...)/count(...))::numeric(6,4)`, because the integer division is evaluated first and the cast then receives a value that is already an integer.
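
A quick way to check the effect of where the cast goes, sketched for PostgreSQL. The integer literals 1 and 4 stand in for the two `count(...)` results; they are placeholders, not values from the question.

```sql
select 1 / 4                 as int_division,  -- 0: both operands are integers
       1 / 4::numeric        as cast_operand,  -- 0.25: the cast binds to 4 before dividing
       (1 / 4)::numeric(6,4) as cast_result;   -- 0.0000: division already truncated to 0
```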
