This question already has an answer here:
I am trying to find the CHI-SQUARE TEST on the following set of data in the table. I am trying my this Query to find the CHI-SQUARE TEST:
SELECT sessionnumber, sessioncount, timespent,
(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected
FROM (SELECT sessionnumber, SUM(cast(cnt as bigint)) as cnt
FROM d3
GROUP BY sessionnumber) dim1 CROSS JOIN
(SELECT sessioncount, SUM(cast(cnt as bigint)) as cnt
FROM d3
GROUP BY sessioncount) dim2 CROSS JOIN
(SELECT timespent, SUM(cast(cnt as bigint)) as cnt
FROM d3
GROUP BY timespent) dim3 CROSS JOIN
(SELECT SUM(cast(cnt as bigint)) as cnt FROM d3) dimall
The sample data is:
sessionnumber sessioncount timespent cnt
1 17 28 45
2 22 8 30
3 1 1 2
4 1 1 2
5 8 111 119
6 8 65 73
7 11 5 16
8 1 1 2
9 62 64 126
10 6 42 48
But it gives me wrong output for the chi-square test values the output it gives is:
sessionnumber sessioncount timespent expected
1 23 1 0
2 23 1 0
3 23 1 0
4 23 1 0
5 23 1 0
6 23 1 0
7 23 1 0
8 23 1 0
9 23 1 0
10 23 1 0
I have tried my best and searched alot about this problem. Please do me a favor and kindly fix the problem! Thanks in advance!
integer math, cast dimall.cnt to a decimal or numeric or do the following
/(dimall.cnt* 1.00)* (dimall.cnt * 1.00)
another example to explain what actually happens
select 3/2 -- output = 1, integer math, result is an integer
select 3/2.00 -- output = 1.50
Because you are already doing casts in your calculations, you might as well cast to float
instead of bigint
SELECT sessionnumber, sessioncount, timespent,
(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected
FROM (SELECT sessionnumber, SUM(cast(cnt as float)) as cnt
FROM d3
GROUP BY sessionnumber) dim1 CROSS JOIN
(SELECT sessioncount, SUM(cast(cnt as float)) as cnt
FROM d3
GROUP BY sessioncount) dim2 CROSS JOIN
(SELECT timespent, SUM(cast(cnt as float)) as cnt
FROM d3
GROUP BY timespent) dim3 CROSS JOIN
(SELECT SUM(cast(cnt as float)) as cnt FROM d3) dimall;
float
has something like 16 digits of precision, so it should be adequate for counting any reasonable number of objects in the known universe.
来源:https://stackoverflow.com/questions/18043311/sql-query-for-chi-square-test