If I have a table like this:
pkey age
---- ---
1 8
2 5
3 12
4 12
5 22
I can \"group by\" to get a count
add an age_range table and an age_range_id field to your table and group by that instead.
// excuse the DDL but you should get the idea
create table age_range(
age_range_id tinyint unsigned not null primary key,
name varchar(255) not null);
insert into age_range values
(1, '18-24'),(2, '25-34'),(3, '35-44'),(4, '45-54'),(5, '55-64');
// again excuse the DML but you should get the idea
select
count(*) as counter, p.age_range_id, ar.name
from
person p
inner join age_range ar on p.age_range_id = ar.age_range_id
group by
p.age_range_id, ar.name order by counter desc;
You can refine this idea if you like - add from_age to_age columns in the age_range table etc - but i'll leave that to you.
hope this helps :)
My approach:
select range, count(1) from (
select case
when age < 5 then '0-4'
when age < 10 then '5-9'
when age < 15 then '10-14'
when age < 20 then '15-20'
when age < 30 then '21-30'
when age < 40 then '31-40'
when age < 50 then '41-50'
else '51+'
end
as range from
(select round(extract(day from feedback_update_time - feedback_time), 1) as age
from txn_history
) ) group by range
What you are looking for, is basically the data for a histogram.
You would have the age (or age-range) on the x-axis and the count n (or frequency) on the y-axis.
In the simplest form, one could simply count the number of each distinct age value like you already described:
SELECT age, count(*)
FROM tbl
GROUP BY age
When there are too many different values for the x-axis however, one may want to create groups (or clusters or buckets). In your case, you group by a constant range of 10.
We can avoid writing a WHEN ... THEN
line for each range - there could be hundreds if it were not about age. Instead, the approach by @MatthewFlaschen is preferable for the reasons mentioned by @NitinMidha.
Now let's build the SQL...
First, we need to split the ages into range-groups of 10 like so:
This can be achieved by dividing the age column by 10 and then calculating the result's FLOOR:
FLOOR(age/10)
"FLOOR returns the largest integer equal to or less than n" http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions067.htm#SQLRF00643
Then we take the original SQL and replace age with that expression:
SELECT FLOOR(age/10), count(*)
FROM tbl
GROUP BY FLOOR(age/10)
This is OK, but we cannot see the range, yet. Instead we only see the calculated floor values which are 0, 1, 2 ... n
.
To get the actual lower bound, we need to multiply it with 10 again so we get 0, 10, 20 ... n
:
FLOOR(age/10) * 10
We also need the upper bound of each range which is lower bound + 10 - 1 or
FLOOR(age/10) * 10 + 10 - 1
Finally, we concatenate both into a string like this:
TO_CHAR(FLOOR(age/10) * 10) || '-' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1)
This creates '0-9', '10-19', '20-29'
etc.
Now our SQL looks like this:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1),
COUNT(*)
FROM tbl
GROUP BY FLOOR(age/10)
Finally, apply an order and nice column aliases:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1) AS range,
COUNT(*) AS frequency
FROM tbl
GROUP BY FLOOR(age/10)
ORDER BY FLOOR(age/10)
However, in more complex scenarios, these ranges might not be grouped into constant chunks of size 10, but need dynamical clustering. Oracle has more advanced histogram functions included, see http://docs.oracle.com/cd/E16655_01/server.121/e15858/tgsql_histo.htm#TGSQL366
Credits to @MatthewFlaschen for his approach; I only explained the details.
I had to get a count of samples by day. Inspired by @Clarkey I used TO_CHAR to extract the date of sample from the timestamp to an ISO-8601 date format and used that in the GROUP BY and ORDER BY clauses. (Further inspired, I also post it here in case it is useful to others.)
SELECT
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') AS TS_DAY,
COUNT(*)
FROM
TABLE X
GROUP BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD')
ORDER BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') ASC
/
Here is a solution which creates a "range" table in a sub-query and then uses this to partition the data from the main table:
SELECT DISTINCT descr
, COUNT(*) OVER (PARTITION BY descr) n
FROM age_table INNER JOIN (
select '1-10' descr, 1 rng_start, 10 rng_stop from dual
union (
select '11-20', 11, 20 from dual
) union (
select '20+', 21, null from dual
)) ON age BETWEEN nvl(rng_start, age) AND nvl(rng_stop, age)
ORDER BY descr;
Try:
select to_char(floor(age/10) * 10) || '-'
|| to_char(ceil(age/10) * 10 - 1)) as age,
count(*) as n from tbl group by floor(age/10);