Question
I would like to get a count of a specific string in a field, in this example the PAYROLL_PAYMONTH
field. For example, I want to count the number of rows containing 'HELLO' in the field and display that count per group.
-- DATA
EMP_SURNAME  PAYROLL_PAYYEAR  PAYROLL_PAYMONTH
X            1999             JAN
X            1999             JAN
X            1999             FEB
-- OUTPUT
EMP_SURNAME  PAYROLL_PAYYEAR  JAN  FEB  MAR
X            1999             2    1    0
To count the occurrences of the same string in a field and display them, I wrote a grouping select procedure in Firebird 3 using SQL Manager for Firebird:
CREATE PROCEDURE PAID_LISTING(
    SORT_PAYROLL_YEAR VARCHAR(50) CHARACTER SET ISO8859_1 COLLATE ISO8859_1)
RETURNS(
    EMP_SURNAME VARCHAR(50) CHARACTER SET ISO8859_1 COLLATE ISO8859_1,
    PAYROLL_PAYYEAR VARCHAR(50) CHARACTER SET ISO8859_1 COLLATE ISO8859_1,
    PAYROLL_MON_JAN VARCHAR(50) CHARACTER SET ISO8859_1 COLLATE ISO8859_1)
AS
BEGIN
  FOR
    SELECT
      B.EMP_SURNAME,
      A.PAYROLL_PAYYEAR,
      COUNT (A.PAYROLL_PAYMONTH)
    FROM PAYROLL A, EMP B
    WHERE A.EMP_PK = B.EMP_PK AND A.PAYROLL_YEAR =: SORT_PAYROLL_YEAR
    GROUP BY
      B.EMP_SURNAME,
      A.PAYROLL_PAYYEAR
    ORDER BY B.EMP_SURNAME ASC
    INTO
      :EMP_SURNAME,
      :PAYROLL_PAYYEAR,
      :PAYROLL_MON_JAN
  DO
  BEGIN
    SUSPEND;
  END
END;
but it does not give the result I want. What should I do next?
Answer 1:
What you want to get is called a "cross-table report": https://en.wikipedia.org/wiki/Crosstab
The normal way to generate one is split into two steps:
1. You make a usual 1-directional query in the database, with a fixed column count and with columns that are semantically different, not repeating one another. So years and months (JAN, FEB, etc.) go in different rows, not in adjacent columns.
2. Then you present the results of that 1D query as the desired 2-directional table by means of the client application. Whatever language and libraries you build your client application with, they should provide means to make cross-tables out of regular 1D queries.
The thing is, the database is a tool to keep and extract data, not to make it eye-candy. Your client application is the tool to render the data in ways that are easy and nice to look at. "Divide et impera": use every tool for the task it was created and optimized for. Forcing SQL servers into doing visual representation, while possible, would be a "quest for glory", both unnatural and relatively slow.
However, if you intend to implement it in pure SQL regardless of its inefficiency, you can use CTEs for it.
- When to use Common Table Expression (CTE)
- https://www.essentialsql.com/introduction-common-table-expressions-ctes/
- https://en.wikipedia.org/wiki/Common_table_expressions
Again, "divide et impera": split your complex task into smaller, simpler ones. I will work with the sample data you put in your question.
CREATE TABLE DATA (
EMP_SURNAME VARCHAR(10) NOT NULL,
PAYROLL_PAYYEAR SMALLINT NOT NULL,
PAYROLL_PAYMONTH CHAR(3) NOT NULL);
/*
EMP_SURNAME PAYROLL_PAYYEAR PAYROLL_PAYMONTH
X 1999 JAN
X 1999 FEB
X 1999 JAN
*/
You have to make three steps.
1. Fold the data: count the per-month rows. This is the usual GROUP BY query, and normally it would be the only one, as the cross-tabbing would be done by your application out of its results.
2. Make the "skeleton" list of the rows your result table will contain. Here that means all PERSON+YEAR pairs for which there is any data. This skips any year that has no data in any month.
3. Join those query results together and make them align horizontally, column by column, instead of the row-under-row structure that is normal for SQL.
Here we go.
Step 1:
select EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH, Count(*) as QTY
from DATA
group by EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH

EMP_SURNAME  PAYROLL_PAYYEAR  PAYROLL_PAYMONTH  QTY
X            1999             FEB               1
X            1999             JAN               2
Step 2:
select distinct EMP_SURNAME, PAYROLL_PAYYEAR from DATA
EMP_SURNAME PAYROLL_PAYYEAR
X 1999
Step 3:
with EMP_YEAR as ( select distinct EMP_SURNAME, PAYROLL_PAYYEAR from DATA )
,GROUPED as
(
  select EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH, Count(*) as QTY
  from DATA group by EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH
)
select EMP_YEAR.EMP_SURNAME, EMP_YEAR.PAYROLL_PAYYEAR
  ,coalesce( emp_jan.qty, 0) as JAN
  ,coalesce( emp_feb.qty, 0) as FEB
  ,coalesce( emp_mar.qty, 0) as MAR
from EMP_YEAR
left join GROUPED as EMP_JAN on
  EMP_YEAR.EMP_SURNAME = EMP_JAN.EMP_SURNAME and
  EMP_YEAR.PAYROLL_PAYYEAR = EMP_JAN.PAYROLL_PAYYEAR and
  EMP_JAN.PAYROLL_PAYMONTH = 'JAN'
left join GROUPED as EMP_FEB on
  EMP_YEAR.EMP_SURNAME = EMP_FEB.EMP_SURNAME and
  EMP_YEAR.PAYROLL_PAYYEAR = EMP_FEB.PAYROLL_PAYYEAR and
  EMP_FEB.PAYROLL_PAYMONTH = 'FEB'
left join GROUPED as EMP_MAR on
  EMP_YEAR.EMP_SURNAME = EMP_MAR.EMP_SURNAME and
  EMP_YEAR.PAYROLL_PAYYEAR = EMP_MAR.PAYROLL_PAYYEAR and
  EMP_MAR.PAYROLL_PAYMONTH = 'MAR'
...and here is what you wanted to get:
EMP_SURNAME  PAYROLL_PAYYEAR  JAN  FEB  MAR
X            1999             2    1    0
Now, this query is ugly, it is fragile (a lot of copy-paste, where you can easily make a mistake that is then hard to spot), and it is slow. Just look at the query plan for this request: you join the table with itself again and again, once for every column!
PLAN JOIN (JOIN (JOIN (SORT (EMP_YEAR DATA NATURAL), SORT (EMP_JAN DATA NATURAL)), SORT (EMP_FEB DATA NATURAL)), SORT (EMP_MAR DATA NATURAL))
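On the copy-paste fragility specifically: if you do keep the SQL version, one option is to generate the repeated joins from a list of months instead of typing them by hand. Below is a minimal sketch in Python, which is my choice for illustration only. The function name build_crosstab_sql and the MONTHS list are hypothetical, and the in-memory SQLite database is just a stand-in to check the generated text, not Firebird. It does nothing for the speed: the generated query still joins GROUPED once per column.

```python
import sqlite3

# Trusted, hard-coded month codes; they are pasted into the SQL text as-is.
# Note: some codes (e.g. DEC) may clash with reserved words in a given SQL
# dialect and would then need quoting or a different alias.
MONTHS = ["JAN", "FEB", "MAR"]


def build_crosstab_sql(months):
    """Build the CTE crosstab query text with one LEFT JOIN per month."""
    columns = "".join(
        f"\n     , coalesce(EMP_{m}.QTY, 0) as {m}" for m in months
    )
    joins = "".join(
        f"\n  left join GROUPED as EMP_{m} on"
        f"\n    EMP_YEAR.EMP_SURNAME = EMP_{m}.EMP_SURNAME and"
        f"\n    EMP_YEAR.PAYROLL_PAYYEAR = EMP_{m}.PAYROLL_PAYYEAR and"
        f"\n    EMP_{m}.PAYROLL_PAYMONTH = '{m}'"
        for m in months
    )
    return (
        "with EMP_YEAR as (select distinct EMP_SURNAME, PAYROLL_PAYYEAR from DATA)\n"
        ", GROUPED as (\n"
        "  select EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH, count(*) as QTY\n"
        "  from DATA group by EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH)\n"
        "select EMP_YEAR.EMP_SURNAME, EMP_YEAR.PAYROLL_PAYYEAR"
        f"{columns}\n"
        f"from EMP_YEAR{joins}"
    )


sql = build_crosstab_sql(MONTHS)

# Sanity check against an in-memory SQLite database (a stand-in, not Firebird),
# loaded with the three sample rows from the question.
con = sqlite3.connect(":memory:")
con.execute(
    "create table DATA (EMP_SURNAME varchar(10) not null,"
    " PAYROLL_PAYYEAR smallint not null, PAYROLL_PAYMONTH char(3) not null)"
)
con.executemany(
    "insert into DATA values (?, ?, ?)",
    [("X", 1999, "JAN"), ("X", 1999, "FEB"), ("X", 1999, "JAN")],
)
result = con.execute(sql).fetchall()
print(sql)
print(result)
```

With the sample data this should give back the single row X, 1999, 2, 1, 0, the same as the hand-written query.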
So... that is how you can do it on the SQL server, but think again and try to distribute the tasks between the proper tools: run only the group-by query #1 on the server, and have your client application regroup its result into the cross-table report.
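The regrouping on the client side can be quite small. Here is a minimal sketch in Python (the question does not say what the client is written in, so the language is my assumption), with the rows hard-coded in the shape that the group-by query #1 returns. The names grouped_rows, MONTHS and crosstab are hypothetical; in a real application the rows would come from your database driver.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the result of the Step 1 GROUP BY query:
# (EMP_SURNAME, PAYROLL_PAYYEAR, PAYROLL_PAYMONTH, QTY)
grouped_rows = [
    ("X", 1999, "FEB", 1),
    ("X", 1999, "JAN", 2),
]

MONTHS = ["JAN", "FEB", "MAR"]  # output columns, in display order


def crosstab(rows, months=MONTHS):
    """Pivot (surname, year, month, qty) rows into one row per (surname, year)."""
    counts = defaultdict(dict)
    for surname, year, month, qty in rows:
        counts[(surname, year)][month] = qty
    # One output row per person+year; months without data become 0.
    return [
        (surname, year, *(per_month.get(m, 0) for m in months))
        for (surname, year), per_month in sorted(counts.items())
    ]


table = crosstab(grouped_rows)
for row in table:
    print(*row)  # prints: X 1999 2 1 0
```

The server then runs only the single GROUP BY, and adding a month column is a one-word change in the list rather than another self-join.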
PS. Wrapping this query in a stored procedure would not be a good idea in Firebird. Procedures and functions are for programming. If you just want to keep a complex query as a named SQL object, that is what SQL VIEWs are for.
create view CTE_CROSSTAB (EMP_SURNAME,PAYROLL_PAYYEAR,JAN,FEB,MAR) as
with EMP_YEAR as ( select distinct EMP_SURNAME, PAYROLL_PAYYEAR from data )
......etc
Source: https://stackoverflow.com/questions/48262968/crosstab-counting-the-same-string-in-a-field-and-display-it-as-field-name