Question
I have two tables. Table A has Date, ISBN (for the book), and Demand (the demand for that date). Table B has Date, ISBN (for the book), and SalesRank.
The sample data is shown below. DailyBookFile has about 150k records for each date, going back to 2010 (i.e. roughly 150k * 365 days * 8 years rows). The same goes for the SalesRank table, which has about 500k records for each date.
DailyBookFile
Date      Isbn13         CurrentModifiedDemandTotal
20180122  9780955153075  13
20180122  9780805863567  9
20180122  9781138779396  1
20180122  9780029001516  9
20180122  9780470614150  42
SalesRank
importdate  ISBN13         SalesRank
20180122    9780029001516  69499
20180122    9780470614150  52879
20180122    9780805863567  832429
20180122    9780955153075  44528
20180122    9781138779396  926435
Required Output
Date      Avg_Rank  Book_Group
20180122  385154    Elite
20180121  351545    Elite
20180120  201545    Elite
For each day, I want to take the Top 200 books by CurrentModifiedDemandTotal and average their SalesRank.
I am unable to work out a solution as I am new to SQL.
I started by taking the Top 200 by CurrentModifiedDemandTotal for yesterday and getting their average rank per day over the last year.
SELECT DBF.Filedate AS [Date],
       AVG(AMA.SalesRank) AS Avg_Rank,
       'Elite' AS Book_Group
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA ON (DBF.Isbn13 = AMA.ISBN13
                                                         AND DBF.FileDate = AMA.importdate)
WHERE DBF.Isbn13 IN (SELECT TOP 200 Isbn13
                     FROM [ODS].[wholesale].[DailyBookFile]
                     WHERE FileDate = 20180122
                       AND CAST(CurrentModifiedDemandTotal AS int) > 200)
  AND DBF.Filedate > 20170101
GROUP BY DBF.Filedate;
But that result is not what I want. What I want instead is the ISBNs of the Top 200 by CurrentModifiedDemandTotal for each day, and their average rank on that day. I tried this:
DECLARE @i int;
SET @i = 20180122;
WHILE (SELECT DISTINCT(DBF.Filedate)
       FROM [ODS].[wholesale].[DailyBookFile] AS DBF
       WHERE DBF.Filedate = @i) IS NOT NULL
BEGIN
    SELECT DBF.Filedate AS [Date],
           AVG(AMA.SalesRank) AS Avg_Rank,
           'Elite' AS Book_Group
    FROM [ODS].[wholesale].[DailyBookFile] AS DBF
    INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] as AMA ON DBF.Isbn13 = AMA.ISBN13
                                                            AND DBF.FileDate = AMA.importdate
    WHERE DBF.Isbn13 in (SELECT TOP 200 Isbn13
                         FROM [ODS].[wholesale].[DailyBookFile]
                         WHERE FileDate = @i
                           AND CAST (CurrentModifiedDemandTotal AS int) > 500)
      AND DBF.Filedate = @i
    GROUP BY DBF.Filedate;
    SET @i = @i+1;
END
With this, each iteration's SELECT returns its result in a separate window. Is there any way to get all the results in a single table?
P.S. The list of top 200 books changes every day according to CurrentModifiedDemandTotal. I want to take their average sales rank for that day.
Answer 1:
Instead of selecting immediately in each iteration of the loop, you can insert the rows into a temp table (or a table variable) and select everything after the loop finishes:
-- Drop any leftover temp table from a previous run
IF OBJECT_ID('tempdb..#books') IS NOT NULL
BEGIN
    DROP TABLE #books;
END;
CREATE TABLE #books (
    [Date] INT,
    [Avg_Rank] FLOAT,
    [Book_Group] VARCHAR(512)
);
DECLARE @i int;
SET @i = 20180122;
BEGIN TRY
    -- Loop for as long as rows exist for the current date
    WHILE (SELECT DISTINCT(DBF.Filedate)
           FROM [ODS].[wholesale].[DailyBookFile] AS DBF
           WHERE DBF.Filedate = @i) IS NOT NULL
    BEGIN
        -- Collect each day's row instead of returning it as its own result set
        INSERT INTO #books (
            [Date],
            [Avg_Rank],
            [Book_Group]
        )
        SELECT DBF.Filedate AS [Date],
               AVG(AMA.SalesRank) AS Avg_Rank,
               'Elite' AS Book_Group
        FROM [ODS].[wholesale].[DailyBookFile] AS DBF
        INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA ON DBF.Isbn13 = AMA.ISBN13
                                                                AND DBF.FileDate = AMA.importdate
        WHERE DBF.Isbn13 IN (SELECT TOP 200 Isbn13
                             FROM [ODS].[wholesale].[DailyBookFile]
                             WHERE FileDate = @i
                               AND CAST(CurrentModifiedDemandTotal AS int) > 500
                             -- ORDER BY added: without it, TOP 200 returns an arbitrary 200 rows
                             ORDER BY CAST(CurrentModifiedDemandTotal AS int) DESC)
          AND DBF.Filedate = @i
        GROUP BY DBF.Filedate;
        -- Note: integer +1 only steps within a month (20180131 + 1 = 20180132)
        SET @i = @i + 1;
    END;
END TRY
BEGIN CATCH
    -- Clean up, then re-raise so the SELECT below never runs against a dropped table
    IF OBJECT_ID('tempdb..#books') IS NOT NULL
    BEGIN
        DROP TABLE #books;
    END;
    THROW;
END CATCH;
SELECT *
FROM #books;
DROP TABLE #books;
Using a table variable would yield simpler code, but when storing large amounts of data, table variables start to lose out in performance against temp tables. I'm not sure exactly where the cut-off is, but in my experience switching from a table variable to a temp table gave significant performance gains at row counts of 10,000 and above. For small row counts the opposite might apply.
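For comparison, the table-variable version mentioned above might look roughly like the sketch below. It rests on a few assumptions that are not confirmed by the question: FileDate is taken to be an integer in yyyymmdd form, @books is a hypothetical name, and both the ORDER BY in the TOP 200 subquery and the calendar-aware increment are additions that the original loop does not have.

```sql
-- Sketch only: @books is a hypothetical table variable standing in for #books.
DECLARE @books TABLE (
    [Date] INT,
    [Avg_Rank] FLOAT,
    [Book_Group] VARCHAR(512)
);

DECLARE @i int = 20180122;

-- EXISTS replaces the "(SELECT DISTINCT ...) IS NOT NULL" test
WHILE EXISTS (SELECT 1
              FROM [ODS].[wholesale].[DailyBookFile]
              WHERE FileDate = @i)
BEGIN
    INSERT INTO @books ([Date], [Avg_Rank], [Book_Group])
    SELECT DBF.FileDate,
           AVG(AMA.SalesRank),
           'Elite'
    FROM [ODS].[wholesale].[DailyBookFile] AS DBF
    INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA
        ON DBF.Isbn13 = AMA.ISBN13
       AND DBF.FileDate = AMA.importdate
    WHERE DBF.FileDate = @i
      AND DBF.Isbn13 IN (SELECT TOP 200 Isbn13
                         FROM [ODS].[wholesale].[DailyBookFile]
                         WHERE FileDate = @i
                           AND CAST(CurrentModifiedDemandTotal AS int) > 500
                         -- Assumption: "top" means highest demand, so order explicitly
                         ORDER BY CAST(CurrentModifiedDemandTotal AS int) DESC)
    GROUP BY DBF.FileDate;

    -- Assumption: FileDate is an int in yyyymmdd form. Step by calendar day so
    -- that 20180131 is followed by 20180201 rather than 20180132.
    SET @i = CONVERT(int, CONVERT(char(8),
                 DATEADD(DAY, 1, CONVERT(date, CONVERT(char(8), @i), 112)), 112));
END;

SELECT [Date], [Avg_Rank], [Book_Group]
FROM @books
ORDER BY [Date];
```

A table variable goes out of scope at the end of the batch, so the existence check, the TRY/CATCH cleanup, and the final DROP are not needed here. The loop still stops at the first date that has no rows, so a gap in the data would end it early.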
Answer 2:
This avoids a costly WHILE loop and, I believe, achieves your goal:
-- Sample data mirroring the question
-- (int rather than tinyint: the question filters on demand values above 500)
CREATE TABLE #DailyBookFile ([Date] date,
                             Isbn13 bigint,
                             CurrentModifiedDemandTotal int);
INSERT INTO #DailyBookFile
VALUES ('20180122',9780955153075,13),
       ('20180122',9780805863567,9 ),
       ('20180122',9781138779396,1 ),
       ('20180122',9780029001516,9 ),
       ('20180122',9780470614150,42);
CREATE TABLE #SalesRank (importdate date,
                         ISBN13 bigint,
                         SalesRank int);
INSERT INTO #SalesRank
VALUES ('20180122',9780029001516,69499 ),
       ('20180122',9780470614150,52879 ),
       ('20180122',9780805863567,832429),
       ('20180122',9780955153075,44528 ),
       ('20180122',9781138779396,926435);
GO
-- Rank the joined rows within each date, then average the first 200 per date
WITH Ranks AS(
    SELECT SR.*,
           RANK() OVER (PARTITION BY SR.importdate ORDER BY SR.SalesRank) AS Ranking
    FROM #SalesRank SR
         JOIN #DailyBookFile DBF ON SR.ISBN13 = DBF.Isbn13
                                AND SR.importdate = DBF.[Date])
SELECT importdate AS [Date],
       AVG(SalesRank) AS Avg_rank,
       'Elite' AS Book_Group
FROM Ranks
WHERE Ranking <= 200
GROUP BY importdate;
GO
DROP TABLE #DailyBookFile;
DROP TABLE #SalesRank;
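One thing worth checking: the query above orders its ranking by the sales rank itself, whereas the question asks for the 200 books with the highest demand on each day. A variant along those lines, written against the tables named in the question rather than the sample temp tables, could look like the sketch below. The CAST on CurrentModifiedDemandTotal and the FileDate > 20170101 filter are carried over from the question's own query; Ranked and DemandRank are hypothetical names, and an integer SalesRank column is assumed.

```sql
-- Sketch only: ranks each day's books by demand, then averages their sales rank.
WITH Ranked AS (
    SELECT DBF.FileDate,
           AMA.SalesRank,
           -- Position 1 is the book with the highest demand on that date
           ROW_NUMBER() OVER (PARTITION BY DBF.FileDate
                              ORDER BY CAST(DBF.CurrentModifiedDemandTotal AS int) DESC) AS DemandRank
    FROM [ODS].[wholesale].[DailyBookFile] AS DBF
    INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA
        ON DBF.Isbn13 = AMA.ISBN13
       AND DBF.FileDate = AMA.importdate
    WHERE DBF.FileDate > 20170101
)
SELECT FileDate AS [Date],
       AVG(SalesRank) AS Avg_Rank,
       'Elite' AS Book_Group
FROM Ranked
WHERE DemandRank <= 200
GROUP BY FileDate
ORDER BY FileDate DESC;
```

ROW_NUMBER keeps exactly 200 rows per day even when demand values tie, while RANK would let all tied rows through. Because the ranking happens after the inner join, books with no SalesRank row for a date are excluded before the top 200 are picked, which differs slightly from the loop versions. For the five sample rows in the question, every book falls within the top 200, so the average is 1925770 / 5 = 385154, which agrees with the required output for 20180122.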
Source: https://stackoverflow.com/questions/48406026/multiple-select-queries-using-while-loop-in-a-single-table-is-it-possible