need help adding column to one table using function that does arithmetic operations between columns from two separate tables

放肆的年华 提交于 2019-12-23 22:07:24

问题


I am trying to add a column "wOBA" to a table "starting_pitcher_stats" in MySQL using sequel-pro. Below is the code for a function that performs arithmetic operations on nine variables in the "starting_pitcher_stats" table. In particular, the function gathers the values for a number of variables, applies different weights (coefficients) to some of them (numerator below) and divides that sum by the addition and subtraction of a few more variables. All of these variables reside in the "starting_pitcher_stats" table. The arithmetic operations are expressed in the below formula (the coefficients are the values that are multiplied by each of the variables in numerator below):

wOBA=(.69*walks_a + .72*HBP + .89*singles_a + 1.27*doubles_a + 1.62*triples_a+ 2.10*HR_a)/(at_bats+walks_a+SF+HBP)

Each weight varies by year. The different weights for each year come from the table "GUTS".

The first dilemma I'm having is getting the correct code for the function. The second is the correct code syntax to use to actually call up this function and populate the new column with correct weighted wOBA values for each game of each year(season) for each "Starting_Pitcher".

The function is created with the code below and is listed as a function "wOBA" in my list of functions and procedures. However, the little wheel/knob next to the function name in sequel pro is greyed-out for some reason. It's not until I find out the correct code to invoke it, that I'll know if there are any errors.

Please ask for any more information that I can provide to clarify anything.

Thank you in advance.

DELIMITER $$
    CREATE FUNCTION wOBA(wOBA DECIMAL(10,3))
    RETURNS DECIMAL(10,3)
    BEGIN
        DECLARE wOBA decimal(10,3);
        SET wOBA = 0;
        SELECT cast((SELECT SUM(weighted_BB) as wBB_sum 
            FROM (
                SELECT g.wBB*SUM(if(e.event_CD=14,1,0)) as weighted_BB 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as walks_a)  
            + (SELECT SUM(weighted_HBP) as wHBP_sum 
            FROM (
                SELECT g.wHBP*SUM(if(e.event_CD=16,1,0)) as weighted_HBP 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HBP)     
            + (SELECT SUM(weighted_1B) as w1B_sum 
            FROM (
                SELECT g.w1B*SUM(if(e.event_CD=20,1,0)) as weighted_1B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as singles_a)       
            + (SELECT SUM(weighted_2B) as w2B_sum 
            FROM ( 
                SELECT g.w2B*SUM(if(e.event_CD=21,1,0)) as weighted_2B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as doubles_a)       
            + (SELECT SUM(weighted_3B) as w3B_sum 
            FROM (
                SELECT g.w3B*SUM(if(e.event_CD=22,1,0)) as weighted_3B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as triples_a)       
            + (SELECT SUM(weighted_HR) as wHR_sum 
            FROM (
                SELECT g.wHR*SUM(if(e.event_CD=23,1,0)) as weighted_HR 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HR_a) as decimal(10,3))
            /
            cast(SUM(if(e.ab_fl="T",1,0)) 
                + SUM(if(e.event_cd=14,1,0)) 
                + SUM(if(e.SF_fl="T",1,0)) 
                + SUM(if(e.event_cd=16,1,0)) as unsigned) INTO wOBA 
            FROM events e
            WHERE e.PIT_ID=Starting_Pitcher AND PIT_START_FL = "T"
            LIMIT 1;
        RETURN wOBA;
    END
    $$
    DELIMITER ;

Darwin, Here are two screen shots of the events table. The first is of the structure and the second is some of the content (as not all of the content fits into the shot):

[

Here are screenshots of the structure and contents of the GUTS table.

Here is a screenshot of the events table structure showing the fields (and their definitions)in the function:

Update:

UPDATE retrosheet.starting_pitcher_stats 
SET starting_pitcher_stats.wOBA =(SELECT
(
   (g.wBB * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
  ) AS wOBA
  FROM events AS e, GUTS AS g
  WHERE e.YEAR_ID = g.SEASON_ID
    AND e.PIT_START_FL= 'T'
    AND e.PIT_ID = Starting_Pitcher)

The query just keeps running. I'll keep tweaking it.

update: screenshots of starting_pitcher_stats table

Update:

Ok, I'm trying to just create a wOBA column as part of a new table with columns for the other components of wOBA.

But, the query just goes on forever. How can I make the run-time shorter?

DROP TABLE IF EXISTS starting_pitcher_wOBA;
CREATE TABLE starting_pitcher_wOBA 
SELECT
a.YEAR_ID
,
a.GAME_ID
,
a.PIT_ID
,
a.wBB
,
a.wHBP
,
a.w1B
,
a.w2B
,
a.w3B
,
a.wHR
,
a.u_walks_a
,
a.HBP
,
a.singles_a
,
a.doubles_a
,
a.triples_a
,
a.HR_a
,
a.at_bats
,
a.sacrifice_flies_a
,
a.wOBA
FROM
(
SELECT 
g.YEAR_ID
,
h.GAME_ID
,
e.PIT_ID
,
g.wBB
,
g.wHBP
,
g.w1B
,
g.w2B
,
g.w3B
,
g.wHR
,
SUM(IF(e.event_cd = 14, 1, 0)) AS u_walks_a
,
SUM(IF(e.event_cd = 16, 1, 0)) AS HBP
,
SUM(IF(e.event_cd = 20, 1, 0)) AS singles_a
,
SUM(IF(e.event_cd = 21, 1, 0)) AS doubles_a
,
SUM(IF(e.event_cd = 22, 1, 0)) AS triples_a
,
SUM(IF(e.event_cd = 23, 1, 0)) AS HR_a
,
SUM(IF(e.ab_fl = 'T',   1, 0)) AS at_bats
,
SUM(IF(e.sf_fl = 'T',   1, 0)) AS sacrifice_flies_a
,
(
(
   (g.wBB * SUM(IF(e.event_cd = 14, 1, 0))) 
   + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0))) 
   + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0))) 
   + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0))) 
   + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0))) 
   + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0))) 
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0)) 
   + SUM(IF(e.event_cd = 14, 1, 0)) 
   + SUM(IF(e.sf_fl = 'T',   1, 0)) 
   + SUM(IF(e.event_cd = 16, 1, 0)) 
  ) 
 )  AS wOBA
FROM events AS e, GUTS AS g, game AS h
WHERE e.PIT_START_FL= 'T' 
GROUP BY g.YEAR_ID, h.GAME_ID,e.PIT_ID) AS a
INNER JOIN GUTS AS g
ON 
a.YEAR_ID=g.YEAR_ID
INNER JOIN game AS h
ON
a.GAME_ID = h.GAME_ID
INNER JOIN events AS e
ON
a.PIT_ID = e.PIT_ID

回答1:


We'll start by cleaning up the query. You should always try to perform your calculations across each row when possible rather than performing multiple vertical subqueries, as this avoids the DBMS making multiple passes over the same table.

SELECT
  (
   ( (g.wbb  * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.whbp * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1b  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2b  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3b  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.whr  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
   )
  ) AS woba
  FROM events e, guts g
  WHERE e.year_id = g.season_id
    AND e.pit_start_fl = 'T'
    AND e.pit_id = starting_pitcher
  GROUP BY g.season;

Assuming I haven't dropped a comma somewhere, this will return a column woba for each year for the specified starting pitcher.

Note that I've joined the tables on e.year_id instead of SUBSTRING(e.game_ID,4,4); this avoids the overhead of calling SUBSTRING() on each record. This sort of thing seems minor, but it can add up quickly over a large table.

That should be enough to get you started.



来源:https://stackoverflow.com/questions/34538685/need-help-adding-column-to-one-table-using-function-that-does-arithmetic-operati

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!