What is the difference between lm(offense$R ~ offense$OBP) and lm(R ~ OBP)?

青春惊慌失措 2020-12-20 20:23

I am trying to use R to create a linear model and use that to predict some values. The subject matter is baseball stats. If I do this:

obp <- lm(offense         


        
2 Answers
  • 2020-12-20 21:01

    In the first case, you get this if you print the model:

    Call:
    lm(formula = offense$R ~ offense$OBP)
    
    Coefficients:
    (Intercept)  offense$OBP  
        -0.1102       0.5276 
    

    But in the second, you get this:

    Call:
    lm(formula = R ~ OBP)
    
    Coefficients:
    (Intercept)          OBP  
        -0.1102       0.5276  
    

    Look at the names of the coefficients. In the first model the predictor is literally called offense$OBP, so a newdata built with newdata=data.frame(OBP=0.5) has no column that the formula refers to. predict() then falls back on the original offense$OBP vector, effectively ignoring newdata, and returns fitted values for the training data (R typically warns that newdata had 1 row but the variables found have more). With offense$R ~ offense$OBP, the formula just holds two standalone vectors, with no names tied to a data.frame.

    The best way to do it is:

    obp = lm(R ~ OBP, data=offense)
    predict(obp, newdata=data.frame(OBP=0.5), interval="prediction")
    

    And you'll get the proper result, the prediction for OBP=0.5.
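
    The contrast can be sketched with a small made-up data frame. The `offense` values below are invented for illustration and are not the asker's data; the object names `new`, `fit_dollar` and `fit_data` are likewise just placeholders:

    ```r
    # Hypothetical stand-in for the asker's data; the values are invented.
    set.seed(1)
    offense <- data.frame(OBP = runif(20, 0.28, 0.38))
    offense$R <- -0.1 + 0.5 * offense$OBP + rnorm(20, sd = 0.01)

    new <- data.frame(OBP = 0.5)

    # Formula built from offense$ vectors: the term is literally `offense$OBP`,
    # so the OBP column in `new` is never consulted.
    fit_dollar <- lm(offense$R ~ offense$OBP)
    pred_dollar <- suppressWarnings(predict(fit_dollar, newdata = new))
    length(pred_dollar)   # one value per training row, not a single value

    # Formula with bare names plus data=: predict() finds OBP in `new`.
    fit_data <- lm(R ~ OBP, data = offense)
    pred_data <- predict(fit_data, newdata = new, interval = "prediction")
    nrow(pred_data)       # a single row with columns fit, lwr and upr
    ```

    The warning is suppressed here only to keep the sketch quiet; in interactive use it is the main hint that newdata was not used.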

  • 2020-12-20 21:02

    There is no difference in the fit: you get the same coefficients.

    But some programming styles are better than others: attach is to be avoided, as is the more verbose first form.

    Most experienced users do

     lm(R ~ OBP, offense)
    

    instead.
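
    As a quick check of the claim that the fits agree, here is a sketch on a tiny invented data frame (the numbers are hypothetical). It relies on `data` being the second positional argument of lm():

    ```r
    # Hypothetical data again; only the comparison matters.
    offense <- data.frame(OBP = c(0.30, 0.32, 0.33, 0.35, 0.36),
                          R   = c(0.05, 0.06, 0.06, 0.08, 0.08))

    fit_dollar <- lm(offense$R ~ offense$OBP)
    fit_short  <- lm(R ~ OBP, offense)   # second positional argument is `data`

    # Same estimates, different coefficient names.
    all.equal(unname(coef(fit_dollar)), unname(coef(fit_short)))
    names(coef(fit_dollar))
    names(coef(fit_short))
    ```

    The estimates match, but only the second fit carries the plain name OBP, which is what predict() needs to line up with a newdata column.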
