linear-regression

Reproducing Excel's LINEST function with NumPy

Posted by ≯℡__Kan透↙ on 2021-02-08 04:45:54
Question: I have to use Excel's LINEST function to compute the error in my linear regression, and I was hoping to reproduce its results with NumPy's polyfit function. Specifically, I want to reproduce the following LINEST usage: LINEST(y's, x's,,TRUE). I'm not sure how to get the two functions to produce the same values, because nothing I've tried gives similar results. I tried numpy.polyfit(x,y,3) and various other values in the third position. Answer 1: This question is actually a result of
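A likely source of the mismatch is the degree argument: LINEST(y's, x's,,TRUE) fits a straight line, so the polyfit degree should be 1, not 3. A minimal sketch, using hypothetical sample data in place of the spreadsheet columns:

```python
import numpy as np

# Hypothetical data standing in for the spreadsheet's x and y columns.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# LINEST(y's, x's,, TRUE) fits a straight line, so the degree is 1, not 3.
slope, intercept = np.polyfit(x, y, 1)

# LINEST's extra error statistics can be recovered from the residuals;
# the residual standard error is shown here as one example.
residuals = y - (slope * x + intercept)
dof = len(x) - 2  # two fitted parameters: slope and intercept
se = np.sqrt(np.sum(residuals**2) / dof)
```

With degree 1, the slope and intercept should match LINEST's first row of output; the error statistics in LINEST's remaining rows are all functions of the residuals, degrees of freedom, and the x values.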

How to get regression coefficients and model fits using correlation or covariance matrix instead of data frame using R?

Posted by 你离开我真会死。 on 2021-02-07 18:18:25
Question: I want to obtain regression coefficients from a multiple linear regression by supplying a correlation or covariance matrix instead of a data.frame. I realise you lose some information relevant to determining the intercept and so on, but even the correlation matrix should be sufficient for getting standardised coefficients and estimates of variance explained. So for example, if you had the following data # get some data library(MASS) data("Cars93") x <- Cars93[,c("EngineSize",
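The linear algebra behind this works the same in any language: standardised coefficients solve R_xx · beta = r_xy, where R_xx is the predictor block of the correlation matrix and r_xy holds the predictor-outcome correlations. A sketch with a hypothetical 3×3 correlation matrix (two predictors, one outcome):

```python
import numpy as np

# Hypothetical correlation matrix, ordered (x1, x2, y); this stands in
# for cor(data) computed from a real data frame.
R = np.array([
    [1.0, 0.3, 0.6],
    [0.3, 1.0, 0.5],
    [0.6, 0.5, 1.0],
])

R_xx = R[:2, :2]   # intercorrelations among the predictors
r_xy = R[:2, 2]    # correlations of each predictor with the outcome

# Standardised coefficients: beta = R_xx^{-1} r_xy
beta = np.linalg.solve(R_xx, r_xy)

# Variance explained: R^2 = r_xy' beta
r_squared = float(r_xy @ beta)
```

As the question notes, the intercept and unstandardised coefficients need the variable means and standard deviations on top of this, which a correlation matrix alone does not carry.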

ggplot with multiple regression lines to show random effects

Posted by 对着背影说爱祢 on 2021-02-07 07:27:19
Question: I am aware of this and this post. However, I don't seem to get the expected result when I try the following. The data can be loaded directly from here. The idea is that, in a completely made-up data set, the levels of glucose in blood for several athletes at the completion of different races would depend on some fictitious amino acid (AAA). The call for the plot was: ggplot(df, aes(x = AAA, y = glucose, color=athletes)) + geom_point() + geom_smooth(method="lm", fill=NA) And I expected to get
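What geom_smooth(method="lm") does when a colour aesthetic is present is fit one ordinary regression per group. Those per-group fits can be computed directly, which is a useful sanity check on whatever the plot shows; a sketch with made-up per-athlete data (names and slopes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data mimicking the question: glucose vs. AAA for three athletes,
# each with their own slope and intercept (the "random effects").
athletes = {}
for i, name in enumerate(["A", "B", "C"]):
    aaa = np.linspace(0, 10, 20)
    glucose = (1.0 + 0.2 * i) * aaa + 5 * i + rng.normal(0, 0.5, aaa.size)
    athletes[name] = (aaa, glucose)

# One least-squares line per group, exactly what geom_smooth fits per colour.
fits = {name: np.polyfit(aaa, glu, 1)
        for name, (aaa, glu) in athletes.items()}
```

If the plotted lines don't match these per-group slopes, the grouping aesthetic is usually the culprit (e.g. athletes being treated as continuous rather than a factor).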

Fitting logarithmic curve in R

Posted by 余生长醉 on 2021-02-07 06:57:48
Question: If I have a set of points in R that are linear, I can plot the points, fit a line to them, and then display the line as follows: x=c(61,610,1037,2074,3050,4087,5002,6100,7015) y=c(0.401244, 0.844381, 1.18922, 1.93864, 2.76673, 3.52449, 4.21855, 5.04368, 5.80071) plot(x,y) Estimate = lm(y ~ x) abline(Estimate) Now, if I have a set of points for which a logarithmic curve is the more appropriate fit, such as the following: x=c(61,610,1037,2074,3050,4087,5002,6100,7015) y=c(0.974206,1.16716
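The standard trick is that a logarithmic curve y = a + b·ln(x) is linear in ln(x), so an ordinary linear fit against log-transformed x recovers it (in R, lm(y ~ log(x))). A sketch in Python; since the question's y vector is truncated, synthetic data with a known logarithmic relationship is used instead:

```python
import numpy as np

# The x values from the question; y is synthetic (the question's y vector
# is truncated), generated from a known relationship y = a + b*ln(x).
x = np.array([61, 610, 1037, 2074, 3050, 4087, 5002, 6100, 7015], dtype=float)
a_true, b_true = -4.0, 1.2
y = a_true + b_true * np.log(x)

# A logarithmic curve is linear in ln(x): fit a straight line to (ln x, y).
b, a = np.polyfit(np.log(x), y, 1)
```

To display the fitted curve, evaluate a + b·ln(x) over a fine grid of x values and draw that, since a straight abline no longer applies on the original axes.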

How to manually compute the p-value of t-statistic in linear regression

Posted by 让人想犯罪 __ on 2021-02-04 17:36:09
Question: I did a linear regression for a two-tailed t-test with 178 degrees of freedom. The summary function gives me two p-values for my two t-values. t value Pr(>|t|) 5.06 1.04e-06 *** 10.09 < 2e-16 *** ... ... F-statistic: 101.8 on 1 and 178 DF, p-value: < 2.2e-16 I want to calculate the p-values of the t-values manually with this formula: p = 1 - 2*F(|t|) p_value_1 <- 1 - 2 * pt(abs(t_1), 178) p_value_2 <- 1 - 2 * pt(abs(t_2), 178) I don't get the same p-values as in the model summary. Therefore, I
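The formula in the question is the likely culprit: the two-sided p-value is p = 2·(1 − F(|t|)), i.e. twice the upper-tail area, whereas 1 − 2·F(|t|) is negative whenever |t| > 0. A sketch of the corrected computation using the t-values from the summary (shown here with SciPy's t distribution rather than R's pt):

```python
from scipy import stats

df = 178
t_1, t_2 = 5.06, 10.09

# Two-sided p-value: p = 2 * (1 - F(|t|)) = twice the upper-tail area.
# stats.t.sf is the survival function, 1 - CDF, which is more accurate
# in the far tail than computing 1 - stats.t.cdf directly.
p_1 = 2 * stats.t.sf(abs(t_1), df)
p_2 = 2 * stats.t.sf(abs(t_2), df)
```

In R the equivalent is 2 * pt(abs(t_1), 178, lower.tail = FALSE), which reproduces the summary's 1.04e-06 for the first coefficient.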

How to scale the x and y axis equally by log in Seaborn?

Posted by 人走茶凉 on 2021-02-04 07:53:01
Question: I want to create a regplot with a linear regression in Seaborn and scale both axes equally by log, such that the regression stays a straight line. An example: import matplotlib.pyplot as plt import seaborn as sns some_x=[0,1,2,3,4,5,6,7] some_y=[3,5,4,7,7,9,9,10] ax = sns.regplot(x=some_x, y=some_y, order=1) plt.ylim(0, 12) plt.xlim(0, 12) plt.show() What I get: If I scale the x and y axes by log, I would expect the regression to stay a straight line. What I tried: import matplotlib.pyplot as
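A line fitted on linear axes becomes a curve on log-log axes; only a power law y = a·x^b stays straight there. So to keep the regression straight, fit in log space, then plot with both axes log-scaled. A sketch of the fitting step (x = 0 must be dropped, since log(0) is undefined):

```python
import numpy as np

# Data from the question; x = 0 is dropped because log(0) is undefined.
some_x = np.array([0, 1, 2, 3, 4, 5, 6, 7], dtype=float)
some_y = np.array([3, 5, 4, 7, 7, 9, 9, 10], dtype=float)
mask = some_x > 0
log_x, log_y = np.log(some_x[mask]), np.log(some_y[mask])

# A line fitted in log-log space stays straight on log-scaled axes;
# it corresponds to the power law y = exp(intercept) * x**slope.
slope, intercept = np.polyfit(log_x, log_y, 1)
```

For display, plot the original points with ax.set_xscale('log') and ax.set_yscale('log'), and draw the fitted power law over them; seaborn's regplot itself always fits in the original data space.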

Lagged regression in R: determining the optimal lag

Posted by ε祈祈猫儿з on 2021-01-29 20:18:57
Question: I have a variable that is believed to be a good predictor for another variable, but with some lag. I don't know what the lag is and want to estimate it from the data. Here is an example: library(tidyverse) data <- tibble( id = 1:100, y = dnorm(1:100, 30, 20) * 1000, x.shifted = y / 10 + runif(100) / 10, x.actual = lag(x.shifted, 30) ) data %>% ggplot(aes(id, x.shifted)) + geom_point() + geom_point(aes(id, x.actual), color = 'blue') + geom_point(aes(id, y), color = 'red') The model lm(y ~ x
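One common way to estimate the lag is a grid search over candidate lags, scoring each by the correlation between y and the shifted x and taking the maximiser (a discrete cross-correlation). A sketch with synthetic data mirroring the question, where x leads y by a known lag of 30:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data mirroring the question: x is a delayed, scaled copy of y.
n, true_lag = 200, 30
y = np.exp(-((np.arange(n) - 60) ** 2) / 800.0) * 1000
x = np.empty(n)
x[true_lag:] = y[: n - true_lag] / 10
x[:true_lag] = 0.0
x += rng.normal(0, 0.5, n)          # small observation noise

def best_lag(x, y, max_lag=60):
    """Return the lag (in steps) maximising cor(y[t], x[t + lag])."""
    n = len(y)
    scores = {}
    for lag in range(max_lag + 1):
        scores[lag] = np.corrcoef(y[: n - lag], x[lag:])[0, 1]
    return max(scores, key=scores.get)
```

Once the lag is estimated this way, the regression itself can be refit on the aligned series; the same idea underlies R's ccf() function.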

Implementing a linear regression using gradient descent

Posted by 被刻印的时光 ゝ on 2021-01-29 20:00:35
Question: I'm trying to implement a linear regression with gradient descent as explained in this article (https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931). I've followed the implementation to the letter, yet my results overflow after a few iterations. I'm trying to get approximately this result: y = -0.02x + 8499.6. The code: package main import ( "encoding/csv" "fmt" "strconv" "strings" ) const ( iterations = 1000 learningRate = 0.0001 ) func computePrice(m, x, c
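A common cause of this overflow is running gradient descent on unscaled features: with x values in the thousands, the squared-error gradient is huge and a fixed learning rate makes each step overshoot further than the last. A sketch (in Python rather than Go, with synthetic data roughly matching the target fit) of the usual fix, standardising x before descending and un-scaling the result:

```python
import numpy as np

# Synthetic data roughly matching the target fit y ≈ -0.02x + 8499.6;
# the large x scale is what makes unscaled gradient descent diverge.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100_000, 200)
y = -0.02 * x + 8499.6 + rng.normal(0, 10, x.size)

# Standardise x so a small fixed learning rate converges.
x_mean, x_std = x.mean(), x.std()
xs = (x - x_mean) / x_std

m, c = 0.0, 0.0
lr, iterations = 0.1, 2000
n = x.size
for _ in range(iterations):
    pred = m * xs + c
    m -= lr * (2 / n) * np.sum((pred - y) * xs)   # d(MSE)/dm
    c -= lr * (2 / n) * np.sum(pred - y)          # d(MSE)/dc

# Undo the scaling to recover slope/intercept in the original units.
slope = m / x_std
intercept = c - m * x_mean / x_std
```

The same normalisation carries over directly to the Go version; alternatively, a much smaller learning rate can work on raw data, but scaling is the more robust fix.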

How to Spread Plot's Date Axis According To Years When Plotting With Seaborn?

Posted by 橙三吉。 on 2021-01-29 19:40:05
Question: I'm trying to train a linear regression model with Python using the Google stock prices that can be found here: https://www.kaggle.com/medharawat/google-stock-price I'm trying to predict future stocks from the given features, and after that I'm planning to plot the predictions alongside the values in the current dataset. First, I read the dataframes, parsing the date values with a date parser, and concatenated these two dataframes into one in order to split it myself: parser = lambda date: pd.datetime.strptime(date, '%m/%d/%Y') df_test=pd.read
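The parsing and splitting steps can be sketched without pandas using the standard library, which makes the logic explicit: parse each date with the same '%m/%d/%Y' format the question passes to its parser, then split train/test by year rather than by row count (the sample dates below are hypothetical stand-ins for the CSV's Date column):

```python
from datetime import datetime

# Hypothetical rows standing in for the Kaggle CSV's Date column,
# in the '%m/%d/%Y' format used by the question's parser.
rows = ["1/3/2012", "6/15/2014", "12/30/2016", "8/7/2017"]
dates = [datetime.strptime(d, "%m/%d/%Y") for d in rows]

# For a time series, split by date so the test period lies strictly
# after the training period, instead of shuffling rows.
train = [d for d in dates if d.year < 2017]
test = [d for d in dates if d.year >= 2017]
```

In pandas the equivalent split is a boolean mask on the parsed date column; a year-based split like this also makes it straightforward to spread the plot's date axis by year with one tick locator per year.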
