linear-regression

Reproducing Excel's LINEST function with NumPy

Posted by ≯℡__Kan透↙ on 2021-02-08 04:45:54
Question: I have to use Excel's LINEST function to compute the error in my linear regression, and I was hoping to reproduce its results with NumPy's polyfit function. Specifically, I want to reproduce the following LINEST usage: LINEST(y's, x's,,TRUE). I'm not sure how to get the two functions to produce the same values, because nothing I've tried gives similar results. I tried numpy.polyfit(x,y,3) and various other values in the third position. Answer 1: This question is actually a result of
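A likely source of the mismatch is the degree argument: LINEST(y's, x's,,TRUE) fits a straight line, so the polyfit degree should be 1, not 3. A minimal sketch, using hypothetical sample data in place of the spreadsheet columns:

```python
import numpy as np

# Hypothetical data standing in for the spreadsheet's x and y columns.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# LINEST(y's, x's,, TRUE) fits a straight line, so the degree is 1, not 3.
slope, intercept = np.polyfit(x, y, 1)

# LINEST's extra error statistics can be recovered from the residuals;
# the residual standard error is shown here as one example.
residuals = y - (slope * x + intercept)
dof = len(x) - 2  # two fitted parameters: slope and intercept
se = np.sqrt(np.sum(residuals**2) / dof)
```

With degree 1, the slope and intercept should match LINEST's first row of output; the error statistics in LINEST's remaining rows are all functions of the residuals, degrees of freedom, and the x values.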

How to get regression coefficients and model fits using correlation or covariance matrix instead of data frame using R?

Posted by 你离开我真会死。 on 2021-02-07 18:18:25
Question: I want to obtain regression coefficients from a multiple linear regression by supplying a correlation or covariance matrix instead of a data.frame. I realise you lose some information relevant to determining the intercept and so on, but even the correlation matrix should be sufficient for getting standardised coefficients and estimates of variance explained. So for example, if you had the following data # get some data library(MASS) data("Cars93") x <- Cars93[,c("EngineSize",
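The linear algebra behind this works the same in any language: standardised coefficients solve R_xx · beta = r_xy, where R_xx is the predictor block of the correlation matrix and r_xy holds the predictor-outcome correlations. A sketch with a hypothetical 3×3 correlation matrix (two predictors, one outcome):

```python
import numpy as np

# Hypothetical correlation matrix, ordered (x1, x2, y); this stands in
# for cor(data) computed from a real data frame.
R = np.array([
    [1.0, 0.3, 0.6],
    [0.3, 1.0, 0.5],
    [0.6, 0.5, 1.0],
])

R_xx = R[:2, :2]   # intercorrelations among the predictors
r_xy = R[:2, 2]    # correlations of each predictor with the outcome

# Standardised coefficients: beta = R_xx^{-1} r_xy
beta = np.linalg.solve(R_xx, r_xy)

# Variance explained: R^2 = r_xy' beta
r_squared = float(r_xy @ beta)
```

As the question notes, the intercept and unstandardised coefficients need the variable means and standard deviations on top of this, which a correlation matrix alone does not carry.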

ggplot with multiple regression lines to show random effects

Posted by 对着背影说爱祢 on 2021-02-07 07:27:19
Question: I am aware of this and this post. However, I don't seem to get the expected result when I try the following. The data can be loaded directly from here. The idea is that, in a completely made-up data set, the levels of glucose in blood for several athletes at the completion of different races would depend on some fictitious amino acid (AAA). The call for the plot was: ggplot(df, aes(x = AAA, y = glucose, color=athletes)) + geom_point() + geom_smooth(method="lm", fill=NA) And I expected to get
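What geom_smooth(method="lm") does when a colour aesthetic is present is fit one ordinary regression per group. Those per-group fits can be computed directly, which is a useful sanity check on whatever the plot shows; a sketch with made-up per-athlete data (names and slopes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data mimicking the question: glucose vs. AAA for three athletes,
# each with their own slope and intercept (the "random effects").
athletes = {}
for i, name in enumerate(["A", "B", "C"]):
    aaa = np.linspace(0, 10, 20)
    glucose = (1.0 + 0.2 * i) * aaa + 5 * i + rng.normal(0, 0.5, aaa.size)
    athletes[name] = (aaa, glucose)

# One least-squares line per group, exactly what geom_smooth fits per colour.
fits = {name: np.polyfit(aaa, glu, 1)
        for name, (aaa, glu) in athletes.items()}
```

If the plotted lines don't match these per-group slopes, the grouping aesthetic is usually the culprit (e.g. athletes being treated as continuous rather than a factor).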

Fitting logarithmic curve in R

Posted by 余生长醉 on 2021-02-07 06:57:48
Question: If I have a set of points in R that are linear, I can plot the points, fit a line to them, and then display the line as follows: x=c(61,610,1037,2074,3050,4087,5002,6100,7015) y=c(0.401244, 0.844381, 1.18922, 1.93864, 2.76673, 3.52449, 4.21855, 5.04368, 5.80071) plot(x,y) Estimate = lm(y ~ x) abline(Estimate) Now, if I have a set of points for which a logarithmic curve is the more appropriate fit, such as the following: x=c(61,610,1037,2074,3050,4087,5002,6100,7015) y=c(0.974206,1.16716
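The standard trick is that a logarithmic curve y = a + b·ln(x) is linear in ln(x), so an ordinary linear fit against log-transformed x recovers it (in R, lm(y ~ log(x))). A sketch in Python; since the question's y vector is truncated, synthetic data with a known logarithmic relationship is used instead:

```python
import numpy as np

# The x values from the question; y is synthetic (the question's y vector
# is truncated), generated from a known relationship y = a + b*ln(x).
x = np.array([61, 610, 1037, 2074, 3050, 4087, 5002, 6100, 7015], dtype=float)
a_true, b_true = -4.0, 1.2
y = a_true + b_true * np.log(x)

# A logarithmic curve is linear in ln(x): fit a straight line to (ln x, y).
b, a = np.polyfit(np.log(x), y, 1)
```

To display the fitted curve, evaluate a + b·ln(x) over a fine grid of x values and draw that, since a straight abline no longer applies on the original axes.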

How to manually compute the p-value of t-statistic in linear regression

Posted by 让人想犯罪 __ on 2021-02-04 17:36:09
Question: I did a linear regression for a two-tailed t-test with 178 degrees of freedom. The summary function gives me two p-values for my two t-values. t value Pr(>|t|) 5.06 1.04e-06 *** 10.09 < 2e-16 *** ... ... F-statistic: 101.8 on 1 and 178 DF, p-value: < 2.2e-16 I want to calculate the p-values of the t-values manually with this formula: p = 1 - 2*F(|t|) p_value_1 <- 1 - 2 * pt(abs(t_1), 178) p_value_2 <- 1 - 2 * pt(abs(t_2), 178) I don't get the same p-values as in the model summary. Therefore, I
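The formula in the question is the likely culprit: the two-sided p-value is p = 2·(1 − F(|t|)), i.e. twice the upper-tail area, whereas 1 − 2·F(|t|) is negative whenever |t| > 0. A sketch of the corrected computation using the t-values from the summary (shown here with SciPy's t distribution rather than R's pt):

```python
from scipy import stats

df = 178
t_1, t_2 = 5.06, 10.09

# Two-sided p-value: p = 2 * (1 - F(|t|)) = twice the upper-tail area.
# stats.t.sf is the survival function, 1 - CDF, which is more accurate
# in the far tail than computing 1 - stats.t.cdf directly.
p_1 = 2 * stats.t.sf(abs(t_1), df)
p_2 = 2 * stats.t.sf(abs(t_2), df)
```

In R the equivalent is 2 * pt(abs(t_1), 178, lower.tail = FALSE), which reproduces the summary's 1.04e-06 for the first coefficient.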

How to scale the x and y axis equally by log in Seaborn?

Posted by 人走茶凉 on 2021-02-04 07:53:01
Question: I want to create a regplot with a linear regression in Seaborn and scale both axes equally by log, such that the regression stays a straight line. An example: import matplotlib.pyplot as plt import seaborn as sns some_x=[0,1,2,3,4,5,6,7] some_y=[3,5,4,7,7,9,9,10] ax = sns.regplot(x=some_x, y=some_y, order=1) plt.ylim(0, 12) plt.xlim(0, 12) plt.show() What I get: If I scale the x and y axes by log, I would expect the regression to stay a straight line. What I tried: import matplotlib.pyplot as
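A line fitted on linear axes becomes a curve on log-log axes; only a power law y = a·x^b stays straight there. So to keep the regression straight, fit in log space, then plot with both axes log-scaled. A sketch of the fitting step (x = 0 must be dropped, since log(0) is undefined):

```python
import numpy as np

# Data from the question; x = 0 is dropped because log(0) is undefined.
some_x = np.array([0, 1, 2, 3, 4, 5, 6, 7], dtype=float)
some_y = np.array([3, 5, 4, 7, 7, 9, 9, 10], dtype=float)
mask = some_x > 0
log_x, log_y = np.log(some_x[mask]), np.log(some_y[mask])

# A line fitted in log-log space stays straight on log-scaled axes;
# it corresponds to the power law y = exp(intercept) * x**slope.
slope, intercept = np.polyfit(log_x, log_y, 1)
```

For display, plot the original points with ax.set_xscale('log') and ax.set_yscale('log'), and draw the fitted power law over them; seaborn's regplot itself always fits in the original data space.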

Lagged regression in R: determining the optimal lag

Posted by ε祈祈猫儿з on 2021-01-29 20:18:57
Question: I have a variable that is believed to be a good predictor for another variable, but with some lag. I don't know what the lag is and want to estimate it from the data. Here is an example: library(tidyverse) data <- tibble( id = 1:100, y = dnorm(1:100, 30, 20) * 1000, x.shifted = y / 10 + runif(100) / 10, x.actual = lag(x.shifted, 30) ) data %>% ggplot(aes(id, x.shifted)) + geom_point() + geom_point(aes(id, x.actual), color = 'blue') + geom_point(aes(id, y), color = 'red') The model lm(y ~ x
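One common way to estimate the lag is a grid search over candidate lags, scoring each by the correlation between y and the shifted x and taking the maximiser (a discrete cross-correlation). A sketch with synthetic data mirroring the question, where x leads y by a known lag of 30:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data mirroring the question: x is a delayed, scaled copy of y.
n, true_lag = 200, 30
y = np.exp(-((np.arange(n) - 60) ** 2) / 800.0) * 1000
x = np.empty(n)
x[true_lag:] = y[: n - true_lag] / 10
x[:true_lag] = 0.0
x += rng.normal(0, 0.5, n)          # small observation noise

def best_lag(x, y, max_lag=60):
    """Return the lag (in steps) maximising cor(y[t], x[t + lag])."""
    n = len(y)
    scores = {}
    for lag in range(max_lag + 1):
        scores[lag] = np.corrcoef(y[: n - lag], x[lag:])[0, 1]
    return max(scores, key=scores.get)
```

Once the lag is estimated this way, the regression itself can be refit on the aligned series; the same idea underlies R's ccf() function.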

Implementing a linear regression using gradient descent

Posted by 被刻印的时光 ゝ on 2021-01-29 20:00:35
Question: I'm trying to implement a linear regression with gradient descent as explained in this article (https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931). I've followed the implementation to the letter, yet my results overflow after a few iterations. I'm trying to get approximately this result: y = -0.02x + 8499.6. The code: package main import ( "encoding/csv" "fmt" "strconv" "strings" ) const ( iterations = 1000 learningRate = 0.0001 ) func computePrice(m, x, c
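A common cause of this overflow is running gradient descent on unscaled features: with x values in the thousands, the squared-error gradient is huge and a fixed learning rate makes each step overshoot further than the last. A sketch (in Python rather than Go, with synthetic data roughly matching the target fit) of the usual fix, standardising x before descending and un-scaling the result:

```python
import numpy as np

# Synthetic data roughly matching the target fit y ≈ -0.02x + 8499.6;
# the large x scale is what makes unscaled gradient descent diverge.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100_000, 200)
y = -0.02 * x + 8499.6 + rng.normal(0, 10, x.size)

# Standardise x so a small fixed learning rate converges.
x_mean, x_std = x.mean(), x.std()
xs = (x - x_mean) / x_std

m, c = 0.0, 0.0
lr, iterations = 0.1, 2000
n = x.size
for _ in range(iterations):
    pred = m * xs + c
    m -= lr * (2 / n) * np.sum((pred - y) * xs)   # d(MSE)/dm
    c -= lr * (2 / n) * np.sum(pred - y)          # d(MSE)/dc

# Undo the scaling to recover slope/intercept in the original units.
slope = m / x_std
intercept = c - m * x_mean / x_std
```

The same normalisation carries over directly to the Go version; alternatively, a much smaller learning rate can work on raw data, but scaling is the more robust fix.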

How to Spread Plot's Date Axis According To Years When Plotting With Seaborn?

Posted by 橙三吉。 on 2021-01-29 19:40:05
Question: I'm trying to train a linear regression model with Python using the Google stock prices that can be found here: https://www.kaggle.com/medharawat/google-stock-price I'm trying to predict future stocks from the given features, and after that I'm planning to plot the predictions alongside the values in the current dataset. First, I read the dataframes, parsing the date values with a date parser, and concatenated these two dataframes into one in order to split it myself: parser = lambda date: pd.datetime.strptime(date, '%m/%d/%Y') df_test=pd.read
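The parsing and splitting steps can be sketched without pandas using the standard library, which makes the logic explicit: parse each date with the same '%m/%d/%Y' format the question passes to its parser, then split train/test by year rather than by row count (the sample dates below are hypothetical stand-ins for the CSV's Date column):

```python
from datetime import datetime

# Hypothetical rows standing in for the Kaggle CSV's Date column,
# in the '%m/%d/%Y' format used by the question's parser.
rows = ["1/3/2012", "6/15/2014", "12/30/2016", "8/7/2017"]
dates = [datetime.strptime(d, "%m/%d/%Y") for d in rows]

# For a time series, split by date so the test period lies strictly
# after the training period, instead of shuffling rows.
train = [d for d in dates if d.year < 2017]
test = [d for d in dates if d.year >= 2017]
```

In pandas the equivalent split is a boolean mask on the parsed date column; a year-based split like this also makes it straightforward to spread the plot's date axis by year with one tick locator per year.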
