chi-squared

Python - Minimizing Chi-squared

放荡痞女 submitted on 2019-12-04 11:27:04
I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately, the code below does not minimize the chisqfunc function correctly: it reports the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions, which worked correctly. Could you please suggest how to fix the code below, or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

    import numpy
    import scipy.optimize as opt
    filename = 'data.csv'
    data
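Because the posted code is cut off, the cause of the failed minimization cannot be read from this excerpt. As a point of comparison, here is a minimal sketch of a chi-squared straight-line fit with scipy.optimize.minimize. The arrays strain, stress, and sigma are synthetic stand-ins (an assumption, not the poster's data.csv), and chisqfunc here is only a guess at the intended shape of the function, not the original.

```python
import numpy as np
import scipy.optimize as opt

# Synthetic stand-in data (assumption: the real values live in data.csv).
rng = np.random.default_rng(0)
strain = np.linspace(0.0, 1.0, 20)
sigma = np.full_like(strain, 0.5)            # assumed measurement uncertainties
stress = 3.0 * strain + 1.0 + rng.normal(0.0, sigma)

def chisqfunc(params):
    """Chi-squared of the linear model stress = slope * strain + intercept."""
    slope, intercept = params
    model = slope * strain + intercept
    return np.sum(((stress - model) / sigma) ** 2)

x0 = np.array([0.0, 0.0])                    # initial guess
result = opt.minimize(chisqfunc, x0)
slope_fit, intercept_fit = result.x
print(slope_fit, intercept_fit, result.fun)
```

For a model that is linear in its parameters, np.polyfit(strain, stress, 1, w=1/sigma) should give the same line, which makes a convenient check on the optimizer's result.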

Sklearn Chi2 For Feature Selection

风格不统一 submitted on 2019-12-04 07:36:57
I'm learning about chi2 for feature selection and came across code like this. However, my understanding of chi2 was that higher scores mean that the feature is more independent (and therefore less useful to the model), so we would be interested in features with the lowest scores. However, using scikit-learn's SelectKBest, the selector returns the values with the highest chi2 scores. Is my understanding of the chi2 test incorrect? Or does the chi2 score in sklearn produce something other than a chi2 statistic? See code below for what I mean (mostly copied from above link except for the
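On the direction of the score: a larger chi-squared statistic is stronger evidence against independence between feature and class, which is consistent with a selector keeping the highest scores. The sketch below illustrates this with scipy.stats.chi2_contingency on two made-up 2x2 count tables. It does not call sklearn's chi2, so the numbers are not what SelectKBest would report; the tables and variable names are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: class 0 / class 1. Columns: feature absent / feature present.
# Hypothetical counts chosen so one feature tracks the class and one does not.
dependent_table = np.array([[40, 10],
                            [10, 40]])
independent_table = np.array([[25, 25],
                              [25, 25]])

# correction=False disables the Yates continuity correction for 2x2 tables.
stat_dep, p_dep, _, _ = chi2_contingency(dependent_table, correction=False)
stat_ind, p_ind, _, _ = chi2_contingency(independent_table, correction=False)

print("dependent feature:  ", stat_dep, p_dep)
print("independent feature:", stat_ind, p_ind)
```

The feature that tracks the class gets the large statistic and the small p-value; the feature that is unrelated to the class gets a statistic of zero.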

Chi-square test gives wrong result. Should I reject the proposed distribution?

泄露秘密 submitted on 2019-12-04 05:50:28
Question: I want to fit a Poisson distribution to my data points and decide, based on a chi-square test, whether I should accept or reject this proposed distribution. I only used 10 observations. Here is my code:

    # Fitting function:
    def Poisson_fit(x,a):
        return (a*np.exp(-x))

    # Code
    hist, bins= np.histogram(x, bins=10, density=True)
    print("hist: ",hist)
    #hist: [5.62657158e-01, 5.14254073e-01, 2.03161280e-01, 5.84898068e-02, 1.35995217e-02,2.67094169e-03,4.39345778e-04,6.59603327e-05,1.01518320e-05, 1

SQL Query for Chi-SQUARE TEST [duplicate]

China☆狼群 submitted on 2019-12-03 22:00:08
This question already has an answer here: SQL Server Query to find CHI-SQUARE Values (Not Working) (1 answer)

I am trying to run a chi-square test on the following set of data in the table. I am trying this query to compute it:

    SELECT sessionnumber, sessioncount, timespent,
           (dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected
    FROM (SELECT sessionnumber, SUM(cast(cnt as bigint)) as cnt
          FROM d3
          GROUP BY sessionnumber) dim1
    CROSS JOIN
         (SELECT sessioncount, SUM(cast(cnt as bigint)) as cnt
          FROM d3
          GROUP BY sessioncount) dim2
    CROSS JOIN
         (SELECT timespent, SUM(cast

Feature selection for multilabel classification (scikit-learn)

…衆ロ難τιáo~ submitted on 2019-12-03 20:30:36
I'm trying to do feature selection with the chi-square method in scikit-learn (sklearn.feature_selection.SelectKBest). When I try to apply this to a multilabel problem, I get this warning:

    UserWarning: Duplicate scores. Result may depend on feature ordering.There are probably duplicate features, or you used a classification score for a regression task.
      warn("Duplicate scores. Result may depend on feature ordering."

Why does it appear, and how do I properly apply feature selection in this case?

The code warns you that arbitrary tie-breaking may need to be performed because some features have

Chi-Squared Probability Function in C++

a 夏天 submitted on 2019-12-03 17:12:39
The following code of mine computes a confidence interval using the chi-squared 'quantile' and probability functions from Boost. I am trying to implement this function myself so as to avoid a dependency on Boost. Is there any resource where I can find such an implementation?

    #include <boost/math/distributions/chi_squared.hpp>
    #include <boost/cstdint.hpp>
    using namespace std;
    using boost::math::chi_squared;
    using boost::math::quantile;

    vector <double> ConfidenceInterval(double x) {
        vector <double> ConfInts;
        // x is an estimated value in which
        // we want to derive the confidence interval.
        chi_squared distl(2);

Chi-squared test of independence on all combinations of columns in a dataframe in R

两盒软妹~` submitted on 2019-12-03 16:36:05
This is my first time posting here and I hope this is all in the right place. I have been using R for basic statistical analysis for some time, but haven't really used it for anything computationally challenging, and I'm very much a beginner in the programming/data manipulation side of R. I have presence/absence (binary) data on 72 plant species in 323 plots in a single catchment. The dataframe is 323 rows, each representing a plot, with 72 columns, each representing a species. This is a sample of the first 4 columns (some row numbers are missing because the 323 plots are a subset of a larger

Python scipy chisquare returns different values than R chisquare

假如想象 submitted on 2019-12-03 15:52:34
I am trying to use scipy.stats.chisquare. I have built a toy example:

    In [1]: import scipy.stats as sps
    In [2]: import numpy as np
    In [3]: sps.chisquare(np.array([38,27,23,17,11,4]), np.array([98, 100, 80, 85,60,23]))
    Out[11]: (240.74951271813072, 5.302429887719704e-50)

The same example in R returns:

    > chisq.test(matrix(c(38,27,23,17,11,4,98,100,80,85,60,23), ncol=2))

        Pearson's Chi-squared test

    data:  matrix(c(38, 27, 23, 17, 11, 4, 98, 100, 80, 85, 60, 23), ncol = 2)
    X-squared = 7.0762, df = 5, p-value = 0.215

What am I doing wrong? Thanks

For this chisq.test call python equivalent is chi2
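The two calls answer different questions: scipy.stats.chisquare treats its second argument as expected frequencies (a goodness-of-fit test), while R's chisq.test on a two-column matrix is a test of independence on a contingency table. Assuming the truncated answer refers to scipy.stats.chi2_contingency, a sketch of the matching Python call looks like this:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same layout as R's matrix(..., ncol=2): six rows, one column per group.
table = np.array([
    [38, 98],
    [27, 100],
    [23, 80],
    [17, 85],
    [11, 60],
    [4, 23],
])

# Returns the statistic, p-value, degrees of freedom, and expected counts.
stat, p_value, dof, expected = chi2_contingency(table)
print(stat, dof, p_value)
```

This should agree with the R output quoted above (X-squared = 7.0762, df = 5, p-value = 0.215).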

Chi-squared goodness of fit test in R

假如想象 submitted on 2019-12-03 12:03:40
Question: I have a vector of observed values and also a vector of values calculated with a model:

    actual <- c(1411,439,214,100,62,38,29,64)
    expected <- c(1425.3,399.5,201.6,116.9,72.2,46.3,30.4,64.8)

Now I'm using the chi-squared goodness-of-fit test to see how well my model performs. I wrote the following:

    chisq.test(expected,actual)

but it doesn't work. Can you help me with this?

Answer 1: X^2 = 10.2 at 7 degrees of freedom will give you p ~ 0.18.

    > 1-pchisq(10.2, df = 7)
    [1] 0.1775201

You should pass on
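As a cross-check on the figures in the answer, the statistic can be computed by hand from its definition, the sum of (observed - expected)^2 / expected. The sketch below does this in Python rather than R. It assumes 7 degrees of freedom (number of bins minus one, with no model parameters estimated from the data), which is what the answer uses.

```python
import numpy as np
from scipy.stats import chi2

actual = np.array([1411, 439, 214, 100, 62, 38, 29, 64])
expected = np.array([1425.3, 399.5, 201.6, 116.9, 72.2, 46.3, 30.4, 64.8])

# Pearson chi-squared statistic: sum over bins of (O - E)^2 / E.
stat = np.sum((actual - expected) ** 2 / expected)

# Assumption: no fitted parameters, so dof = number of bins - 1.
dof = len(actual) - 1

# Upper-tail probability, the counterpart of R's 1 - pchisq(stat, df).
p_value = chi2.sf(stat, dof)
print(stat, dof, p_value)
```

The statistic comes out close to the 10.2 quoted in the answer, with a p-value near 0.18.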

P-value from Chi sq test statistic in Python

浪子不回头ぞ submitted on 2019-12-03 04:11:52
Question: I have computed a test statistic that is distributed as chi-squared with 1 degree of freedom, and I want to find out what p-value this corresponds to using Python. I'm a Python and maths/stats newbie, so I think what I want here is the probability density function for the chi2 distribution from SciPy. However, when I use this like so:

    from scipy import stats
    stats.chi2.pdf(3.84 , 1)
    0.029846

However, some googling and talking to some colleagues who know maths but not Python have said it should
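A p-value for a chi-squared statistic is an upper-tail probability, not a value of the density, so the survival function (one minus the CDF) is the usual tool. A minimal sketch with the same numbers as the question, using scipy.stats.chi2.sf:

```python
from scipy import stats

statistic = 3.84   # test statistic from the question
dof = 1            # degrees of freedom

# Height of the density curve at the statistic: this is not a probability.
density = stats.chi2.pdf(statistic, dof)

# Upper-tail probability P(X >= statistic): this is the p-value.
p_value = stats.chi2.sf(statistic, dof)

print(density, p_value)
```

With these inputs the density reproduces the 0.029846 from the question, while the survival function gives a p-value of roughly 0.05. Writing 1 - stats.chi2.cdf(statistic, dof) is equivalent, though sf keeps more precision for very small p-values.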