Bayesian

pymc3 with custom likelihood function from kernel density estimation

时光总嘲笑我的痴心妄想 · Submitted on 2019-12-10 10:18:54
Question: I'm trying to use pymc3 with a likelihood function derived from some observed data. This observed data doesn't fit any nice, standard distribution, so I want to define my own distribution based on these observations. One approach is to use kernel density estimation over the observations. This was possible in pymc2, but it doesn't play nicely with the Theano variables in pymc3. In the code below I'm just generating some dummy data that is normally distributed. As my prior, I'm essentially assuming a uniform
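The excerpt cuts off, but the core idea can be sketched without PyMC3: estimate a log-density from the observations with a Gaussian KDE, which is what would then be wrapped as the custom likelihood (e.g. via pm.DensityDist or pm.Potential, with a Theano-compatible interpolation). A minimal numpy-only sketch; names and the bandwidth are illustrative:

```python
import numpy as np

def gaussian_kde_logpdf(x, obs, bandwidth=0.5):
    """Log-density of points x under a Gaussian KDE fitted to `obs`."""
    # one Gaussian kernel per observation, averaged over observations
    z = (x[:, None] - obs[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    dens = kernel.mean(axis=1) / bandwidth
    return np.log(dens)

rng = np.random.default_rng(0)
obs = rng.normal(loc=2.0, scale=1.0, size=500)  # dummy "observed" data
grid = np.array([0.0, 2.0, 4.0])
logp = gaussian_kde_logpdf(grid, obs)
# the estimated log-density should peak near the true mean of 2.0
```

In a PyMC3 model, this function (or scipy's `gaussian_kde`) would supply the `logp` of the custom distribution; the practical difficulty the asker hits is evaluating it on Theano tensors rather than numpy arrays.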

Naive Bayes: the within-class variance in each feature of TRAINING must be positive

妖精的绣舞 · Submitted on 2019-12-09 17:44:32
Question: When trying to fit a Naive Bayes model: training_data = sample; % target_class = K8; # train model nb = NaiveBayes.fit(training_data, target_class); # prediction y = nb.predict(cluster3); I get an error: ??? Error using ==> NaiveBayes.fit>gaussianFit at 535 The within-class variance in each feature of TRAINING must be positive. The within-class variance in feature 2 5 6 in class normal. are not positive. Error in ==> NaiveBayes.fit at 498 obj = gaussianFit(obj, training, gindex); Can anyone shed
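The MATLAB error means some features are constant within a class, so the per-class Gaussian variance is zero and the fit fails; the usual remedy is to find and drop those features before fitting. The diagnostic is the same in any toolbox; a sketch in Python/numpy with a made-up matrix (purely illustrative):

```python
import numpy as np

def zero_variance_features(X, y):
    """Return indices of features whose variance is zero within some class."""
    bad = set()
    for cls in np.unique(y):
        var = X[y == cls].var(axis=0)
        bad.update(np.flatnonzero(var == 0).tolist())
    return sorted(bad)

X = np.array([[1.0, 5.0, 3.0],
              [2.0, 5.0, 4.0],   # feature 1 is constant within class 0
              [3.0, 6.0, 5.0],
              [4.0, 7.0, 5.0]])  # feature 2 is constant within class 1
y = np.array([0, 0, 1, 1])
print(zero_variance_features(X, y))  # indices of features to drop before fitting
```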

NaiveBayes in R Cannot Predict - factor(0) Levels:

大城市里の小女人 · Submitted on 2019-12-08 23:32:24
Question: I have a dataset that looks like this: data.flu <- data.frame(chills = c(1,1,1,0,0,0,0,1), runnyNose = c(0,1,0,1,0,1,1,1), headache = c("M", "N", "S", "M", "N", "S", "S", "M"), fever = c(1,0,1,1,0,1,0,1), flu = c(0,1,1,1,0,1,0,1) ) > data.flu chills runnyNose headache fever flu 1 1 0 M 1 0 2 1 1 N 0 1 3 1 0 S 1 1 4 0 1 M 1 1 5 0 0 N 0 0 6 0 1 S 1 1 7 0 1 S 0 0 8 1 1 M 1 1 > str(data.flu) 'data.frame': 8 obs. of 5 variables: $ chills : num 1 1 1 0 0 0 0 1 $ runnyNose: num 0 1 0 1 0 1 1 1 $ headache
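The usual cause of a `factor(0)` prediction in R is that the numeric 0/1 predictors need to be converted to factors so the model treats them categorically. The mechanics of a categorical Naive Bayes on this exact toy flu data can be sketched with a small hand-rolled version in Python (Laplace smoothing; purely illustrative, not the R package's implementation):

```python
from collections import Counter, defaultdict

# the asker's toy flu data, as (chills, runnyNose, headache, fever, flu)
rows = [(1, 0, "M", 1, 0), (1, 1, "N", 0, 1), (1, 0, "S", 1, 1), (0, 1, "M", 1, 1),
        (0, 0, "N", 0, 0), (0, 1, "S", 1, 1), (0, 1, "S", 0, 0), (1, 1, "M", 1, 1)]

def fit(rows):
    prior = Counter(r[-1] for r in rows)            # class counts
    cond = defaultdict(Counter)                     # (feature, class) -> value counts
    for r in rows:
        for j, v in enumerate(r[:-1]):
            cond[(j, r[-1])][v] += 1
    return prior, cond

def predict(prior, cond, x, alpha=1.0):
    """Pick the class maximizing prior * product of smoothed conditionals."""
    n = sum(prior.values())
    best, best_p = None, -1.0
    for c, pc in prior.items():
        p = pc / n
        for j, v in enumerate(x):
            counts = cond[(j, c)]
            p *= (counts[v] + alpha) / (pc + alpha * len(set(counts) | {v}))
        if p > best_p:
            best, best_p = c, p
    return best

prior, cond = fit(rows)
pred = predict(prior, cond, (1, 0, "M", 1))  # classify a new symptom profile
```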

Multinomial distribution in PyMC

旧巷老猫 · Submitted on 2019-12-08 21:42:36
I am a newbie to pymc. I have read the required stuff on github and was doing fine until I got stuck with this problem. I want to make a collection of multinomial random variables which I can later sample using mcmc. But the best I can do is rv = [ Multinomial("rv", count[i], p_d[i]) for i in xrange(0, len(count)) ] for i in rv: print i.value i.random() for i in rv: print i.value But that's no good, since I want to be able to call rv.value and rv.random() , otherwise I won't be able to sample from it. count is a list of non-negative integers, each denoting the value of n for that distribution, e.g. a
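In PyMC2 the idiomatic answer is to wrap the list in a `pymc.Container` (or a numpy object array), which then supports `.value` and `.random()` collectively. The underlying sampling behaviour — one multinomial draw per (n, p) pair — can be sketched with plain numpy (values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
count = [10, 5, 8]                      # n for each multinomial
p_d = [np.array([0.2, 0.3, 0.5]),
       np.array([0.5, 0.5]),
       np.array([0.1, 0.9])]

# one draw per (n, p) pair, kept in a list so each can be resampled
rv = [rng.multinomial(n, p) for n, p in zip(count, p_d)]
# each draw is a count vector summing to its n
```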

ChoiceModelR - Hierarchical Bayes Multinomial Logit Model

心已入冬 · Submitted on 2019-12-08 13:55:26
I hope some of you have experience with the R package ChoiceModelR by Sermas and Colias for estimating a Hierarchical Bayes Multinomial Logit Model. Admittedly, I am quite a newbie at both R and Hierarchical Bayes. However, I tried to get some estimates by using the script provided by Sermas and Colias in the help file. I have a data set in the same structure as they use (ID, choice set, alternative, independent variables, and choice variable). I have four independent variables, all of them binary coded as categorical variables, none of them restricted. I have eight choice sets with
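At the core of this model, each alternative's choice probability within a choice set is a softmax of its utility (the multinomial logit). A minimal numpy sketch with assumed coefficients, illustrative only:

```python
import numpy as np

def choice_probs(X_alt, beta):
    """Multinomial logit: P(alt j) = exp(x_j @ beta) / sum_k exp(x_k @ beta)."""
    u = X_alt @ beta
    u = u - u.max()          # subtract the max to stabilize the exponentials
    e = np.exp(u)
    return e / e.sum()

# one choice set: 3 alternatives described by 4 binary attributes
X_alt = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 1, 0, 0]], dtype=float)
beta = np.array([0.8, -0.2, 0.5, 0.1])  # assumed part-worths, not estimates
p = choice_probs(X_alt, beta)
```

The hierarchical part of the model places a distribution over each respondent's beta; ChoiceModelR estimates those individual-level betas by MCMC.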

sklearn GaussianNB - bad results, [nan] probabilities

两盒软妹~` · Submitted on 2019-12-08 05:23:15
Question: I'm doing some work on gender classification for a class. I've been using SVMLight with decent results, but I wanted to try some Bayesian methods on my data as well. My dataset consists of text data, and I've done feature reduction to pare down the feature space to a more reasonable size for some of the Bayesian methods. All of the instances are run through tf-idf and then normalized (through my own code). I grabbed the sklearn toolkit because it was easy to integrate with my current codebase
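A common cause of poor results and nan probabilities from `GaussianNB` on tf-idf text features is near-zero within-class variances in a sparse, high-dimensional space; `MultinomialNB` is usually the better fit for text counts. Its class-score computation can be sketched in plain numpy (toy data, illustrative only):

```python
import numpy as np

def multinomial_nb_logprob(X, y, x_new, alpha=1.0):
    """Class log-posteriors (up to a shared constant) for one new count vector."""
    classes = np.unique(y)
    scores = []
    for c in classes:
        Xc = X[y == c]
        prior = np.log(Xc.shape[0] / X.shape[0])
        counts = Xc.sum(axis=0) + alpha          # Laplace-smoothed word counts
        loglik = np.log(counts / counts.sum())   # per-feature log P(word | class)
        scores.append(prior + x_new @ loglik)
    return classes, np.array(scores)

X = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])  # toy term counts
y = np.array([0, 0, 1, 1])
classes, scores = multinomial_nb_logprob(X, y, np.array([2, 0, 1]))
# the new document's vocabulary matches class 0, so class 0 should win
```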

Extract and add to the data frame the values of sigma from a stan distributional linear model

狂风中的少年 · Submitted on 2019-12-08 05:02:45
Question: Given the sample data sampleDT and the brms models brm.fit and brm.fit.distr below, I would like to estimate, extract, and add to the data frame the values of the standard deviations for each observation from the distributional model brm.fit.distr. I can do this using brm.fit, but my approach fails when I use brm.fit.distr. Sample data: sampleDT<-structure(list(id = 1:10, N = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L), A = c(62L, 96L, 17L, 41L, 212L, 143L, 143L, 143L, 73L, 73L), B
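In a distributional brms model, sigma gets its own linear predictor on the log scale, so the per-observation sigma is the exponential of that predictor. The concept, sketched in Python with made-up coefficients standing in for posterior draws from brm.fit.distr (the A column is taken from the sample data above):

```python
import numpy as np

# hypothetical posterior means for sigma's linear predictor: log(sigma) = b0 + b1 * A
b0_sigma, b1_sigma = -1.2, 0.004
A = np.array([62, 96, 17, 41, 212, 143, 143, 143, 73, 73], dtype=float)

sigma = np.exp(b0_sigma + b1_sigma * A)  # one positive sd per observation
# these per-observation sigmas are what would be appended to the data frame
```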

PyMC3 - Differences in ways observations are passed to model -> difference in results?

依然范特西╮ · Submitted on 2019-12-08 04:50:03
Question: I'm trying to understand whether there is any meaningful difference between the ways of passing data into a model - either aggregated or as single trials (note this is only a meaningful question for certain distributions, e.g. Binomial). Predicting p for a yes/no trial, using a simple model with a Binomial distribution. What is the difference in the computation/results of the following models (if any)? I choose the two extremes, either passing in a single trial at once (reducing to Bernoulli) or
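The two parameterizations give log-likelihoods that differ only by the log binomial coefficient, a constant that does not depend on p - so the posterior over p is identical either way. This can be checked numerically without PyMC3, using log-gamma for the factorials (values are illustrative):

```python
import math

n, k, p = 20, 7, 0.3   # 20 trials, 7 successes, some fixed p

# aggregated: Binomial(n, p) log-likelihood at k
log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1 - p))

# per-trial: sum of n Bernoulli log-likelihoods (k ones, n - k zeros)
log_bern = k * math.log(p) + (n - k) * math.log(1 - p)

# the difference is log C(n, k), independent of p
diff = log_binom - log_bern
```

The practical difference is computational: the aggregated form gives the sampler one observed node instead of n, which is usually faster.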

How to choose Gaussian basis functions hyperparameters for linear regression?

↘锁芯ラ · Submitted on 2019-12-08 01:26:28
Question: I'm quite new to the machine learning environment, and I'm trying to properly understand some basic concepts. My problem is the following: I have a set of data observations and the corresponding target values { x , t }. I'm trying to train a function with this data in order to predict the value of unobserved data, and I'm trying to achieve this by using the maximum a posteriori (MAP) technique (and so a Bayesian approach) with Gaussian basis functions of the form: φ_j(x) = exp(−(x − μ_j)^2 / 2
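Completing the basis as φ_j(x) = exp(−(x − μ_j)² / (2 s²)), the design matrix for MAP linear regression is one basis evaluation per (data point, centre) pair; the centres μ_j and width s are exactly the hyperparameters the question asks about. A numpy sketch where both are chosen by a common rule of thumb (evenly spaced centres, width on the order of the centre spacing — an assumption, not a prescription):

```python
import numpy as np

def gaussian_design_matrix(x, centers, s):
    """Phi[i, j] = exp(-(x_i - mu_j)^2 / (2 s^2)), with a leading bias column."""
    phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * s**2))
    return np.hstack([np.ones((x.size, 1)), phi])

x = np.linspace(0, 1, 5)          # training inputs
centers = np.linspace(0, 1, 3)    # mu_j: evenly spaced over the data range
s = 0.2                           # width, often tuned by cross-validation
Phi = gaussian_design_matrix(x, centers, s)
# Phi is then used in place of raw x in the (regularized) least-squares solution
```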

Bayesian Linear Regression with PyMC3 and a large dataset - bracket nesting level exceeded maximum and slow performance

守給你的承諾、 · Submitted on 2019-12-07 15:02:16
Question: I would like to use a Bayesian multivariate linear regression to estimate the strength of players in team sports (e.g. ice hockey, basketball, or soccer). For that purpose, I create a matrix, X, containing the players as columns and the matches as rows. For each match the player entry is either 1 (player plays in the home team), -1 (player plays in the away team), or 0 (player does not take part in this game). The dependent variable Y is defined as the scoring differences for both teams in each
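The encoding described above can be built directly; a numpy sketch with a tiny made-up league (player indices and match results are illustrative):

```python
import numpy as np

n_players = 4
matches = [  # (home player ids, away player ids, home_goals - away_goals)
    ([0, 1], [2, 3], 2),
    ([0, 2], [1, 3], -1),
    ([1, 3], [0, 2], 0),
]

X = np.zeros((len(matches), n_players))
Y = np.zeros(len(matches))
for i, (home, away, diff) in enumerate(matches):
    X[i, home] = 1     # home-team players get +1
    X[i, away] = -1    # away-team players get -1
    Y[i] = diff        # score difference for this match
```

X and Y in this form are what would then be fed to the PyMC3 regression; the bracket-nesting error the title mentions typically comes from building the model term-by-term in a Python loop rather than as one matrix expression like `pm.math.dot(X, beta)`.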