Bayesian

pymc3 with custom likelihood function from kernel density estimation

时光总嘲笑我的痴心妄想 · Submitted on 2019-12-10 10:18:54
Question: I'm trying to use pymc3 with a likelihood function derived from some observed data. This observed data doesn't fit any nice, standard distribution, so I want to define my own distribution based on these observations. One approach is to use kernel density estimation over the observations. This was possible in pymc2, but it doesn't play nicely with the Theano variables in pymc3. In the code below I'm just generating some dummy data that is normally distributed. As my prior, I'm essentially assuming a uniform
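The excerpt cuts off, but the core idea can be sketched without PyMC3: estimate a log-density from the observations with a Gaussian KDE, which is what would then be wrapped as the custom likelihood (e.g. via pm.DensityDist or pm.Potential, with a Theano-compatible interpolation). A minimal numpy-only sketch; names and the bandwidth are illustrative:

```python
import numpy as np

def gaussian_kde_logpdf(x, obs, bandwidth=0.5):
    """Log-density of points x under a Gaussian KDE fitted to `obs`."""
    # one Gaussian kernel per observation, averaged over observations
    z = (x[:, None] - obs[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    dens = kernel.mean(axis=1) / bandwidth
    return np.log(dens)

rng = np.random.default_rng(0)
obs = rng.normal(loc=2.0, scale=1.0, size=500)  # dummy "observed" data
grid = np.array([0.0, 2.0, 4.0])
logp = gaussian_kde_logpdf(grid, obs)
# the estimated log-density should peak near the true mean of 2.0
```

In a PyMC3 model, this function (or scipy's `gaussian_kde`) would supply the `logp` of the custom distribution; the practical difficulty the asker hits is evaluating it on Theano tensors rather than numpy arrays.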

Naive Bayes: the within-class variance in each feature of TRAINING must be positive

妖精的绣舞 · Submitted on 2019-12-09 17:44:32
Question: When trying to fit a Naive Bayes model: training_data = sample; % target_class = K8; # train model nb = NaiveBayes.fit(training_data, target_class); # prediction y = nb.predict(cluster3); I get an error: ??? Error using ==> NaiveBayes.fit>gaussianFit at 535 The within-class variance in each feature of TRAINING must be positive. The within-class variance in feature 2 5 6 in class normal. are not positive. Error in ==> NaiveBayes.fit at 498 obj = gaussianFit(obj, training, gindex); Can anyone shed
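The MATLAB error means some features are constant within a class, so the per-class Gaussian variance is zero and the fit fails; the usual remedy is to find and drop those features before fitting. The diagnostic is the same in any toolbox; a sketch in Python/numpy with a made-up matrix (purely illustrative):

```python
import numpy as np

def zero_variance_features(X, y):
    """Return indices of features whose variance is zero within some class."""
    bad = set()
    for cls in np.unique(y):
        var = X[y == cls].var(axis=0)
        bad.update(np.flatnonzero(var == 0).tolist())
    return sorted(bad)

X = np.array([[1.0, 5.0, 3.0],
              [2.0, 5.0, 4.0],   # feature 1 is constant within class 0
              [3.0, 6.0, 5.0],
              [4.0, 7.0, 5.0]])  # feature 2 is constant within class 1
y = np.array([0, 0, 1, 1])
print(zero_variance_features(X, y))  # indices of features to drop before fitting
```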

NaiveBayes in R Cannot Predict - factor(0) Levels:

大城市里の小女人 · Submitted on 2019-12-08 23:32:24
Question: I have a dataset that looks like this: data.flu <- data.frame(chills = c(1,1,1,0,0,0,0,1), runnyNose = c(0,1,0,1,0,1,1,1), headache = c("M", "N", "S", "M", "N", "S", "S", "M"), fever = c(1,0,1,1,0,1,0,1), flu = c(0,1,1,1,0,1,0,1) ) > data.flu chills runnyNose headache fever flu 1 1 0 M 1 0 2 1 1 N 0 1 3 1 0 S 1 1 4 0 1 M 1 1 5 0 0 N 0 0 6 0 1 S 1 1 7 0 1 S 0 0 8 1 1 M 1 1 > str(data.flu) 'data.frame': 8 obs. of 5 variables: $ chills : num 1 1 1 0 0 0 0 1 $ runnyNose: num 0 1 0 1 0 1 1 1 $ headache
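The usual cause of a `factor(0)` prediction in R is that the numeric 0/1 predictors need to be converted to factors so the model treats them categorically. The mechanics of a categorical Naive Bayes on this exact toy flu data can be sketched with a small hand-rolled version in Python (Laplace smoothing; purely illustrative, not the R package's implementation):

```python
from collections import Counter, defaultdict

# the asker's toy flu data, as (chills, runnyNose, headache, fever, flu)
rows = [(1, 0, "M", 1, 0), (1, 1, "N", 0, 1), (1, 0, "S", 1, 1), (0, 1, "M", 1, 1),
        (0, 0, "N", 0, 0), (0, 1, "S", 1, 1), (0, 1, "S", 0, 0), (1, 1, "M", 1, 1)]

def fit(rows):
    prior = Counter(r[-1] for r in rows)            # class counts
    cond = defaultdict(Counter)                     # (feature, class) -> value counts
    for r in rows:
        for j, v in enumerate(r[:-1]):
            cond[(j, r[-1])][v] += 1
    return prior, cond

def predict(prior, cond, x, alpha=1.0):
    """Pick the class maximizing prior * product of smoothed conditionals."""
    n = sum(prior.values())
    best, best_p = None, -1.0
    for c, pc in prior.items():
        p = pc / n
        for j, v in enumerate(x):
            counts = cond[(j, c)]
            p *= (counts[v] + alpha) / (pc + alpha * len(set(counts) | {v}))
        if p > best_p:
            best, best_p = c, p
    return best

prior, cond = fit(rows)
pred = predict(prior, cond, (1, 0, "M", 1))  # classify a new symptom profile
```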

Multinomial distribution in PyMC

旧巷老猫 · Submitted on 2019-12-08 21:42:36
I am a newbie to pymc. I have read the required stuff on github and was doing fine until I got stuck with this problem. I want to make a collection of multinomial random variables which I can later sample using mcmc. But the best I can do is rv = [ Multinomial("rv", count[i], p_d[i]) for i in xrange(0, len(count)) ] for i in rv: print i.value i.random() for i in rv: print i.value But that's no good, since I want to be able to call rv.value and rv.random() , otherwise I won't be able to sample from it. count is a list of non-negative integers, each denoting the value of n for that distribution, e.g. a
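In PyMC2 the idiomatic answer is to wrap the list in a `pymc.Container` (or a numpy object array), which then supports `.value` and `.random()` collectively. The underlying sampling behaviour — one multinomial draw per (n, p) pair — can be sketched with plain numpy (values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
count = [10, 5, 8]                      # n for each multinomial
p_d = [np.array([0.2, 0.3, 0.5]),
       np.array([0.5, 0.5]),
       np.array([0.1, 0.9])]

# one draw per (n, p) pair, kept in a list so each can be resampled
rv = [rng.multinomial(n, p) for n, p in zip(count, p_d)]
# each draw is a count vector summing to its n
```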

ChoiceModelR - Hierarchical Bayes Multinomial Logit Model

心已入冬 · Submitted on 2019-12-08 13:55:26
I hope some of you have experience with the R package ChoiceModelR by Sermas and Colias for estimating a Hierarchical Bayes Multinomial Logit Model. Admittedly, I am quite a newbie at both R and Hierarchical Bayes. However, I tried to get some estimates by using the script provided by Sermas and Colias in the help file. I have a data set in the same structure as they use (ID, choice set, alternative, independent variables, and choice variable). I have four independent variables, all of them binary coded as categorical variables, none of them restricted. I have eight choice sets with
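At the core of this model, each alternative's choice probability within a choice set is a softmax of its utility (the multinomial logit). A minimal numpy sketch with assumed coefficients, illustrative only:

```python
import numpy as np

def choice_probs(X_alt, beta):
    """Multinomial logit: P(alt j) = exp(x_j @ beta) / sum_k exp(x_k @ beta)."""
    u = X_alt @ beta
    u = u - u.max()          # subtract the max to stabilize the exponentials
    e = np.exp(u)
    return e / e.sum()

# one choice set: 3 alternatives described by 4 binary attributes
X_alt = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 1, 0, 0]], dtype=float)
beta = np.array([0.8, -0.2, 0.5, 0.1])  # assumed part-worths, not estimates
p = choice_probs(X_alt, beta)
```

The hierarchical part of the model places a distribution over each respondent's beta; ChoiceModelR estimates those individual-level betas by MCMC.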

sklearn GaussianNB - bad results, [nan] probabilities

两盒软妹~` · Submitted on 2019-12-08 05:23:15
Question: I'm doing some work on gender classification for a class. I've been using SVMLight with decent results, but I wanted to try some Bayesian methods on my data as well. My dataset consists of text data, and I've done feature reduction to pare down the feature space to a more reasonable size for some of the Bayesian methods. All of the instances are run through tf-idf and then normalized (through my own code). I grabbed the sklearn toolkit because it was easy to integrate with my current codebase
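A common cause of poor results and nan probabilities from `GaussianNB` on tf-idf text features is near-zero within-class variances in a sparse, high-dimensional space; `MultinomialNB` is usually the better fit for text counts. Its class-score computation can be sketched in plain numpy (toy data, illustrative only):

```python
import numpy as np

def multinomial_nb_logprob(X, y, x_new, alpha=1.0):
    """Class log-posteriors (up to a shared constant) for one new count vector."""
    classes = np.unique(y)
    scores = []
    for c in classes:
        Xc = X[y == c]
        prior = np.log(Xc.shape[0] / X.shape[0])
        counts = Xc.sum(axis=0) + alpha          # Laplace-smoothed word counts
        loglik = np.log(counts / counts.sum())   # per-feature log P(word | class)
        scores.append(prior + x_new @ loglik)
    return classes, np.array(scores)

X = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])  # toy term counts
y = np.array([0, 0, 1, 1])
classes, scores = multinomial_nb_logprob(X, y, np.array([2, 0, 1]))
# the new document's vocabulary matches class 0, so class 0 should win
```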

Extract and add to the data frame the values of sigma from a stan distributional linear model

狂风中的少年 · Submitted on 2019-12-08 05:02:45
Question: Given the sample data sampleDT and the brms models brm.fit and brm.fit.distr below, I would like to estimate, extract, and add to the data frame the values of the standard deviations for each observation from the distributional model brm.fit.distr. I can do this using brm.fit, but my approach fails when I use brm.fit.distr. Sample data: sampleDT<-structure(list(id = 1:10, N = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L), A = c(62L, 96L, 17L, 41L, 212L, 143L, 143L, 143L, 73L, 73L), B
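In a distributional brms model, sigma gets its own linear predictor on the log scale, so the per-observation sigma is the exponential of that predictor. The concept, sketched in Python with made-up coefficients standing in for posterior draws from brm.fit.distr (the A column is taken from the sample data above):

```python
import numpy as np

# hypothetical posterior means for sigma's linear predictor: log(sigma) = b0 + b1 * A
b0_sigma, b1_sigma = -1.2, 0.004
A = np.array([62, 96, 17, 41, 212, 143, 143, 143, 73, 73], dtype=float)

sigma = np.exp(b0_sigma + b1_sigma * A)  # one positive sd per observation
# these per-observation sigmas are what would be appended to the data frame
```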

PyMC3 - Differences in ways observations are passed to model -> difference in results?

依然范特西╮ · Submitted on 2019-12-08 04:50:03
Question: I'm trying to understand whether there is any meaningful difference between the ways of passing data into a model - either aggregated or as single trials (note this is only a meaningful question for certain distributions, e.g. Binomial). Predicting p for a yes/no trial, using a simple model with a Binomial distribution. What is the difference in the computation/results of the following models (if any)? I choose the two extremes, either passing in a single trial at once (reducing to Bernoulli) or
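The two parameterizations give log-likelihoods that differ only by the log binomial coefficient, a constant that does not depend on p - so the posterior over p is identical either way. This can be checked numerically without PyMC3, using log-gamma for the factorials (values are illustrative):

```python
import math

n, k, p = 20, 7, 0.3   # 20 trials, 7 successes, some fixed p

# aggregated: Binomial(n, p) log-likelihood at k
log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1 - p))

# per-trial: sum of n Bernoulli log-likelihoods (k ones, n - k zeros)
log_bern = k * math.log(p) + (n - k) * math.log(1 - p)

# the difference is log C(n, k), independent of p
diff = log_binom - log_bern
```

The practical difference is computational: the aggregated form gives the sampler one observed node instead of n, which is usually faster.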

How to choose Gaussian basis functions hyperparameters for linear regression?

↘锁芯ラ · Submitted on 2019-12-08 01:26:28
Question: I'm quite new to the machine learning environment, and I'm trying to properly understand some basic concepts. My problem is the following: I have a set of data observations and the corresponding target values { x , t }. I'm trying to train a function with this data in order to predict the value of unobserved data, and I'm trying to achieve this by using the maximum a posteriori (MAP) technique (and so a Bayesian approach) with Gaussian basis functions of the form: φ_j(x) = exp(−(x − μ_j)^2 / 2
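Completing the basis as φ_j(x) = exp(−(x − μ_j)² / (2 s²)), the design matrix for MAP linear regression is one basis evaluation per (data point, centre) pair; the centres μ_j and width s are exactly the hyperparameters the question asks about. A numpy sketch where both are chosen by a common rule of thumb (evenly spaced centres, width on the order of the centre spacing — an assumption, not a prescription):

```python
import numpy as np

def gaussian_design_matrix(x, centers, s):
    """Phi[i, j] = exp(-(x_i - mu_j)^2 / (2 s^2)), with a leading bias column."""
    phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * s**2))
    return np.hstack([np.ones((x.size, 1)), phi])

x = np.linspace(0, 1, 5)          # training inputs
centers = np.linspace(0, 1, 3)    # mu_j: evenly spaced over the data range
s = 0.2                           # width, often tuned by cross-validation
Phi = gaussian_design_matrix(x, centers, s)
# Phi is then used in place of raw x in the (regularized) least-squares solution
```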

Bayesian Linear Regression with PyMC3 and a large dataset - bracket nesting level exceeded maximum and slow performance

守給你的承諾、 · Submitted on 2019-12-07 15:02:16
Question: I would like to use a Bayesian multivariate linear regression to estimate the strength of players in team sports (e.g. ice hockey, basketball, or soccer). For that purpose, I create a matrix, X, containing the players as columns and the matches as rows. For each match the player entry is either 1 (player plays in the home team), -1 (player plays in the away team), or 0 (player does not take part in this game). The dependent variable Y is defined as the scoring differences for both teams in each
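The encoding described above can be built directly; a numpy sketch with a tiny made-up league (player indices and match results are illustrative):

```python
import numpy as np

n_players = 4
matches = [  # (home player ids, away player ids, home_goals - away_goals)
    ([0, 1], [2, 3], 2),
    ([0, 2], [1, 3], -1),
    ([1, 3], [0, 2], 0),
]

X = np.zeros((len(matches), n_players))
Y = np.zeros(len(matches))
for i, (home, away, diff) in enumerate(matches):
    X[i, home] = 1     # home-team players get +1
    X[i, away] = -1    # away-team players get -1
    Y[i] = diff        # score difference for this match
```

X and Y in this form are what would then be fed to the PyMC3 regression; the bracket-nesting error the title mentions typically comes from building the model term-by-term in a Python loop rather than as one matrix expression like `pm.math.dot(X, beta)`.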