Bayesian

PyMC with observations on multiple variables

Submitted by 左心房为你撑大大i on 2021-01-28 08:25:50
Question: I'm using an example of linear regression from Bayesian Methods for Hackers but am having trouble extending it to my use case. I have observations on a random variable, an assumed distribution on that random variable, and finally another assumed distribution on top of it for which I also have observations. I have tried to model it with intermediate distributions on a and b, but PyMC complains: Wrong number of dimensions: expected 0, got 1 with shape (788,). To describe the actual model, …
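The question text is truncated before the model itself, but the error can be illustrated with a framework-independent NumPy sketch of the generative structure it describes. The names a, b, the linear form, and the noise scale are assumptions, not the asker's actual model:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 788  # matches the shape (788,) mentioned in the error message

# Hypothetical scalar latent variables, one draw from each prior
a = rng.normal(0.0, 1.0)   # slope, a scalar
b = rng.normal(0.0, 1.0)   # intercept, a scalar

x = rng.uniform(-1.0, 1.0, size=n)   # one covariate value per observation
y = rng.normal(a * x + b, 0.5)       # 788 observations from the likelihood

# "Wrong number of dimensions: expected 0, got 1 with shape (788,)" typically
# means a length-788 array was handed to a parameter that must be a scalar;
# the observed vector belongs on the likelihood term, while a and b stay scalar.
print(y.shape)
```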

Gaussian Process Regression: standard deviation meaning

Submitted by 不想你离开。 on 2021-01-28 05:58:20
Question: In the following code for Gaussian Process Regression (GPR): from sklearn.datasets import make_friedman2; from sklearn.gaussian_process import GaussianProcessRegressor; from sklearn.gaussian_process.kernels import DotProduct, WhiteKernel; X, y = make_friedman2(n_samples=500, noise=0, random_state=0); kernel = DotProduct() + WhiteKernel(); gpr = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y); print(gpr.score(X, y)); print(gpr.predict(X[:2,:], return_std=True)) — what is the meaning of the standard deviation returned when return_std=True? …
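What that standard deviation means is easiest to see by computing it from scratch. The sketch below implements the standard exact-GP posterior equations (not sklearn's internals verbatim, and with an RBF kernel rather than the question's DotProduct + WhiteKernel): the returned std is the pointwise standard deviation of the posterior predictive distribution, small near training inputs and reverting to the prior far from them.

```python
import numpy as np

def rbf(a, b, length=1.0):
    # squared-exponential kernel matrix between two 1-D input sets
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    # exact GP regression; returns predictive mean and std at x_test
    K = rbf(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v * v, axis=0), 0.0, None)  # k(x*,x*) = 1 for RBF
    return mean, np.sqrt(var)

x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
mean, std = gp_posterior(x_train, y_train, np.array([1.0, 10.0]))
# std[0] (at a training point) is small; std[1] (far away) approaches the
# prior standard deviation of 1, reflecting high predictive uncertainty there.
```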

Achieving More with Less: A 2020 Survey of Deep Active Learning

Submitted by 只愿长相守 on 2021-01-20 11:25:04
This article is reposted from Zhihu and is a Chinese-language introduction to the recent paper A Survey of Deep Active Learning, in which the authors review 189 papers on the progress of deep active learning. The article is long; consider bookmarking it before reading. https://zhuanlan.zhihu.com/p/218011458 Paper: https://arxiv.org/pdf/2009.00236.pdf Abstract: Active learning attempts to maximize a model's performance gain while labeling as few samples as possible. Deep learning, by contrast, is greedy for data: it needs large amounts of data to optimize its massive number of parameters so that the model learns to extract high-quality features. In recent years, the rapid development of Internet technology has put us in an era of information flood, with massive amounts of unlabeled data at hand. Deep learning has accordingly attracted strong interest from researchers and developed rapidly. Compared with deep learning, research interest in active learning has been relatively low, mainly because, before the rise of deep learning, traditional machine learning required relatively few labeled samples, so early active learning struggled to demonstrate its value. Although deep learning has achieved breakthroughs in many fields, the vast majority of these successes owe to the public availability of large labeled datasets. However, acquiring large, high-quality labeled datasets consumes enormous manpower …

Datasets: A Summary of Crowd Behavior Recognition Databases

Submitted by 偶尔善良 on 2021-01-07 07:31:14
Reference paper: "Research Progress on Crowd Anomaly Recognition Technology" by Wei Yongchao (魏永超). Data is the foundation of crowd behavior recognition research. To make such work easier, research institutions have successively collected crowd abnormal-behavior data, built databases, and released them publicly, which has to some extent advanced crowd behavior research. These databases provide important reference points for behavior recognition. The representative crowd behavior databases are summarized below.

(1) UCSD (University of California, San Diego) anomaly detection dataset [32]. Created by UC San Diego, the data was captured by a camera mounted at a height, looking down on a pedestrian walkway, and records abnormal behavior occurring naturally. The anomalies fall into two categories: non-pedestrian entities entering the walkway and anomalous pedestrian motion. Anomaly types include cyclists, skaters, small carts, pedestrians crossing the walkway, and intrusions onto the grass; a few instances of people in wheelchairs are also recorded. The data consists of 98 videos split into two subsets for different scenes, with each scene's footage divided into clips of about 200 frames. This dataset mainly targets recognition of individual behavior within a crowd.

(2) UMN (University of Minnesota) dataset [33]. Created by the University of Minnesota, it consists of 11 videos containing both normal and abnormal footage. Each video begins with normal behavior, followed by an abnormal sequence. Crowd anomalies mainly include the crowd running in a single direction and the crowd scattering. The abnormal behavior in these videos is staged. This dataset targets recognition of overall crowd behavior.

(3 …

Production of a BSTS Mean Absolute Percentage Error (MAPE) Plot from a Bayesian Time Series Analysis with MCMC using ggplot() and bsts() packages

Submitted by 岁酱吖の on 2021-01-07 03:11:54
Question: I have a data frame called FID (see below) that contains two columns for Year & Month, and Sighting_Frequency (counts of birds). The data frame holds 3 years of observations between 2015-2017, i.e. 36 months of data. I have run a Bayesian time series analysis with MCMC using the bsts() function in the bsts package (see the R code below), following the tutorial below. I want to produce a holdout Mean Absolute Percentage Error (MAPE) plot as shown in the diagram below …
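The question's code is in R (bsts/ggplot), but the holdout MAPE statistic itself is framework-independent and can be sketched in a few lines of Python. The holdout and forecast values below are hypothetical stand-ins, not the asker's FID data:

```python
def mape(actual, predicted):
    # mean absolute percentage error over a holdout window, in percent
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# e.g. the last 6 of the 36 months held out (hypothetical bird counts)
holdout  = [12, 15,  9, 20, 18, 11]   # observed sighting frequencies
forecast = [10, 16, 10, 18, 20, 12]   # posterior predictive means from the model
print(round(mape(holdout, forecast), 2))
```

Plotting this value against the forecast horizon (as in the tutorial's diagram) is then just a matter of computing mape() on successively longer holdout windows.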

Exact Inference in Graphical Models

Submitted by 不羁的心 on 2021-01-05 23:51:36
Independence

Statistical independence: two random variables X, Y are statistically independent if and only if their joint distribution equals the product of their marginals: $$ X \perp Y \leftrightarrow P(X,Y)=P(X)P(Y) $$ Exercise: suppose $X \perp Y$ and $Y \perp Z$; does $X \perp Z$ follow? Example: Dad eats, Obama eats, Mom eats.

Conditional independence: two random variables X, Y are conditionally independent given Z if and only if their conditional joint distribution equals the product of the conditional marginals: $$ X \perp Y \mid Z \leftrightarrow P(X, Y \mid Z)=P(X \mid Z)P(Y \mid Z) $$ If knowing Z alone is enough to determine X, then Y and X are (conditionally) independent given Z.

Probabilistic Graphical Models. Preliminaries: for D K-ary random variables $X_1, \dots, X_D$ with $X_i \in \{1, \dots, K\}$. Marginal: $$ P(X_{1:i-1,\ i+1:D})=\sum_{X_i}P(X_{1:D}) $$ Chain rule for the joint distribution: $$ P(X_{1:D})=P(X_1)\prod_{i=2}^{D}P(X_i \mid X_{1:i-1}) $$
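The two definitions above can be verified numerically on a toy distribution. The sketch below builds a joint over three binary variables via the chain rule so that X and Y are conditionally independent given Z by construction, then checks that the conditional factorization holds while marginal independence fails (the specific probability tables are arbitrary choices for illustration):

```python
import numpy as np

# P(x,y,z) = P(z) P(x|z) P(y|z) for binary X, Y, Z
pz = np.array([0.5, 0.5])
px_z = np.array([[0.8, 0.2],   # P(x | z=0)
                 [0.2, 0.8]])  # P(x | z=1)
py_z = np.array([[0.7, 0.3],
                 [0.3, 0.7]])
joint = np.einsum('z,zx,zy->xyz', pz, px_z, py_z)   # joint[x, y, z]

# Definition check: P(x, y | z) == P(x | z) P(y | z)
p_z = joint.sum(axis=(0, 1))
p_xy_z = joint / p_z                                 # conditional joint given z
cond_indep = np.allclose(p_xy_z, px_z.T[:, None, :] * py_z.T[None, :, :])

# ...but X and Y are NOT marginally independent: P(x, y) != P(x) P(y)
p_xy = joint.sum(axis=2)
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
marg_indep = np.allclose(p_xy, np.outer(p_x, p_y))
print(cond_indep, marg_indep)  # True False
```

This is exactly the situation exploited by graphical models: the common cause Z induces dependence between X and Y that disappears once Z is observed.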

Measuring uncertainty using MC Dropout in PyTorch

Submitted by 穿精又带淫゛_ on 2020-12-06 16:01:47
Question: I am trying to implement a Bayesian CNN using MC Dropout in PyTorch. The main idea is that by keeping dropout active at test time and running many forward passes, you get predictions from a variety of different models. I found an application of MC Dropout, but I did not really understand how the method was applied or how exactly the correct prediction is chosen from the list of predictions. Here is the code: def mcdropout_test(model): model.train() test_loss = 0 correct = 0 T = 100 for …
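The quoted code is cut off, but the mechanism it implements can be sketched framework-independently. The NumPy toy below uses random untrained weights as a stand-in for a trained network (an assumption for illustration); the MC Dropout part is the loop: dropout stays on at prediction time, the T softmax outputs are averaged to get the prediction, and their spread measures uncertainty:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 16))   # stand-ins for trained weights
W2 = rng.normal(size=(16, 3))

def stochastic_forward(x, drop_p=0.5):
    # dropout stays ON at test time -- the core idea of MC Dropout
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) > drop_p      # fresh random mask each pass
    h = h * mask / (1.0 - drop_p)            # inverted-dropout scaling
    logits = h @ W2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True) # softmax probabilities

x = rng.normal(size=(1, 4))
T = 100                                      # number of stochastic passes
probs = np.stack([stochastic_forward(x) for _ in range(T)])

mean_pred = probs.mean(axis=0)     # averaged predictive distribution
uncertainty = probs.std(axis=0)    # spread across passes = model uncertainty
pred_class = int(mean_pred.argmax())
```

In PyTorch, calling model.train() before the test loop serves the same purpose as the mask above: it keeps the Dropout layers stochastic, and the "correct" prediction is the argmax of the averaged probabilities, not of any single pass.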

Updating model on PyMC3 with new observed data

Submitted by 浪子不回头ぞ on 2020-12-04 09:01:59
Question: I measured the diameter of 80 fruits last year, and after checking which distribution fits the values best, I created a PyMC3 model: with Model() as diam_model: mu = Normal('mu', mu=57, sd=5.42) sigma = Uniform('sigma', 0, 10) Then, as far as I understand, I "trained" the model with my prior data (the 80 values): with diam_model: dist = Normal('dist', mu=mu, sd=sigma, observed=prior_data.values) with diam_model: samples = fit().sample(1000) Then I used plot_posterior on the samples …
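The "update with new data" workflow the question is after — last year's posterior becoming this year's prior — can be sketched analytically for the simplest case: a Normal mean with a known noise standard deviation. This is a hedged stand-in for what fit()/sample() estimate numerically, not the asker's actual PyMC3 model (the noise sd of 5.0 and the simulated diameters are assumptions):

```python
import random

def update_normal_mean(prior_mu, prior_sd, data, noise_sd):
    # conjugate Normal-Normal update for the mean, known noise sd:
    # precisions add, and the posterior mean is a precision-weighted average
    n = len(data)
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / noise_sd ** 2
    post_prec = prior_prec + data_prec
    post_mu = (prior_prec * prior_mu + data_prec * (sum(data) / n)) / post_prec
    return post_mu, post_prec ** -0.5

random.seed(1)
last_year = [random.gauss(57, 5) for _ in range(80)]  # hypothetical diameters
this_year = [random.gauss(58, 5) for _ in range(40)]  # hypothetical new batch

mu0, sd0 = 57.0, 5.42                                  # prior from the question
mu1, sd1 = update_normal_mean(mu0, sd0, last_year, 5.0)
mu2, sd2 = update_normal_mean(mu1, sd1, this_year, 5.0)  # posterior as new prior
```

Each update shrinks the standard deviation, which is exactly the behavior to expect when feeding a PyMC3 posterior back in as the prior for the next season's data.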
