missing-data

Zeros as missing cases in R

╄→гoц情女王★ posted on 2019-12-11 07:42:29
Question: I have a CSV with millions of cases that look like this:

    Case_1,11,17481,172,4436,8,4436
    Case_2,11,1221,680,55200,1776,55200
    Case_3,16,6647,6449,579967,1,579967
    Case_4,22,0,0,0,0,0

Here, Case_4 is missing data, since it has a bunch of zeros in it (there are hundreds of such cases in the file). I'm very new to R, and I was wondering whether there is an efficient way to delete this kind of missing data from the file. Thanks.

Answer 1: Use the na.strings argument when reading in your file. df <-
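The question is about R, but the same idea can be sketched in Python with pandas. The sketch below drops every case whose measurement columns are all zero. It assumes, which the question does not state, that the file has no header row and that the first number after the case name is not a measurement (Case_4 has 22 there). The helper name drop_all_zero_cases is hypothetical.

```python
import io

import pandas as pd

# The four sample rows from the question, used here in place of the real file.
csv_text = """Case_1,11,17481,172,4436,8,4436
Case_2,11,1221,680,55200,1776,55200
Case_3,16,6647,6449,579967,1,579967
Case_4,22,0,0,0,0,0
"""


def drop_all_zero_cases(df, value_cols):
    """Return df without the rows whose value_cols are all equal to zero."""
    all_zero = (df[value_cols] == 0).all(axis=1)
    return df.loc[~all_zero].reset_index(drop=True)


# header=None because the sample has no header row (an assumption).
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Assumption: columns 2 onward hold the measurements; column 1 does not.
value_cols = list(df.columns[2:])
clean = drop_all_zero_cases(df, value_cols)
print(clean)
```

For a file with millions of rows, the boolean mask is computed column-wise, so no explicit loop over rows is needed.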

Julia: creating a method for Any vector with missing values

血红的双手。 posted on 2019-12-11 07:05:17
Question: I would like to create a function that deals with missing values. However, when I tried to specify the missing type Array{Missing, 1}, it throws an error.

    function f(x::Array{<:Number, 1})
        # do something complicated
        println("no missings.")
        println(sum(x))
    end

    function f(x::Array{Missing, 1})
        x = collect(skipmissing(x))
        # do something complicated
        println("removed missings.")
        f(x)
    end

    f([2, 3, 5])
    f([2, 3, 5, missing])

I understand that my type is not Missing but Array{Union{Missing, Int64},1} When I

How do I detect and re-insert missing data?

ε祈祈猫儿з posted on 2019-12-11 06:07:29
Question: I have a missing row in a data table which describes a function from time, sid, and s.c to count:

    > dates.dt[1001:1011]
            sid   s.c  count                time
     1: missing CLICK 104192 2013-05-25 10:00:00
     2: missing SHARE   7694 2013-05-25 10:00:00
     3: present CLICK  99573 2013-05-25 10:00:00
     4: present SHARE  89302 2013-05-25 10:00:00
     5: missing CLICK     28 2013-05-25 11:00:00
     6: present CLICK     25 2013-05-25 11:00:00
     7: present SHARE     15 2013-05-25 11:00:00
     8: missing CLICK 104544 2013-05-25 12:00:00
     9: missing SHARE

Is it possible to plot a pandas DataFrame that includes missing data as a heatmap, with a special color for the missing values?

不想你离开。 posted on 2019-12-11 05:58:54
Question: I was wondering whether I can plot all columns of a pandas DataFrame in one window as heatmaps. Each column is mapped onto a self-made 24x20 matrix, which holds the 480 values that make up one cycle, and this is repeated for every cycle. The challenging point is that I want to show missing data in a special color that lies outside the color range of the colormap cmap = 'coolwarm'. I have already tried df = df.replace([np.inf, -np.inf], np.nan) to make sure that all inf convert to
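One possible approach, sketched here under assumptions: reshape each 480-value cycle into a 24x20 grid, mask the invalid entries, and give the colormap a dedicated "bad" color with Matplotlib's Colormap.set_bad. The column names, the random data, the choice of black as the missing-data color, and the helper names cycle_to_grid and missing_aware_cmap are all hypothetical. The question does not show how the cycles are laid out, so a simple row-major reshape is assumed.

```python
import copy

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def cycle_to_grid(values, rows=24, cols=20):
    """Reshape one cycle (rows * cols values) into a masked 2-D grid."""
    arr = np.asarray(values, dtype=float)
    if arr.size != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {arr.size}")
    # masked_invalid hides NaN and +/-inf so the colormap treats them as "bad".
    return np.ma.masked_invalid(arr.reshape(rows, cols))


def missing_aware_cmap(name="coolwarm", missing_color="black"):
    """Copy a named colormap and assign a color to masked (missing) cells."""
    cmap = copy.copy(plt.get_cmap(name))
    cmap.set_bad(missing_color)
    return cmap


# Hypothetical data: two columns, one 480-value cycle each, with a few gaps.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.normal(size=480), "B": rng.normal(size=480)})
df.loc[[5, 100, 479], "A"] = np.nan
df.loc[7, "B"] = np.inf
df = df.replace([np.inf, -np.inf], np.nan)  # same cleanup as in the question

cmap = missing_aware_cmap()
fig, axes = plt.subplots(1, len(df.columns), figsize=(8, 4))
for ax, col in zip(np.atleast_1d(axes), df.columns):
    ax.imshow(cycle_to_grid(df[col].to_numpy()), cmap=cmap, aspect="auto")
    ax.set_title(col)
fig.savefig("heatmaps.png")
plt.close(fig)
```

Because the missing-data color is set on the colormap rather than encoded as a data value, it does not shift the color scale of the valid cells.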

How to replace NAs with row means if proportion of row-wise NAs is below a certain threshold?

天涯浪子 posted on 2019-12-11 05:58:44
Question: Apologies for the somewhat cumbersome question, but I am currently working on a mental health study. For one of the mental health screening tools there are 15 variables, each of which can have values of 0-3. The total score for each row/participant is then assigned by taking the sum of these 15 variables. The documentation for this tool states that if more than 20% of the values for a particular row/participant are missing, the total score should be taken as missing also, however if fewer
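The question concerns R, and the excerpt is cut off before the rule for the "fewer than 20%" case. Going by the title, the following pandas sketch assumes that missing items are replaced by the participant's own row mean when at most 20% of the items are missing, and that the total is set to missing otherwise. The column names q1..q15, the sample values, and the function name are hypothetical.

```python
import numpy as np
import pandas as pd


def total_with_row_mean_imputation(items, max_missing_frac=0.2):
    """Sum each row after filling gaps with the row mean.

    Rows whose share of missing items exceeds max_missing_frac get NaN.
    """
    frac_missing = items.isna().mean(axis=1)
    row_means = items.mean(axis=1)  # NaN values are skipped by default
    # fillna aligns row_means on the row index, so each gap gets its row's mean.
    filled = items.apply(lambda col: col.fillna(row_means))
    total = filled.sum(axis=1)
    total[frac_missing > max_missing_frac] = np.nan
    return total


cols = [f"q{i}" for i in range(1, 16)]  # hypothetical item names
items = pd.DataFrame(
    [
        [1] * 15,                 # complete row
        [2] * 13 + [np.nan] * 2,  # 2 of 15 missing (about 13%): imputed
        [3] * 11 + [np.nan] * 4,  # 4 of 15 missing (about 27%): total missing
    ],
    columns=cols,
)
totals = total_with_row_mean_imputation(items)
print(totals)
```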

Missing params in Ajax Post request in Laravel

守給你的承諾、 posted on 2019-12-11 04:53:46
Question: I am trying to make an Ajax POST request and pass params to use them in a query, but my params are always empty. Here is my code:

    $.ajaxSetup({
        headers: {
            'X-CSRF-TOKEN': $('meta[name="csrf-token"]').attr('content')
        }
    });

    function searchPatient(){
        var params = {
            'name' : $("#input-search-name").val(),
            'lastname' : $("#input-search-lastname").val()
        }
        console.log($('meta[name="csrf-token"]').attr('content'));
        $.ajax({
            data : params,
            url : '{{ route("searchPatient") }}',
            contentType:

Progression of non-missing values that have missing values in-between

我是研究僧i posted on 2019-12-11 04:19:53
Question: To continue a previous topic, "Finding non-missing values between missing values": I would also like to find whether the value before the missing value is smaller than, equal to, or larger than the one after the missing value. To use the same example as before:

    df = structure(list(FirstYStage = c(NA, 3.2, 3.1, NA, NA, 2, 1, 3.2, 3.1, 1, 2, 5, 2, NA, NA, NA, NA, 2, 3.1, 1),
                        SecondYStage = c(NA, 3.1, 3.1, NA, NA, 2, 1, 4, 3.1, 1, NA, 5, 3.1, 3.2, 2, 3.1, NA, 2, 3.1, 1),
                        ThirdYStage = c(NA, NA, 3.1, NA,

How to get measures of model fit (AIC, F-statistics) in zelig for multiply imputed data?

三世轮回 posted on 2019-12-11 03:45:08
Question: Following up on an earlier post, I am interested in learning how to get the usual measures of the relative quality of a statistical model in zelig for regression using multiply imputed data (created with Amelia).

    require(Zelig)
    require(Amelia)
    data(freetrade)

    # Imputation of missing data
    a.out <- amelia(freetrade, m=5, ts="year", cs="country")

    # Regression model
    z.out <- zelig(polity~tariff+gdp.pc, model="ls", data=a.out$imputations)
    summary(z.out)

    Model: ls
    Number of multiply imputed data

Pandas: filling missing values iterating through a groupby object

为君一笑 posted on 2019-12-11 02:42:02
Question: I have the following dataset:

    d = {'player': ['1', '1', '1', '1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3'],
         'session': ['a', 'a', 'b', np.nan, 'b', 'c', 'c', 'c', 'c', 'd', 'd', 'e', 'e', np.nan, 'e', 'f', 'f', 'g', np.nan, 'g'],
         'date': ['2018-01-01 00:19:05', '2018-01-01 00:21:07', '2018-01-01 00:22:07', '2018-01-01 00:22:15', '2018-01-01 00:25:09', '2018-01-01 00:25:11', '2018-01-01 00:27:28', '2018-01-01 00:29:29', '2018-01-01 00:30:35', '2018-01-01 00
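The excerpt ends before the desired fill rule is stated, so the following is only a sketch under an assumption: each missing session is taken from the nearest earlier row of the same player, falling back to the next later row if the gap is at the start of a group. It uses only the player and session columns, since the date list is truncated in the excerpt. The column name session_filled is hypothetical.

```python
import numpy as np
import pandas as pd

# 'player' and 'session' as given in the question; 'date' is omitted because
# the excerpt cuts it off.
d = {
    "player": ["1"] * 9 + ["2"] * 6 + ["3"] * 5,
    "session": ["a", "a", "b", np.nan, "b", "c", "c", "c", "c",
                "d", "d", "e", "e", np.nan, "e",
                "f", "f", "g", np.nan, "g"],
}
df = pd.DataFrame(d)

# Fill within each player's rows only, so values never leak across players.
# Assumed rule: forward fill first, then backward fill any leading gaps.
df["session_filled"] = (
    df.groupby("player")["session"].transform(lambda s: s.ffill().bfill())
)
print(df[df["session"].isna()])
```

Using transform keeps the result aligned with the original row order, which avoids an explicit loop over the groupby object and a later concatenation.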

Backfill missing data with a label for a window of time

早过忘川 posted on 2019-12-11 00:42:12
Question: I want to backfill each column based on a time window (1 day, 2 days), with a different label for each window. Here is the code:

    from datetime import datetime, timedelta
    import pandas as pd
    import numpy as np
    import random

    np.random.seed(11)
    date_today = datetime.now()
    ndays = 15
    df = pd.DataFrame({'date': [date_today + timedelta(days=x) for x in range(ndays)],
                       'test': pd.Series(np.random.randn(ndays)),
                       'test2': pd.Series(np.random.randn(ndays))})
    df = df.set_index('date')
    df = df.mask(np.random.random(df.shape) < .7)
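The excerpt does not say exactly what the labels should look like, so this sketch rests on one reading of the question: a missing value that lies one day before the next observation gets one label, a missing value two days before it gets another, and anything further back stays missing. The label strings and the function name are hypothetical. Note that bfill(limit=...) counts rows, so it only corresponds to days when the index is a regular daily series, as in the question's setup.

```python
import numpy as np
import pandas as pd


def label_backfill_windows(s, labels=("filled_1d", "filled_2d")):
    """Replace NaNs reachable by a k-row backfill with the k-th label."""
    out = s.astype(object)
    already = s.notna()
    for limit, label in enumerate(labels, start=1):
        # Cells that become non-missing when backfilling at most `limit` rows.
        reachable = s.bfill(limit=limit).notna()
        out[reachable & ~already] = label
        already = reachable
    return out


# Small daily series with gaps of different lengths (hypothetical data).
idx = pd.date_range("2019-01-01", periods=6, freq="D")
s = pd.Series([np.nan, np.nan, np.nan, 1.0, np.nan, 2.0], index=idx)
labelled = label_backfill_windows(s)
print(labelled)
```

To label every column of a DataFrame this way, the function could be passed to df.apply, which calls it once per column.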