missing-data | 易学教程

R- create new dataframe variable from subset of two variables with missing data NA

Question: I have a simple example data frame with two data columns (data1 and data2) and two grouping variables (Measure1 and Measure2). Measure1 and Measure2 contain missing values (NA).

    d <- data.frame(Measure1 = 1:2, Measure2 = 3:4, data1 = 1:10, data2 = 11:20)
    d$Measure1[4] = NA
    d$Measure2[8] = NA
    d

       Measure1 Measure2 data1 data2
    1         1        3     1    11
    2         2        4     2    12
    3         1        3     3    13
    4        NA        4     4    14
    5         1        3     5    15
    6         2        4     6    16
    7         1        3     7    17
    8         2       NA     8    18
    9         1        3     9    19
    10        2        4    10    20

I want to create a new variable (d$new) that contains data1, but only for rows

Visualisation of missing-data occurrence frequency by using seaborn

Question: I'd like to create a 24x20 matrix (8 sections, each with 60 cells, i.e. 6x10) to visualize how frequently missing data occurs across cycles (each cycle is 480 values) in a dataset held in a pandas DataFrame, and plot it for each of the columns 'A', 'B', and 'C'. So far I have been able to create the CSV files, map the values into the matrix correctly, and plot it via sns.heatmap(df.isnull()) after changing the missing data (nan and inf) into 0 or something like 0.01234, which has the least influence on the data, and in the
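
The counting step described in this excerpt can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `missing_frequency_matrix` is hypothetical, missing means None, NaN, or infinity, and the 480 positions of a cycle are laid out row by row into 24x20 (the excerpt does not say how the eight 6x10 sections are arranged).

```python
import math


def missing_frequency_matrix(values, cycle_len=480, rows=24, cols=20):
    """Count, per position within a cycle, how often the value is missing."""
    if rows * cols != cycle_len:
        raise ValueError("rows * cols must equal cycle_len")
    counts = [0] * cycle_len
    for i, v in enumerate(values):
        # Assumption: None, NaN and +/-inf all count as missing.
        if v is None or math.isnan(v) or math.isinf(v):
            counts[i % cycle_len] += 1
    # Assumed layout: row-major reshape of the cycle into rows x cols.
    return [counts[r * cols:(r + 1) * cols] for r in range(rows)]


# Two cycles: position 0 is missing in both, position 21 only in the second.
series = [1.0] * 960
series[0] = float("nan")
series[480] = None
series[480 + 21] = float("inf")
matrix = missing_frequency_matrix(series)
```

A rectangular list of lists like `matrix` is the kind of 2D input that `sns.heatmap` accepts, so one such matrix could be built per column.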

Replace missing values with mean (Weka)

Question: In Weka there is a filter called "ReplaceMissingValues" that replaces all missing values in a dataset with the mean of each attribute. I'd like to replace the missing values of a certain attribute using the mean of only those values that belong to a certain class. For example, in a binary dataset I think it is more correct to replace a missing value for an attribute in a record that belongs to the positive class using the mean calculated from only the records that belong to the positive class. So
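
The idea of class-conditional mean imputation can be sketched as follows. This is a plain Python illustration, not a Weka API: the function name `impute_by_class_mean`, the dict-based records, and the use of None for a missing value are all assumptions made for the example.

```python
from collections import defaultdict


def impute_by_class_mean(rows, attr, class_attr):
    """Replace missing `attr` values with the mean of `attr` within the same class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row[attr] is not None:
            sums[row[class_attr]] += row[attr]
            counts[row[class_attr]] += 1
    means = {label: sums[label] / counts[label] for label in counts}

    result = []
    for row in rows:
        new_row = dict(row)  # leave the input records untouched
        # A class with no observed values has no mean; its gaps stay None.
        if new_row[attr] is None and new_row[class_attr] in means:
            new_row[attr] = means[new_row[class_attr]]
        result.append(new_row)
    return result


data = [
    {"x": 2.0, "cls": "pos"},
    {"x": 4.0, "cls": "pos"},
    {"x": None, "cls": "pos"},
    {"x": 10.0, "cls": "neg"},
    {"x": None, "cls": "neg"},
]
imputed = impute_by_class_mean(data, "x", "cls")
```

Here the missing positive record receives the positive-class mean and the missing negative record receives the negative-class mean, instead of both receiving the overall mean.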

How to substitute several NA with values within the DF using if-else in R?

Question: Thank you for your time. I have the following data (snippet). It's from longitudinal data on work status, reshaped into a wide-format file: each column represents one month, each row an individual. Code:

    j1992_12 = c(1, 10, 1, 7, 1, 1)
    j1993_01 = c(1, 1, 1, NA, 3, 1)
    j1993_02 = c(1, 1, 1, NA, 3, 1)
    j1993_03 = c(1, 8, 1, NA, 3, 1)
    j1993_04 = c(1, 8, 1, NA, 3, 1)
    j1993_05 = c(1, 8, 1, NA, 3, 1)
    j1993_06 = c(1, 8, 1, NA, 3, 1)
    j1993_07 = c(1, 8, 1, NA, 3, 1)
    j1993_08 = c(1, 8, 1, NA, 3, 1)

Missing values for the data to be used in a Neural Network model for prediction

Question: I currently have a lot of data that will be used to train a prediction neural network (gigabytes of weather data for major airports around the US). I have data for almost every day, but some airports have missing values in their data. For example, an airport might not have existed before 1995, so I have no data before then for that specific location. Also, some are missing whole years (one might span from 1990 to 2011, missing 2003). What can I do to train with these missing values without

'NaTType' object has no attribute 'days'

Question: I have a column in my dataset which represents a date in ms, and sometimes its value is nan (actually my column is of type str and sometimes its value is 'nan'). I want to compute the age in days of this column. The problem is that when taking the difference of two dates, (pd.to_datetime('now') - pd.to_datetime(np.nan)).days, if one is nan it is converted to NaT, and the difference is of type NaTType, which doesn't have the attribute days. In my case I would like to have nan as a result. Other
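
One possible way to get nan instead of an AttributeError is sketched below, assuming pandas and NumPy are available. The helper names `days_between` and `days_since_ms` are hypothetical, and a fixed reference timestamp stands in for pd.to_datetime('now') so the example is reproducible.

```python
import numpy as np
import pandas as pd


def days_between(now, value):
    """Scalar case: return nan when the difference is NaT, else whole days."""
    delta = now - pd.to_datetime(value)
    return np.nan if pd.isna(delta) else delta.days


def days_since_ms(ms_strings, now):
    """Column case: strings holding epoch milliseconds, possibly 'nan'."""
    ms = pd.to_numeric(pd.Series(ms_strings), errors="coerce")  # 'nan' -> NaN
    dates = pd.to_datetime(ms, unit="ms")                       # NaN -> NaT
    # The .dt accessor yields NaN for NaT rows instead of raising.
    return (now - dates).dt.days


now = pd.Timestamp("2020-01-11")                 # assumed reference time
column = ["1577880000000", "nan"]                # 2020-01-01 12:00 UTC, missing
ages = days_since_ms(column, now)
scalar_missing = days_between(now, np.nan)
scalar_known = days_between(now, "2020-01-01")
```

The vectorised version avoids the problem entirely because the `.dt.days` accessor on a timedelta Series propagates missing values, while the scalar version checks for NaT before touching `.days`.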

Highcharts: Displaying Linechart with missing datapoints

Question: I am calculating the average value of properties for each week of the year, and I want to display this information in a line chart (the x-axis is the week of the year, the y-axis is the average value, and the different lines represent different properties). But for any given property I do not necessarily have a datapoint for each week of the year. If I do not have such a datapoint, I want the line for this property to interpolate between the datapoints I have. Has anyone else run into a similar issue? Answer 1:
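
The interpolation the question asks for can be illustrated outside the charting library. The sketch below is plain Python rather than Highcharts configuration; the function name `interpolate_weeks` and the dict-of-weeks input are assumptions for the example, and weeks outside the known range are left as None rather than extrapolated.

```python
def interpolate_weeks(points, first_week, last_week):
    """Fill gaps between known weekly values by linear interpolation.

    points maps week number -> average value for one property.
    """
    known = sorted(points)
    series = []
    for week in range(first_week, last_week + 1):
        if week in points:
            series.append((week, points[week]))
            continue
        before = [w for w in known if w < week]
        after = [w for w in known if w > week]
        if before and after:
            w0, w1 = before[-1], after[0]
            frac = (week - w0) / (w1 - w0)
            value = points[w0] + frac * (points[w1] - points[w0])
            series.append((week, value))
        else:
            # No neighbour on one side: leave the gap rather than extrapolate.
            series.append((week, None))
    return series


filled = interpolate_weeks({1: 10.0, 3: 20.0}, first_week=1, last_week=4)
```

On the charting side, Highcharts also has a `connectNulls` series option that draws the line straight across null points, which gives the same visual result without precomputing values.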

R: fill missing value with prior values [duplicate]

Question: This question already has answers here: Replacing NAs with latest non-NA value (15 answers). Closed 2 years ago. I have a dataframe that looks like this:

    d <- data.frame(county = c("Abilene", rep(NA, 5), "Cook", rep(NA, 4), "Blah", NA, "Allegheny", rep(NA, 3)))

          county
    1    Abilene
    2       <NA>
    3       <NA>
    4       <NA>
    5       <NA>
    6       <NA>
    7       Cook
    8       <NA>
    9       <NA>
    10      <NA>
    11      <NA>
    12      Blah
    13      <NA>
    14 Allegheny
    15      <NA>
    16      <NA>
    17      <NA>

I want to fill in the <NA> with the value of the previous non-missing county name. In other
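
The "carry the last observed value forward" logic can be sketched in a few lines of Python. The function name `fill_forward` is hypothetical and None stands in for R's NA; in R itself, `na.locf` from the zoo package is a commonly used tool for this.

```python
def fill_forward(values):
    """Replace each None with the most recent non-None value before it."""
    filled = []
    last = None  # leading gaps stay None because nothing precedes them
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


county = (["Abilene"] + [None] * 5 + ["Cook"] + [None] * 4
          + ["Blah", None, "Allegheny"] + [None] * 3)
county_filled = fill_forward(county)
```

With the example data, rows 1 to 6 become "Abilene", rows 7 to 11 "Cook", rows 12 and 13 "Blah", and rows 14 to 17 "Allegheny".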

How Can I Make Sure All My .CSV Data Gets Imported as NA instead of Blank in R?

Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 5 years ago. In the dataset I'm using, I have four assessments I'm trying to predict: 1 [Good] to 4 [Bad]. My model seems to be working, using the polr function to predict values with ordered logistic regression, though it's giving me the warning message "In cbind(race, partisanship, sex, age) : number of rows of result is not a multiple of vector length (arg 4)", because there are some

Crosstab query: Getting Null Data for Missing Data from Access DB

Question: I have data in an Access database which contains data for multiple days, but it sometimes has missing data for some dates. For example, I have data for:

    myDate      Location  Price
    11/1/2013   South     10
    11/1/2013   West      20
    11/1/2013   East      10
    11/2/2013   South     10
    11/2/2013   West      20
    11/2/2013   East      10
    11/4/2013   South     10    <---- 11/3/2013 data missing
    11/4/2013   West      30
    11/4/2013   East      10

The way I tried to solve it was to find the missing dates in the Access database and fill them with Null values using a calendar table. myDate
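
The calendar-table approach mentioned here amounts to pairing every date in the range with every location and looking up the price, leaving a null where nothing was recorded. The sketch below shows that logic in Python rather than Access SQL; the function name `fill_missing_dates` and the dict keyed by (date, location) are assumptions for the example.

```python
from datetime import date, timedelta


def fill_missing_dates(prices, locations):
    """Return one row per (date, location) over the full date range.

    prices maps (date, location) -> price; missing pairs yield None,
    which plays the role of Null in the database.
    """
    days = sorted({d for d, _ in prices})
    current, end = days[0], days[-1]
    rows = []
    while current <= end:  # acts as the calendar table
        for loc in locations:
            rows.append((current, loc, prices.get((current, loc))))
        current += timedelta(days=1)
    return rows


prices = {
    (date(2013, 11, 1), "South"): 10, (date(2013, 11, 1), "West"): 20,
    (date(2013, 11, 1), "East"): 10,
    (date(2013, 11, 2), "South"): 10, (date(2013, 11, 2), "West"): 20,
    (date(2013, 11, 2), "East"): 10,
    (date(2013, 11, 4), "South"): 10, (date(2013, 11, 4), "West"): 30,
    (date(2013, 11, 4), "East"): 10,
}
rows = fill_missing_dates(prices, ["South", "West", "East"])
gap_rows = [r for r in rows if r[0] == date(2013, 11, 3)]
```

In SQL terms this corresponds to joining a calendar table combined with the list of locations against the price table with an outer join, so that 11/3/2013 appears once per location with a Null price.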