fillna

Pandas fillna throws ValueError: fill value must be in categories

不打扰是莪最后的温柔 submitted on 2020-07-04 13:11:57
Question: Description: both features have a categorical dtype. The same code worked fine in a different kernel on the same dataset; the only difference is that the features there were float64. I later converted these features to Categorical because all the features in the dataset represent categories. Below is the code:
    AM_train['product_category_2'].fillna('Unknown', inplace=True)
    AM_train['city_development_index'].fillna('Missing', inplace=True)
Answer 1: Use Series.cat.add_categories for
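The answer excerpt is cut off, so the following is only a minimal sketch of the general idea it names: register the fill label as a category with Series.cat.add_categories before calling fillna. The series `s` is a hypothetical stand-in for one categorical column of AM_train.

```python
import pandas as pd

# Hypothetical stand-in for one categorical column of AM_train.
s = pd.Series([1.0, 2.0, None, 2.0]).astype("category")

# Register the fill label as a category first, then fill with it.
filled = s.cat.add_categories("Unknown").fillna("Unknown")
print(filled.tolist())
```

Without the add_categories step, fillna rejects 'Unknown' because it is not one of the column's existing categories.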

Pandas fillna throws ValueError: fill value must be in categories

独自空忆成欢 submitted on 2020-07-04 13:07:19
Question: Description: both features have a categorical dtype. The same code worked fine in a different kernel on the same dataset; the only difference is that the features there were float64. I later converted these features to Categorical because all the features in the dataset represent categories. Below is the code:
    AM_train['product_category_2'].fillna('Unknown', inplace=True)
    AM_train['city_development_index'].fillna('Missing', inplace=True)
Answer 1: Use Series.cat.add_categories for

Pandas fillna with an incremented value

荒凉一梦 submitted on 2020-06-17 04:58:26
Question: I have a dataframe with a column of sequential but not adjacent numbers and missing values. I'd like to use the fillna function to fill in the missing values with an incremented value from the previous non-missing row. Here's a simplified table:
    index  my_counter
    0      1
    1      2
    2      NaN
    3      3
    4      NaN
    5      NaN
    6      8
I'd like to fill in my_counter as such:
    index  my_counter
    0      1
    1      2
    2      2.1
    3      3
    4      3.1
    5      3.2
    6      8
How can I accomplish this task?
Answer 1: IIUC, use ffill with groupby cumcount:
    df.my_counter.ffill()+df.groupby(df.my
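The answer's code is truncated, so this is a sketch of one way the ffill plus groupby/cumcount idea can be completed, not necessarily the answerer's exact expression. The grouping key (a running count of non-missing values) and the division by 10 are assumptions chosen to reproduce the desired table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"my_counter": [1, 2, np.nan, 3, np.nan, np.nan, 8]})

# Each non-missing value starts a new group; the NaNs that follow share it.
group_id = df["my_counter"].notna().cumsum()

# cumcount() numbers rows within a group 0, 1, 2, ...; scale to 0.0, 0.1, 0.2.
step = df.groupby(group_id).cumcount() / 10

# Forward-fill the last known value and add the per-group increment.
df["filled"] = df["my_counter"].ffill() + step
print(df)
```

This only behaves as intended while a gap holds fewer than ten consecutive missing rows; a longer gap would push the increment to 1.0 or more.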

specifying a limit for fillna has not been implemented yet

删除回忆录丶 submitted on 2020-01-06 05:25:15
Question: I want to apply fillna to a pandas DataFrame with method='bfill' and a limit:
    labeled_features = final_feat.merge(failures, on=['datetime', 'machineID'], how='left')
    print(type(labeled_features))
    labeled_features = labeled_features.bfill(limit=7)  # fill backward up to 24h
    labeled_features = labeled_features.fillna('none')
    labeled_features.head()
but I get the following error:
    <class 'pandas.core.frame.DataFrame'>
    -------------------------------------------------------------
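The traceback is cut off, so the cause here is an assumption: some pandas versions have raised this error when a limit is passed while filling a categorical column. The sketch below shows bfill with a limit on a numeric column, and a possible workaround for a categorical one that round-trips through object dtype. Both series are hypothetical and not taken from the question's data.

```python
import numpy as np
import pandas as pd

# Numeric column: bfill with a limit fills at most `limit` consecutive NaNs.
num = pd.Series([np.nan, np.nan, np.nan, 1.0])
num_filled = num.bfill(limit=2)

# Hypothetical categorical column; convert to object, fill, convert back.
cat = pd.Series([None, None, "fail"], dtype="category")
cat_filled = cat.astype(object).bfill(limit=1).astype("category")

print(num_filled.tolist())
print(cat_filled.tolist())
```

Whether the round trip is needed depends on the installed pandas version and on which columns of labeled_features are actually categorical.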

variable fillna() in each column

主宰稳场 submitted on 2019-12-23 21:30:08
Question: For starters, here is some artificial data fitting my problem:
    df = pd.DataFrame(np.random.randint(0, 100, size=(vsize, 10)),
                      columns=["col_{}".format(x) for x in range(10)],
                      index=range(0, vsize * 3, 3))
    df_2 = pd.DataFrame(np.random.randint(0, 100, size=(vsize, 10)),
                        columns=["col_{}".format(x) for x in range(10, 20, 1)],
                        index=range(0, vsize * 2, 2))
    df = df.merge(df_2, left_index=True, right_index=True, how='outer')
    df_tar = pd.DataFrame({"tar_1": [np.random.randint(0, 2) for
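The question text is truncated before it states the actual requirement, so this sketch addresses only one plausible reading of the title: giving each column its own fill value. DataFrame.fillna accepts a dict keyed by column name. The small frame and the chosen fill values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column names follow the "col_N" pattern from the question.
df = pd.DataFrame({"col_0": [1.0, np.nan, 3.0], "col_10": [np.nan, 5.0, np.nan]})

# A dict maps each column name to its own fill value.
filled = df.fillna({"col_0": 0.0, "col_10": df["col_10"].mean()})
print(filled)
```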

need to fill the NA values with the past three values before na values in python

让人想犯罪 __ submitted on 2019-12-13 02:14:22
Question: I need to fill each NA value with the mean of the three values preceding it. This is my dataset:
        RECEIPT_MONTH_YEAR  NET_SALES
    0   2014-01-01          818817.20
    1   2014-02-01          362377.20
    2   2014-03-01          374644.60
    3   2014-04-01          NA
    4   2014-05-01          NA
    5   2014-06-01          NA
    6   2014-07-01          NA
    7   2014-08-01          46382.50
    8   2014-09-01          55933.70
    9   2014-10-01          292303.40
    10  2014-10-01          382928.60
Answer 1: Is this dataset a .csv file or a DataFrame? Is the NA a real NaN or a string?
    import pandas as pd
    import numpy as np
    df=pd.read_csv('your dataset',sep=
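The answer is cut off, so here is a minimal sketch of one interpretation: walk the column in order and replace each missing entry with the mean of the three entries before it, where earlier fills count toward later ones. That reuse of filled values is an assumption, and the short `sales` series is hypothetical rather than the question's data.

```python
import numpy as np
import pandas as pd

sales = pd.Series([3.0, 6.0, 9.0, np.nan, np.nan, 12.0])

# Work on a copy so the original series is left untouched.
values = sales.to_numpy(copy=True)
for i in range(len(values)):
    if np.isnan(values[i]) and i >= 3:
        # Mean of the three preceding entries, including earlier fills.
        values[i] = values[i - 3:i].mean()

filled = pd.Series(values, index=sales.index)
print(filled.tolist())
```

If NA arrives as a string rather than a real NaN, the column would first need converting, for example with pd.to_numeric(..., errors='coerce').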

filling NAN and converting to int pandas

不打扰是莪最后的温柔 submitted on 2019-12-11 17:35:31
Question: I have a dataframe of integers. Preview (starts from 3 because the first 3 rows were removed): The original data in the 'pixel1' column is int, but the NaN there forced it to float. I tried to fix it with:
    X_train.fillna(method='ffill', inplace=True)
    X_train = X_train.astype(int)
    print(X_train.head())
which results in: Can I get the datatype of the value that fillna is using? Is there a better way to do this? (Better = skipping the astype step, as the data is int originally; I planted the NaN in the file
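One option worth considering, offered as a sketch rather than as the accepted answer, is pandas' nullable "Int64" dtype, which can hold integers and missing values in the same column. The tiny frame below is a hypothetical stand-in for X_train.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'pixel1' column; the NaN makes it float64.
X_train = pd.DataFrame({"pixel1": [0.0, np.nan, 7.0]})

# Nullable Int64 holds integers and missing values together,
# so the forward fill keeps an integer dtype.
as_int = X_train["pixel1"].astype("Int64").ffill()
print(as_int.dtype, as_int.tolist())
```

This still involves one astype call, but the column stays integer-typed both before and after the fill.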

Filling Missing values Pandas Dataframe by specific value

假如想象 submitted on 2019-12-11 08:45:54
Question: I have a dataset and I want to fill the missing data in the column 'value' using bfill, with a string added to the filled value. Here is the code that I have:
    import pandas as pd
    import numpy as np
    df = pd.DataFrame(
        {
            'category': ['X', 'X', 'X', 'X', 'X', 'X', 'Y', 'Y', 'Y'],
            'name': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
            'other_value': [10, np.nan, np.nan, 20, 30, 10, 30, np.nan, 30],
            'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
        }
    )
    print(df)
    def fillValue(g):
        gNotNull = g.dropna()
        wtAvg = str

Pandas fillna method does not work inplace

心已入冬 submitted on 2019-12-11 04:26:20
Question: I have a dataframe problem_data which has NaN values in some cells. I ran the following code:
    problem_data[problem_data['level_type'] == 5.0]
It produced this:
        problem_id  level_type  points  tags
    5   prob_1479   5.0         NaN     NaN
    31  prob_2092   5.0         NaN     NaN
    38  prob_4395   5.0         NaN     combinatorics,constructive algorithms,dfs
    43  prob_5653   5.0         NaN     NaN
    48  prob_2735   5.0         2750.0  NaN
    52  prob_1054   5.0         2000.0  combinatorics,dp
    64  prob_2610   5.0         NaN     NaN
    65  prob_1649   5.0         NaN     NaN
    70  prob_4675   5.0         NaN     dp,games
    74  prob_445 5
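The excerpt ends before the failing call, so the following rests on an assumption: a common reason an in-place fillna appears to do nothing is that it runs on a filtered selection such as problem_data[mask], which is a copy. A sketch of writing the result back through .loc instead, using a hypothetical miniature of problem_data and a hypothetical fill value:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of problem_data.
problem_data = pd.DataFrame({
    "level_type": [5.0, 5.0, 3.0],
    "points": [np.nan, 2750.0, np.nan],
})

mask = problem_data["level_type"] == 5.0

# Assign through .loc so the write lands on problem_data itself,
# not on a temporary copy produced by boolean filtering.
problem_data.loc[mask, "points"] = problem_data.loc[mask, "points"].fillna(2000.0)
print(problem_data)
```

Rows outside the mask keep their NaN, which makes it easy to check that only the intended subset was filled.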

pandas replace only part of a column

痴心易碎 submitted on 2019-12-10 21:04:46
Question: Here is my input:
    import pandas as pd
    import numpy as np
    list1 = [10,79,6,38,4,557,12,220,46,22,45,22]
    list2 = [4,3,23,6,234,47,312,2,426,42,435,23]
    df = pd.DataFrame({'A' : list1, 'B' : list2}, columns = ['A', 'B'])
    df['C'] = np.where (df['A'] > df['B'].shift(-2), 1, np.nan)
    print (df)
which produces this output:
          A    B    C
    0    10    4  NaN
    1    79    3  1.0
    2     6   23  NaN
    3    38    6  NaN
    4     4  234  NaN
    5   557   47  1.0
    6    12  312  NaN
    7   220    2  1.0
    8    46  426  NaN
    9    22   42  NaN
    10   45  435  NaN
    11   22   23  NaN
What I need to do is to change