I have some data I'm trying to organize into a DataFrame in Pandas. I was trying to make each row a Series and append it to the Da
Try using this command. See the example given below:
df.loc[len(df)] = ['Product 9',99,9.99,8.88,1.11]
df
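For a self-contained version of the same idea, here is a minimal sketch. The column names and the first row are made up for illustration, since the original frame is not shown. One caveat: len(df) is only the next free label while the index is the default 0..n-1, so this is probably not what you want after rows have been dropped or with a custom index.

```python
import pandas as pd

# Hypothetical columns and starting row; the original frame is not shown.
df = pd.DataFrame(
    [["Product 8", 88, 8.88, 7.77, 1.11]],
    columns=["name", "qty", "price", "cost", "margin"],
)

# With a default 0..n-1 index, len(df) is the first unused row label,
# so assigning to it enlarges the frame by one row.
df.loc[len(df)] = ["Product 9", 99, 9.99, 8.88, 1.11]
print(df)
```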
DataFrame.append does not modify the DataFrame in place; it returns a new DataFrame. You need to do df = df.append(...) if you want the original variable to hold the appended result.
Convert the series to a dataframe and transpose it, then append normally.
srs = srs.to_frame().T
df = df.append(srs)
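Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on a current install the same idea can be written with pd.concat. A minimal sketch, assuming a small two-column frame (the data is made up for illustration):

```python
import pandas as pd

# Illustrative data; not from the original question.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
srs = pd.Series([5, 6], index=["A", "B"])

# to_frame() gives a single-column frame; .T flips it into a single row.
row = srs.to_frame().T

# pd.concat returns a new DataFrame, so the result is assigned back.
df = pd.concat([df, row], ignore_index=True)
print(df)
```

Like append, pd.concat leaves its inputs untouched, so the reassignment is still needed.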
This would work as well:
df = pd.DataFrame()
new_line = pd.Series({'A2M': 4.059, 'A2ML1': 4.28}, name='HCC1419')
df = df.append(new_line, ignore_index=False)
The name in the Series will be the index in the dataframe. ignore_index=False is the important flag in this case.
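To see the effect of the name and of ignore_index side by side, here is a rough sketch written with pd.concat so it also runs on pandas 2.x, where DataFrame.append no longer exists. The second sample and its values are hypothetical.

```python
import pandas as pd

rows = [
    pd.Series({"A2M": 4.059, "A2ML1": 4.28}, name="HCC1419"),
    # Hypothetical second sample, added only to show two labels.
    pd.Series({"A2M": 3.5, "A2ML1": 2.0}, name="SAMPLE_2"),
]

# Each Series becomes a one-row frame labeled by its name.
frames = [s.to_frame().T for s in rows]

# Default (ignore_index=False): the Series names are kept as row labels.
labeled = pd.concat(frames)

# ignore_index=True: the labels are replaced by 0, 1, ...
relabeled = pd.concat(frames, ignore_index=True)

print(labeled)
print(relabeled)
```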
Something like this could work...
mydf.loc['newindex'] = myseries
Here is an example where I used it...
stats = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].describe()
stats
Out[32]:
          bp_prob   ICD9_prob   meds_prob  regex_prob
count  171.000000  171.000000  171.000000  171.000000
mean     0.179946    0.059071    0.067020    0.126812
std      0.271546    0.142681    0.152560    0.207014
min      0.000000    0.000000    0.000000    0.000000
25%      0.000000    0.000000    0.000000    0.000000
50%      0.000000    0.000000    0.000000    0.013116
75%      0.309019    0.065248    0.066667    0.192954
max      1.000000    1.000000    1.000000    1.000000
medians = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].median()
stats.loc['median'] = medians
stats
Out[36]:
           bp_prob   ICD9_prob   meds_prob  regex_prob
count   171.000000  171.000000  171.000000  171.000000
mean      0.179946    0.059071    0.067020    0.126812
std       0.271546    0.142681    0.152560    0.207014
min       0.000000    0.000000    0.000000    0.000000
25%       0.000000    0.000000    0.000000    0.000000
50%       0.000000    0.000000    0.000000    0.013116
75%       0.309019    0.065248    0.066667    0.192954
max       1.000000    1.000000    1.000000    1.000000
median    0.000000    0.000000    0.000000    0.013116
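Since the df behind that output is not shown, here is a small runnable sketch of the same pattern on made-up data. The column names x and y are placeholders.

```python
import pandas as pd

# Small made-up frame standing in for the probability columns above.
df = pd.DataFrame({"x": [1.0, 2.0, 6.0], "y": [10.0, 20.0, 30.0]})

stats = df.describe()

# median() returns a Series indexed by column name, which matches
# stats.columns, so it can be assigned as a new labeled row.
stats.loc["median"] = df.median()
print(stats)
```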
Maybe an easier way would be to add the pandas.Series to the pandas.DataFrame by passing the ignore_index=True argument to DataFrame.append(). Example -
DF = pd.DataFrame()
for sample, data in D_sample_data.items():
    # An unnamed Series can only be appended with ignore_index=True.
    SR_row = pd.Series(data.D_key_value)
    DF = DF.append(SR_row, ignore_index=True)
Demo -
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([[1,2],[3,4]],columns=['A','B'])
In [3]: df
Out[3]:
   A  B
0  1  2
1  3  4
In [5]: s = pd.Series([5,6],index=['A','B'])
In [6]: s
Out[6]:
A    5
B    6
dtype: int64
In [36]: df.append(s,ignore_index=True)
Out[36]:
   A  B
0  1  2
1  3  4
2  5  6
Another issue in your code is that DataFrame.append() is not in-place. It returns the appended dataframe, so you need to assign the result back to your original dataframe for it to work. Example -
DF = DF.append(SR_row,ignore_index=True)
To preserve the labels, you can keep your approach of giving each series a name, and also assign the appended DataFrame back to DF. Example -
DF = pd.DataFrame()
for sample, data in D_sample_data.items():
    # The Series name becomes the row label in DF.
    SR_row = pd.Series(data.D_key_value, name=sample)
    DF = DF.append(SR_row)
DF.head()
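If you are on pandas 2.x, where DataFrame.append has been removed, one possible alternative is to collect the named Series in a list and build the frame once at the end. This is only a sketch: sample_data below is a hypothetical plain dict standing in for D_sample_data and its D_key_value attribute, which are not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for D_sample_data: sample label -> {key: value}.
sample_data = {
    "sample_a": {"A2M": 4.059, "A2ML1": 4.28},
    "sample_b": {"A2M": 3.10, "A2ML1": 2.75},
}

# One named Series per sample; the name becomes the row label.
rows = [pd.Series(values, name=sample) for sample, values in sample_data.items()]

# Build the frame once instead of growing it inside the loop.
DF = pd.DataFrame(rows)
print(DF.head())
```

Building the frame in one step also avoids copying the whole DataFrame on every iteration, which is what repeated appends end up doing.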