I\'ve created a Pandas DataFrame
df = DataFrame(index=[\'A\',\'B\',\'C\'], columns=[\'x\',\'y\'])
and got this
x y A NaN
The recommended way (according to the maintainers) to set a value is:
df.ix['x','C']=10
Using 'chained indexing' (df['x']['C']
) may lead to problems.
See:
set_value()
is deprecated.
Starting from the release 0.23.4, Pandas "announces the future"...
>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 245.0
2 Chevrolet Malibu 190.0
>>> df.set_value(2, 'Prices (U$)', 240.0)
__main__:1: FutureWarning: set_value is deprecated and will be removed in a future release.
Please use .at[] or .iat[] accessors instead
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 245.0
2 Chevrolet Malibu 240.0
Considering this advice, here's a demonstration of how to use them:
>>> df.iat[1, 1] = 260.0
>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 260.0
2 Chevrolet Malibu 240.0
>>> df.at[2, "Cars"] = "Chevrolet Corvette"
>>> df
Cars Prices (U$)
0 Audi TT 120.0
1 Lamborghini Aventador 260.0
2 Chevrolet Corvette 240.0
References:
I tested and the output is df.set_value
is little faster, but the official method df.at
looks like the fastest non deprecated way to do it.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(100, 100))
%timeit df.iat[50,50]=50 # ✓
%timeit df.at[50,50]=50 # ✔
%timeit df.set_value(50,50,50) # will deprecate
%timeit df.iloc[50,50]=50
%timeit df.loc[50,50]=50
7.06 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.52 µs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.68 µs ± 80.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
98.7 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
109 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Note this is setting the value for a single cell. For the vectors loc
and iloc
should be better options since they are vectorized.
You can also use a conditional lookup using .loc
as seen here:
df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>
where <some_column_name
is the column you want to check the <condition>
variable against and <another_column_name>
is the column you want to add to (can be a new column or one that already exists). <value_to_add>
is the value you want to add to that column/row.
This example doesn't work precisely with the question at hand, but it might be useful for someone wants to add a specific value based on a condition.
Soo, your question to convert NaN at ['x',C] to value 10
the answer is..
df['x'].loc['C':]=10
df
alternative code is
df.loc['C':'x']=10
df
From version 0.21.1 you can also use .at
method. There are some differences compared to .loc
as mentioned here - pandas .at versus .loc, but it's faster on single value replacement