Set value for particular cell in pandas DataFrame using index

野趣味 2020-11-22 05:45

I've created a Pandas DataFrame

df = DataFrame(index=['A','B','C'], columns=['x','y'])

and got this

    x    y
A  NaN         


        
20 Answers
  • 2020-11-22 06:39

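    The way recommended by the maintainers when this was written was .ix, with the row label first and the column label second (df.ix['C','x']=10). .ix has since been deprecated and was removed in pandas 1.0, so use .loc instead:

    df.loc['C','x']=10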
    

    Using 'chained indexing' (df['x']['C']) may lead to problems, because the assignment can end up on a temporary copy instead of the original DataFrame.

    See:

    • https://stackoverflow.com/a/21287235/1579844
    • http://pandas.pydata.org/pandas-docs/dev/indexing.html#indexing-view-versus-copy
    • https://github.com/pydata/pandas/pull/6031
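
To make the chained-indexing point concrete, here is a minimal sketch using the DataFrame from the question. It assumes a recent pandas release, where .ix no longer exists, so .loc stands in as the label-based accessor. The frame is filled with NaN explicitly, which is an assumption about what the question's empty frame looks like.

```python
import numpy as np
import pandas as pd

# Same shape as the question's frame, filled with NaN (assumed starting state).
df = pd.DataFrame(np.nan, index=['A', 'B', 'C'], columns=['x', 'y'])

# One indexing call: row label first, then column label.
# The assignment is applied to df itself.
df.loc['C', 'x'] = 10

# Chained form, shown only for contrast. It is two separate operations
# (df['x'], then ['C'] = ...), so depending on the pandas version and
# settings the write may land on a temporary object instead of df:
# df['x']['C'] = 10

print(df)
```

The single .loc call is the safer habit because pandas sees the whole assignment at once.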
  • 2020-11-22 06:43

    set_value() is deprecated (and was removed entirely in pandas 1.0).

    As of release 0.23.4, pandas emits a FutureWarning when it is called:

    >>> df
                       Cars  Prices (U$)
    0               Audi TT        120.0
    1 Lamborghini Aventador        245.0
    2      Chevrolet Malibu        190.0
    >>> df.set_value(2, 'Prices (U$)', 240.0)
    __main__:1: FutureWarning: set_value is deprecated and will be removed in a future release.
    Please use .at[] or .iat[] accessors instead
    
                       Cars  Prices (U$)
    0               Audi TT        120.0
    1 Lamborghini Aventador        245.0
    2      Chevrolet Malibu        240.0
    

    Considering this advice, here's a demonstration of how to use them:

    • by row/column integer positions

    >>> df.iat[1, 1] = 260.0
    >>> df
                       Cars  Prices (U$)
    0               Audi TT        120.0
    1 Lamborghini Aventador        260.0
    2      Chevrolet Malibu        240.0
    
    • by row/column labels

    >>> df.at[2, "Cars"] = "Chevrolet Corvette"
    >>> df
                       Cars  Prices (U$)
    0               Audi TT        120.0
    1 Lamborghini Aventador        260.0
    2    Chevrolet Corvette        240.0
    

    References:

    • pandas.DataFrame.iat
    • pandas.DataFrame.at
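
The session above can be reproduced as a standalone script. This is a sketch that rebuilds the same three-row table by hand (the construction of df is an assumption, since the answer only shows its printed form) and then applies the .at and .iat assignments.

```python
import pandas as pd

# Rebuild the table shown in the answer (construction is assumed).
df = pd.DataFrame({
    'Cars': ['Audi TT', 'Lamborghini Aventador', 'Chevrolet Malibu'],
    'Prices (U$)': [120.0, 245.0, 190.0],
})

# Replacement for the deprecated df.set_value(2, 'Prices (U$)', 240.0):
# .at takes a row label and a column label.
df.at[2, 'Prices (U$)'] = 240.0

# .iat takes integer positions: row 1, column 1.
df.iat[1, 1] = 260.0

# .at again, this time on the text column.
df.at[2, 'Cars'] = 'Chevrolet Corvette'

print(df)
```

Here the row labels happen to equal the row positions (0, 1, 2), so .at and .iat look interchangeable for rows; with a non-default index they would differ.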
  • 2020-11-22 06:46

    In my tests df.set_value was slightly faster, but the official accessor df.at looks like the fastest non-deprecated way to do it.

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(np.random.rand(100, 100))
    
    %timeit df.iat[50,50]=50        # ✓ scalar access by position
    %timeit df.at[50,50]=50         # ✔ scalar access by label
    %timeit df.set_value(50,50,50)  # deprecated
    %timeit df.iloc[50,50]=50
    %timeit df.loc[50,50]=50
    
    7.06 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    5.52 µs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    3.68 µs ± 80.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    98.7 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    109 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    

    Note that this sets the value of a single cell. For setting many values at once, loc and iloc should be better options since they are vectorized.
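
As a rough illustration of that last point, the sketch below writes one cell with .at and then fills a block of ten cells, first with a single .loc call and then with a loop of .at calls. The row range and target columns are arbitrary choices for the example, and no timings are claimed here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 100))

# Single cell: the scalar accessor is the natural fit.
df.at[50, 50] = 50

# Many cells: one vectorized .loc call.
# Label slices include both endpoints, so this covers rows 10..19.
df.loc[10:19, 0] = 1.0

# The equivalent loop of scalar writes gives the same result in column 1,
# but pays the per-call overhead ten times.
for row in range(10, 20):
    df.at[row, 1] = 1.0
```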

  • 2020-11-22 06:47

    You can also do a conditional assignment with .loc, as seen here:

    df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>
    

    where <some_column_name> is the column you want to check the <condition> value against and <another_column_name> is the column you want to write to (it can be a new column or one that already exists). <value_to_add> is the value to set in that column for every row that matches.

    This example doesn't match the question exactly, but it might be useful for someone who wants to set a specific value based on a condition.
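
A concrete instance of the template may help. The frame and the column names below (name, score, grade) are hypothetical and chosen only for illustration; the target column is given as a plain label rather than a one-element list.

```python
import pandas as pd

# Hypothetical data for the example.
df = pd.DataFrame({'name': ['a', 'b', 'c'], 'score': [40, 75, 90]})

# Write to a new column: rows that match get 'pass',
# rows that do not match are left as NaN.
df.loc[df['score'] >= 50, 'grade'] = 'pass'

# Write to an existing column: only the matching rows change.
df.loc[df['score'] < 50, 'score'] = 50

print(df)
```

The boolean mask selects the rows and the second argument selects the column, so the whole update is one .loc call with no chained indexing.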

  • 2020-11-22 06:47

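    So, your question is how to change the NaN at row 'C', column 'x' to the value 10.

    The answer is to slice from row 'C' onward (here that is only row 'C') and name the column in the same .loc call:

    df.loc['C':, 'x'] = 10
    df
    

    Alternative code, addressing the single cell directly:

    df.loc['C', 'x'] = 10
    df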
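
A label slice such as 'C': selects every row from C onward, so it matches a single cell only when C is the last row. The sketch below shows the difference on a frame with one extra row, D, added purely for illustration (the question's frame stops at C).

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame; the question's frame has rows A..C only.
df = pd.DataFrame(np.nan, index=list('ABCD'), columns=['x', 'y'])

# Scalar row label: exactly one cell is written.
single = df.copy()
single.loc['C', 'x'] = 10

# Row slice: every row from 'C' to the end is written in column 'x'.
sliced = df.copy()
sliced.loc['C':, 'x'] = 10

print(single)
print(sliced)
```

If the goal is one cell, the scalar form states that intent directly and keeps working when rows are appended later.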
    
  • 2020-11-22 06:48

    In version 0.21.1 and later you can also use the .at accessor. It differs from .loc in some respects, as discussed in the question "pandas .at versus .loc", but it is faster for replacing a single value.
