Pandas: get second character of the string, from every row

北海茫月 2021-01-19 00:26

I have an array of data in Pandas and I'm trying to print the second character of every string in col1. I can't figure out how to do it. I can easily print the second character

2 answers
  • 2021-01-19 01:12

    As of Pandas 0.23.0, if your data is clean, the "vectorised" string methods exposed via pd.Series.str will generally underperform simple iteration with a list comprehension or map.

    For example:

    from operator import itemgetter
    import pandas as pd
    
    df = pd.DataFrame(['foo', 'bar', 'baz'], columns=['col1'])
    
    df = pd.concat([df]*100000, ignore_index=True)
    
    %timeit pd.Series([i[1] for i in df['col1']])            # 33.7 ms
    %timeit pd.Series(list(map(itemgetter(1), df['col1'])))  # 42.2 ms
    %timeit df['col1'].str[1]                                # 214 ms
    

    A special case is when you have a large number of repeated strings, in which case you can benefit from converting your series to a categorical:

    df['col1'] = df['col1'].astype('category')
    
    %timeit df['col1'].str[1]  # 4.9 ms
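
    The "clean data" condition above matters in practice. The sketch below uses a hypothetical Series (not from the question) containing a missing value and a one-character string; it suggests that .str[1] yields NaN for such rows, whereas a plain list comprehension would need an explicit guard to avoid a TypeError or IndexError. It also checks that the categorical route gives the same values. Timings are left out because they vary by machine and Pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical "unclean" data: one missing value and one string that is too short.
s = pd.Series(['foo', 'bar', None, 'x', 'baz'])

# The .str accessor returns NaN for missing or too-short entries instead of raising.
via_str = s.str[1]

# A list comprehension is only safe with an explicit guard on each element.
via_loop = pd.Series(
    [x[1] if isinstance(x, str) and len(x) > 1 else np.nan for x in s],
    index=s.index,
)

# Same lookup through a categorical column, as in the special case above.
via_cat = s.astype('category').str[1]

print(via_str.tolist())
print(via_loop.tolist())
print(via_cat.tolist())
```

    If the column is known to hold only strings of length two or more, the guard can be dropped and the faster list comprehension from the timings above applies directly.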
    
  • 2021-01-19 01:28

    You can use str to access the string methods for the column/Series and then slice the strings as normal:

    >>> df = pd.DataFrame(['foo', 'bar', 'baz'], columns=['col1'])
    >>> df
      col1
    0  foo
    1  bar
    2  baz
    
    >>> df.col1.str[1]
    0    o
    1    a
    2    a
    

    This str attribute also gives you access to a variety of very useful vectorised string methods, many of which are instantly recognisable from Python's own assortment of built-in string methods (split, replace, etc.).
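
    As a rough illustration of those methods, here is a small sketch built on the same df. The extra Series of hyphenated strings is a made-up example added only to show split; none of these variable names come from the question.

```python
import pandas as pd

df = pd.DataFrame(['foo', 'bar', 'baz'], columns=['col1'])

# Indexing and slicing work per element, just like on a plain Python string.
second = df['col1'].str[1]
first_two = df['col1'].str[:2]

# Counterparts of str.upper and str.replace, applied to every row.
upper = df['col1'].str.upper()
replaced = df['col1'].str.replace('a', 'A', regex=False)

# Hypothetical data for split: each row becomes a list of parts,
# and a second .str[0] picks the first part of each list.
codes = pd.Series(['a-b', 'c-d'])
first_parts = codes.str.split('-').str[0]

print(second.tolist())
print(first_two.tolist())
print(upper.tolist())
print(replaced.tolist())
print(first_parts.tolist())
```

    Passing regex=False to replace keeps the pattern literal, which avoids depending on the default that has changed between Pandas versions.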
