In a pandas DataFrame, how can I apply something like Excel's LEFT('state', 2) to take only the first two letters? Ideally I want to learn how to use LEFT, RIGHT and MID in a DataFrame.
With regard to MID, a shortcut would be df['state'].str[2:4].
Slices are zero-based and exclude the end index, so this starts at the 3rd character and gives you the 3rd and 4th characters of each string.
First two letters for each value in a column:
>>> df['StateInitial'] = df['state'].str[:2]
>>> df
   pop       state  year StateInitial
0  1.5    Auckland  2000           Au
1  1.7       Otago  2001           Ot
2  3.6  Wellington  2002           We
3  2.4     Dunedin  2001           Du
4  2.9    Hamilton  2002           Ha
For the last two, that would be df['state'].str[-2:]. I don't know exactly what you want for the middle, but you can apply an arbitrary function to a column with the apply method:
>>> df['state'].apply(lambda x: x[len(x)//2-1:len(x)//2+1])
0 kl
1 ta
2 in
3 ne
4 il
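If it helps to think in Excel terms, the slices above can be wrapped in small helpers. This is only a sketch: the names left, right and mid are hypothetical (they are not pandas functions), mid is assumed to take a 1-based start position like Excel's MID, and n is assumed to be at least 1. The sample frame is rebuilt from the output shown above.

```python
import pandas as pd

# Sample data reconstructed from the output shown earlier in this thread.
df = pd.DataFrame({
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9],
    "state": ["Auckland", "Otago", "Wellington", "Dunedin", "Hamilton"],
    "year": [2000, 2001, 2002, 2001, 2002],
})


def left(s: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper, like Excel LEFT(text, n): first n characters."""
    return s.str[:n]


def right(s: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper, like Excel RIGHT(text, n): last n characters.

    Assumes n >= 1, because a slice of [-0:] would return the whole string.
    """
    return s.str[-n:]


def mid(s: pd.Series, start: int, n: int) -> pd.Series:
    """Hypothetical helper, like Excel MID(text, start, n).

    start is 1-based (as in Excel), so it is shifted to Python's 0-based index.
    """
    begin = start - 1
    return s.str[begin:begin + n]


df["first2"] = left(df["state"], 2)
df["last2"] = right(df["state"], 2)
df["mid3_2"] = mid(df["state"], 3, 2)  # 3rd and 4th characters
print(df[["state", "first2", "last2", "mid3_2"]])
```

The vectorized .str slicing passes missing values through as NaN, whereas the apply/lambda version would raise on them, so slicing is usually the safer choice when the column may contain NaN.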