Python - take out the data inside cell of dataframe to another cells

家住魔仙堡 提交于 2020-05-31 04:03:54

问题


This is the data in single cell of dataframe with 14 columns. Cell is the element of column. There are 45k+ this kind of cells, to do it manually is a hell.

one cell data

I'd like to do with this cell 3 things:

  1. move text part with address, state, zip - to another column;
  2. delete the hooks () of cell;
  3. separate for 2 columns longitude and latitude.

How it's possible to do?


回答1:


Here's a simple, working example with 2 data points:

text1 = """30881 EKLUTNA LAKE RD
CHUGIAK, AK 99567
(61.4478, -149.3136)"""

text2 = """30882 FAKE STR
CHUGIAK, AK 98817
(43.4478, -119.3136)"""

d = {'col1': [text1, text2]}

df = pd.DataFrame(data=d)

def fix(row):
  #We split the text by newline
  address, cp, latlong =  row.col1.split('\n')

  #We get the latitude and longitude by splitting by a comma
  latlong_vec = latlong[1:-1].split(',')

  #This part isn't really necessary but we create the variables for claity
  lat = float(latlong_vec[0])
  long = float(latlong_vec[1])

  return pd.Series([address + ". " + cp, lat, long])


df[['full address', 'lat', 'long']] = df.apply(fix, axis = 1)

Output of the 3 new columns:

df['full address']
0    30881 EKLUTNA LAKE RD. CHUGIAK, AK 99567
1           30882 FAKE STR. CHUGIAK, AK 98817

df['lat']

0    61.4478
1    43.4478
Name: lat, dtype: float64

df['long']

0   -149.3136
1   -119.3136
Name: long, dtype: float64

Name: full address, dtype: object



来源:https://stackoverflow.com/questions/61746266/python-take-out-the-data-inside-cell-of-dataframe-to-another-cells

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!