Function to convert coordinates in pandas series and append as additional series

孤街醉人 submitted on 2019-12-22 01:19:28


I'm looking to take a series of coordinates stored in a pandas DataFrame and define a function that goes through each entry, transforms it (BNG Easting/Northing to latitude and longitude), and saves the result to new columns in the same row. This function by Elise Huard looks like it should do just that:

def proj_transform(df):
    #bng = pyproj.Proj(init='epsg:27700')
    bng = pyproj.Proj("+init=EPSG:27700")
    #wgs84 = pyproj.Proj(init='epsg:4326')
    wgs84 = pyproj.Proj("+init=EPSG:4326")
    lats = pd.Series()
    lons = pd.Series()
    for idx, val in enumerate(df['Easting']):
        lon, lat = pyproj.transform(bng,wgs84,df['Easting'][idx], df['Northing'][idx])
        lats.set_value(idx, lat)
        lons.set_value(idx, lon)
    df['lat'] = lats
    df['lon'] = lons
    return df

But I'm getting the following error when I try to run the function. Any advice on what might be causing it, or an alternative approach as a workaround?

RuntimeError: non-convergent inverse meridional dist

Sample of the data used:

Site Reference  LA Reference    Start Date  Easting Northing
0   380500145   NaN 20130101    105175.0    105175.0
1   380500128   NaN 20060331    104000.0    104000.0
2   380500085   NaN 20030401    105055.0    105055.0
3   380500008   NaN 19980930    108480.0    108480.0
4   380500009   NaN 19980930    105415.0    105415.0
5   380500136   SHLAA20100101   105081.0    105081.0
6   380500038   NaN 19980930    105818.0    105818.0


I think the function works fine and the formatting of your input is the culprit. In the row with index 5 of your sample data, there is no space between SHLAA and the date, so they are parsed as a single value in the LA Reference column, the remaining values shift one column to the left, and the Northing column gets NaN. This NaN value results in RuntimeError: b'non-convergent inverse meridional dist' in pyproj.transform.

After adding a space there and renaming the columns so that they contain no spaces, it worked OK (or at least it looked so).

My code:

import pandas as pd
import pyproj
from inspect import cleandoc
from io import StringIO

s = '''
    Site_Reference  LA_Reference    Start_Date  eastings northings
    0   380500145   NaN 20130101    105175.0    105175.0
    1   380500128   NaN 20060331    104000.0    104000.0
    2   380500085   NaN 20030401    105055.0    105055.0
    3   380500008   NaN 19980930    108480.0    108480.0
    4   380500009   NaN 19980930    105415.0    105415.0
    5   380500136   SHLAA 20100101   105081.0    105081.0
    6   380500038   NaN 19980930    105818.0    105818.0
'''
s = cleandoc(s)
df = pd.read_csv(StringIO(s), sep=r'\s+')
   Site_Reference LA_Reference  Start_Date  eastings  northings
0       380500145          NaN    20130101    105175     105175
1       380500128          NaN    20060331    104000     104000
2       380500085          NaN    20030401    105055     105055
3       380500008          NaN    19980930    108480     108480
4       380500009          NaN    19980930    105415     105415
5       380500136        SHLAA    20100101    105081     105081
6       380500038          NaN    19980930    105818     105818

def proj_transform(df):
    bng = pyproj.Proj("+init=EPSG:27700")
    wgs84 = pyproj.Proj("+init=EPSG:4326")
    lats = pd.Series()
    lons = pd.Series()
    for idx, val in enumerate(df['eastings']):
        lon, lat = pyproj.transform(bng,wgs84,df['eastings'][idx], df['northings'][idx])
        lats.set_value(idx, lat)
        lons.set_value(idx, lon)
    df['lat'] = lats
    df['lon'] = lons
    return df

df_transformed = proj_transform(df)

   Site_Reference LA_Reference  Start_Date  eastings  northings        lat       lon
0       380500145          NaN    20130101    105175     105175  50.771035 -6.183048
1       380500128          NaN    20060331    104000     104000  50.759899 -6.198721
2       380500085          NaN    20030401    105055     105055  50.769898 -6.184649
3       380500008          NaN    19980930    108480     108480  50.802348 -6.138924
4       380500009          NaN    19980930    105415     105415  50.773309 -6.179846
5       380500136        SHLAA    20100101    105081     105081  50.770144 -6.184302
6       380500038          NaN    19980930    105818     105818  50.777128 -6.174468


Assuming pyproj.transform works correctly on single (easting, northing) coordinate pairs, then instead of:

for idx, val in enumerate(df['Easting']):
    lon, lat = pyproj.transform(bng,wgs84,df['Easting'][idx], df['Northing'][idx])
    lats.set_value(idx, lat)
    lons.set_value(idx, lon)

use:

lons, lats = map(lambda x: pyproj.transform(bng, wgs84, x[0], x[1]),
                 zip(df['Easting'], df['Northing']))

And leave the rest unchanged.


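The one-liner above yields one (lon, lat) pair per row, so it cannot be unpacked directly into lons and lats. This works:

from numpy import array  # assumed: the original snippet did not show where array comes from

arr = map(lambda x: pyproj.transform(bng, wgs84, x[0], x[1]),
          zip(df['eastings'], df['northings']))
lons, lats = map(array, zip(*arr))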
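A possible vectorized variant of the same idea is sketched below. It is not the approach used in the answers above. It assumes a pyproj release that provides Transformer.from_crs with the always_xy argument (the pyproj.transform call used above is deprecated in newer pyproj releases). The names add_lat_lon and bng_to_wgs84 and the default column names are hypothetical. Rows with NaN coordinates are skipped and keep NaN in the new columns, so they never reach the projection code.

```python
import numpy as np
import pandas as pd


def bng_to_wgs84(eastings, northings):
    """Convert arrays of BNG eastings/northings to (lon, lat) arrays.

    Assumption: a pyproj version that provides Transformer is installed.
    """
    from pyproj import Transformer  # imported lazily, only when actually used

    # always_xy=True keeps the (easting, northing) -> (lon, lat) axis order
    transformer = Transformer.from_crs("EPSG:27700", "EPSG:4326", always_xy=True)
    return transformer.transform(eastings, northings)


def add_lat_lon(df, transform=bng_to_wgs84, x_col="eastings", y_col="northings"):
    """Return a copy of df with 'lat' and 'lon' columns added.

    Rows where either coordinate is NaN are left as NaN in both new columns.
    """
    out = df.copy()
    out["lat"] = np.nan
    out["lon"] = np.nan
    valid = out[[x_col, y_col]].notna().all(axis=1)
    if valid.any():
        # One call for all valid rows instead of one call per row
        lon, lat = transform(out.loc[valid, x_col].to_numpy(),
                             out.loc[valid, y_col].to_numpy())
        out.loc[valid, "lon"] = lon
        out.loc[valid, "lat"] = lat
    return out
```

With pyproj installed, df_transformed = add_lat_lon(df) would play the role of proj_transform(df). The transform argument exists only so that the conversion can be swapped out, for example for a stub in a test.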


As @ptrj notes, the error

RuntimeError: non-convergent inverse meridional dist

was, in this instance, caused by NaN values in the data.
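Since the cause here was NaN values in the coordinate columns, a quick check before transforming can make that kind of problem visible. The sketch below is one way to do it in pandas. The function name rows_with_missing_coords is hypothetical, and the small frame is only an illustration in which the row with index 5 mimics the misparsed row, with its Northing missing.

```python
import numpy as np
import pandas as pd


def rows_with_missing_coords(df, coord_cols=("Easting", "Northing")):
    """Return the rows of df in which any coordinate column is NaN."""
    mask = df[list(coord_cols)].isna().any(axis=1)
    return df[mask]


# Illustrative frame (an assumption, not the asker's real file):
# index 5 stands in for the misparsed row with a missing Northing.
sample = pd.DataFrame(
    {
        "Site Reference": [380500009, 380500136, 380500038],
        "Easting": [105415.0, 105081.0, 105818.0],
        "Northing": [105415.0, np.nan, 105818.0],
    },
    index=[4, 5, 6],
)

bad = rows_with_missing_coords(sample)
print(bad)  # inspect these rows before calling the transform

# Either fix the source file or drop the incomplete rows
clean = sample.dropna(subset=["Easting", "Northing"])
```

Inspecting bad first is probably the better habit, because a NaN in a coordinate column may point to a parsing problem that also affects the other columns of that row.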

