Question
I have the following dataset for California housing data:
head(calif_cluster,15)
MedianHouseValue MedianIncome MedianHouseAge TotalRooms TotalBedrooms Population
1 190300 4.20510 16 2697.00 490.00 1462
2 150800 2.54810 33 2821.00 652.00 1206
3 252600 6.08290 17 6213.20 1276.05 3288
4 269700 4.03680 52 919.00 213.00 413
5 91200 1.63680 28 3072.00 790.00 1375
6 66200 2.18980 30 744.00 156.00 410
7 148800 2.63640 39 620.95 136.00 348
8 384800 4.46150 20 2270.00 498.00 1070
9 153200 2.75000 22 1931.00 445.00 1009
10 66200 1.60057 36 973.00 219.00 613
11 461500 3.78130 43 3070.00 668.00 1240
12 144600 2.85000 22 5175.00 1213.00 2804
13 143700 5.09410 8 6213.20 1276.05 3288
14 195500 5.30620 16 2918.00 444.00 1697
15 268800 2.42110 22 620.95 136.00 348
Households Latitude Longitude cluster_kmeans gender_dom marital race edu_level rental
1 515 38.48 -122.47 1 M other black jrcollege rented
2 640 38.00 -122.13 1 F other hispanic doctorate owned
3 1162 33.88 -117.79 3 M other white jrcollege owned
4 193 37.85 -122.25 1 M single others jrcollege owned
5 705 38.13 -122.26 1 F single white doctorate rented
6 165 38.96 -122.21 1 F single others jrcollege owned
7 125 34.01 -118.18 2 M married others postgrad owned
8 521 33.83 -118.38 2 F single white graduate rented
9 407 38.95 -121.04 1 M married others postgrad leased
10 187 35.34 -119.01 2 M single hispanic doctorate owned
11 646 33.76 -118.12 2 F other others highschl leased
12 1091 37.95 -122.05 3 M other white graduate rented
13 1162 36.87 -119.75 3 M other others postgrad leased
14 444 32.93 -117.13 2 M other asian jrcollege owned
15 125 37.71 -120.98 1 F single asian postgrad leased
As I have latitude and longitude information in the dataset, I would like to extract the corresponding county for each location using R.
Also, is it possible to get the capital city (or largest city) of each extracted county? These would make my stratified analysis more insightful; I intend to do some clustering/mapping exercises.
Answer 1:
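Take a look at ggmap::revgeocode, which turns a longitude/latitude pair into a street address through the Google geocoding API. Recent versions of ggmap require a Google API key, registered with register_google(), before this call will work.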
code
library(ggmap)
revgeocode(c(-122.47,38.48)) # longitude then latitude
# [1] "2233 Sulphur Springs Ave, St Helena, CA 94574, USA"
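library(dplyr)
library(tidyr)     # provides separate()
library(magrittr)  # provides %<>%
# add the full address via the Google geocoding API (through ggmap)
df12 %<>% rowwise %>% mutate(address = revgeocode(c(Longitude, Latitude))) %>% ungroup
# split the address on commas; note that the third piece (named "county" here) is actually the state and ZIP code
df12 %<>% separate(address, c("street_address", "city", "county", "country"), remove = FALSE, sep = ",")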
result
df12 %>% select(Longitude,Latitude,address,county)
# A tibble: 15 x 4
# Longitude Latitude address county
# * <dbl> <dbl> <chr> <chr>
# 1 -122.47 38.48 2233 Sulphur Springs Ave, St Helena, CA 94574, USA CA 94574
# 2 -122.13 38.00 3400-3410 Brookside Dr, Martinez, CA 94553, USA CA 94553
# 3 -117.79 33.88 19721 Bluefield Plaza, Yorba Linda, CA 92886, USA CA 92886
# 4 -122.25 37.85 6365 Florio St, Oakland, CA 94618, USA CA 94618
# 5 -122.26 38.13 119 Mimosa Ct, Vallejo, CA 94589, USA CA 94589
# 6 -122.21 38.96 Unnamed Road, Arbuckle, CA 95912, USA CA 95912
# 7 -118.18 34.01 4360-4414 Noakes St, Los Angeles, CA 90023, USA CA 90023
# 8 -118.38 33.83 903 Serpentine St, Redondo Beach, CA 90277, USA CA 90277
# 9 -121.04 38.95 14666-14690 Musso Rd, Auburn, CA 95603, USA CA 95603
# 10 -119.01 35.34 800 Ming Ave, Bakersfield, CA 93307, USA CA 93307
# 11 -118.12 33.76 6211-6295 E Marina Dr, Long Beach, CA 90803, USA CA 90803
# 12 -122.05 37.95 1120 Carey Dr, Concord, CA 94520, USA CA 94520
# 13 -119.75 36.87 1815-1899 E Pryor Dr, Fresno, CA 93720, USA CA 93720
# 14 -117.13 32.93 9010-9016 Danube Ln, San Diego, CA 92126, USA CA 92126
# 15 -120.98 37.71 748-1298 Claribel Rd, Modesto, CA 95356, USA CA 95356
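Note that the address string does not contain a county name, so the column labelled county above holds the state and ZIP code instead. As an alternative, here is a minimal offline sketch that assumes the maps package is installed and uses its map.where function with the bundled "county" database. The helper name county_from_lonlat and the small pts data frame are made up for illustration. The county polygons in maps are fairly coarse, so points on the coast or close to a county border may come back as NA or be assigned to a neighbouring county.

```r
library(maps)  # assumption: the 'maps' package is installed

# Hypothetical helper: name of the county polygon containing each point
county_from_lonlat <- function(lon, lat) {
  # map.where() returns names of the form "state,county" (NA if no polygon matches)
  region <- map.where(database = "county", x = lon, y = lat)
  # drop any ":subregion" suffix, then keep only the part after the comma
  region <- sub(":.*$", "", region)
  sub("^[^,]*,", "", region)
}

# Two of the points from the question (rows 1 and 10)
pts <- data.frame(Longitude = c(-122.47, -119.01),
                  Latitude  = c(38.48, 35.34))
pts$county <- county_from_lonlat(pts$Longitude, pts$Latitude)
pts
```

Applied to the full data, the same call would be county_from_lonlat(df12$Longitude, df12$Latitude), since map.where is vectorised over x and y.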
data
df1 <- read.table(text = "MedianHouseValue MedianIncome MedianHouseAge TotalRooms TotalBedrooms Population
1 190300 4.20510 16 2697.00 490.00 1462
2 150800 2.54810 33 2821.00 652.00 1206
3 252600 6.08290 17 6213.20 1276.05 3288
4 269700 4.03680 52 919.00 213.00 413
5 91200 1.63680 28 3072.00 790.00 1375
6 66200 2.18980 30 744.00 156.00 410
7 148800 2.63640 39 620.95 136.00 348
8 384800 4.46150 20 2270.00 498.00 1070
9 153200 2.75000 22 1931.00 445.00 1009
10 66200 1.60057 36 973.00 219.00 613
11 461500 3.78130 43 3070.00 668.00 1240
12 144600 2.85000 22 5175.00 1213.00 2804
13 143700 5.09410 8 6213.20 1276.05 3288
14 195500 5.30620 16 2918.00 444.00 1697
15 268800 2.42110 22 620.95 136.00 348",header=T,stringsAsFactors=F)
df2 <- read.table(text = "Households Latitude Longitude cluster_kmeans gender_dom marital race edu_level rental
1 515 38.48 -122.47 1 M other black jrcollege rented
2 640 38.00 -122.13 1 F other hispanic doctorate owned
3 1162 33.88 -117.79 3 M other white jrcollege owned
4 193 37.85 -122.25 1 M single others jrcollege owned
5 705 38.13 -122.26 1 F single white doctorate rented
6 165 38.96 -122.21 1 F single others jrcollege owned
7 125 34.01 -118.18 2 M married others postgrad owned
8 521 33.83 -118.38 2 F single white graduate rented
9 407 38.95 -121.04 1 M married others postgrad leased
10 187 35.34 -119.01 2 M single hispanic doctorate owned
11 646 33.76 -118.12 2 F other others highschl leased
12 1091 37.95 -122.05 3 M other white graduate rented
13 1162 36.87 -119.75 3 M other others postgrad leased
14 444 32.93 -117.13 2 M other asian jrcollege owned
15 125 37.71 -120.98 1 F single asian postgrad leased",header=T,stringsAsFactors=F)
df12 <- cbind(df1,df2)
I don't think the library offers a way to get the capital or largest city of a county, but you shouldn't have much trouble building a lookup table from information available online.
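A lookup table of this kind could be joined in with base R's merge. The sketch below is only an illustration: the county_seats table is typed in by hand and covers just four counties, and homes is a hypothetical data frame that already has a lower-case county column.

```r
# Illustrative, hand-entered lookup table (only a few counties shown)
county_seats <- data.frame(
  county      = c("napa", "kern", "los angeles", "fresno"),
  county_seat = c("Napa", "Bakersfield", "Los Angeles", "Fresno"),
  stringsAsFactors = FALSE
)

# Hypothetical data frame that already carries a 'county' column
homes <- data.frame(
  id     = 1:3,
  county = c("napa", "kern", "solano"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of 'homes'; counties missing from the lookup get NA
homes_with_seat <- merge(homes, county_seats, by = "county", all.x = TRUE)
homes_with_seat
```

Using all.x = TRUE keeps rows whose county is not in the table, which makes gaps in the lookup easy to spot as NA values.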
Source: https://stackoverflow.com/questions/46150851/how-can-i-extract-california-county-locations-from-given-latitude-and-longitude