问题
I need help drafting code that pulls public data directly from a website that is in Socrata format. Here is a link:
https://data.cityofchicago.org/Administration-Finance/Current-Employee-Names-Salaries-and-Position-Title/xzkq-xp2w
There is an API endpoint:
https://data.cityofchicago.org/resource/xzkq-xp2w.json
After the data is uploaded, null values in the "Annual Salary" should be replaced with 50000.
回答1:
We can use the RSocrata
package
library(RSocrata)
url <- "https://data.cityofchicago.org/resource/xzkq-xp2w.json"
data <- RSocrata::read.socrata(url)
head(data)
# name job_titles department full_or_part_time salary_or_hourly annual_salary typical_hours hourly_rate
#1 AARON, JEFFERY M SERGEANT POLICE F Salary 111444 <NA> <NA>
#2 AARON, KARINA POLICE OFFICER (ASSIGNED AS DETECTIVE) POLICE F Salary 94122 <NA> <NA>
#3 AARON, KIMBERLEI R CHIEF CONTRACT EXPEDITER DAIS F Salary 118608 <NA> <NA>
#4 ABAD JR, VICENTE M CIVIL ENGINEER IV WATER MGMNT F Salary 117072 <NA> <NA>
#5 ABARCA, FRANCES J POLICE OFFICER POLICE F Salary 48078 <NA> <NA>
The following will replace the NAs in annual_salary
with 50000
.
data[is.na(data$annual_salary),"annual_salary"] <- 50000
However, if you'd like to do what it suggests on the city of Chicago website, you could consider multipling typical_hours
with hourly_rate
to estimate salary.
ind <- is.na(data$annual_salary)
data[ind,]$annual_salary <- as.numeric(data[ind,]$typical_hours) * as.numeric(data[ind,]$hourly_rate) * 52
来源:https://stackoverflow.com/questions/61071854/need-help-pulling-json-data-with-rsocrata-from-a-website-api