Question
On the B-cycle page (www.bcycle.com/whowantsitmore.aspx) I am trying to scrape the locations and their vote counts.
The URL http://mapservices.bcycle.com/bcycleservice.asmx is a SOAP service.
Based on the documentation I believe I am calling it correctly, but I get an error when the input parameters are parsed. Calling a function that takes no parameters also raises an error.
# working with SOAP
#install.packages("SSOAP", repos="http://www.omegahat.org/R", dependencies = T, type = "source")
library(SSOAP)
# Process the Web Service Definition Language (WSDL) file
bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")
# Generate functions based on definitions to access the different data sets
bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose=T)
# Get the data by requesting the number of cities, username and password (yes it is public)
bcycle.interface@functions$getCities("10","bcycle","c@rbont0ns")
# receive error: Error in as(parameters, "limit.userName.pw") :
# no method or default for coercing "character" to "limit.userName.pw"
This is due to the following code in the function:
function(parameters = list(...),... etc) {
...
as(parameters, "limit.userName.pw")
...
}
I have therefore tried to use the .SOAP function directly:
# Using RCurl library
library(RCurl)
# set up curl options
curl.opts <- curlOptions(
verbose=T,
header=T,
cookie="ASP.NET_SessionId=dv25ws551nquoezqwq3iu545;__utma=27517231.373920809.1357910914.1357910914.1357912862.2;__utmc=27517231;__utmz=27517231.1357910914.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);__utmb=27517231.13.10.1357912862",
httpheader = c('Content-Type' = 'text/xml; charset=utf-8', Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
followLocation = TRUE,
useragent = "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0")
# Define header and submit request
bcycle.server <- SOAPServer("http://mapservices.bcycle.com/bcycleservice.asmx")
.SOAP(bcycle.server,
"getCities",
limit=250,userName="bCycle",pw="c@rbont0ns",
action="http://bcycle.com/getCities",
xmlns="http://bcycle.com/",
.opts=curl.opts,
.literal=T,
nameSpaces = "1.2",
elementFormQualified = T,
.returnNodeName = 'getCitiesResponse',
.soapHeader = NULL)
I managed to connect to their server but received this error:
System.Web.Services.Protocols.SoapException:
Server did not recognize the value of HTTP Header SOAPAction:
http://bcycle.com/getCities#getCities
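The header in the error message is the action string passed to .SOAP with "#getCities" appended. As a rough illustration of what the raw request could look like, here is a minimal Python sketch using only the standard library. It assumes the service expects a SOAPAction of namespace plus operation name (the usual default for .asmx services) and a document/literal SOAP 1.1 body; the namespace comes from the xmlns argument above, and the helper names build_request and send are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SERVICE_URL = "http://mapservices.bcycle.com/bcycleservice.asmx"
NAMESPACE = "http://bcycle.com/"  # taken from the xmlns argument in the question
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(method, params):
    """Return (headers, body) for a document/literal SOAP 1.1 call (assumed style)."""
    # Each parameter becomes a child element of the operation element.
    args = "".join(
        "<{0}>{1}</{0}>".format(name, escape(str(value)))
        for name, value in params.items()
    )
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<soap:Envelope xmlns:soap="{env}">'
        '<soap:Body><{m} xmlns="{ns}">{args}</{m}></soap:Body>'
        "</soap:Envelope>"
    ).format(env=SOAP_ENV, m=method, ns=NAMESPACE, args=args)
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # Assumption: namespace + operation name only, with no "#getCities" suffix.
        "SOAPAction": '"{}{}"'.format(NAMESPACE, method),
    }
    return headers, body.encode("utf-8")


def send(headers, body):
    """POST the request; not called here because it needs network access."""
    req = urllib.request.Request(SERVICE_URL, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


headers, body = build_request(
    "getCities", {"limit": 10, "userName": "bCycle", "pw": "c@rbont0ns"}
)
print(headers["SOAPAction"])
```

Comparing this header with the one in the error message shows the difference the server is complaining about.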
These are the options I have tried to date without success.
Using Python I was able to make the request for getCities but received nothing back.
import suds
client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
print(client)  # prints WSDL info
print(client.service.getCities(10, 'bcycle', 'c@rbont0ns'))  # prints nothing
I'd like to keep this R-focused, but using Python may give easier insight into what the problem is.
Any ideas?
Answer 1:
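Correct the username (it is case-sensitive: "bCycle", not "bcycle") and name the parameters explicitly. The generated function's first formal argument is parameters, so an unnamed first argument is bound to it as a plain character string, which is what as() then fails to coerce. Named arguments fall through to ... and are collected into the list instead: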
library(SSOAP)
bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")
bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]],
def = bcycle.asmx, bcycle.asmx@name, verbose=T)
out <- bcycle.interface@functions$getCities(
limit="10",userName="bCycle",pw="c@rbont0ns")
#> out[[1]]@
#out[[1]]@zip out[[1]]@state_name
#out[[1]]@pop out[[1]]@latitude
#out[[1]]@ambassador_count out[[1]]@longitude
#out[[1]]@city_name
out[[1]]@city_name
#[1] "toledo"
The Python call also works with the corrected username:
import suds
client = suds.client.Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
client.service.getCities(10,'bCycle','c@rbont0ns')
(ArrayOfCities){
Cities[] =
(Cities){
zip = "43606"
pop = 337362
ambassador_count = 455261
city_name = "toledo"
state_name = "oh"
latitude = 41.6743
longitude = -83.6029
},............
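If the goal is a flat table of locations and counts, the result can be flattened into rows. The following is a minimal sketch, assuming the returned object exposes a Cities list whose items carry the attributes printed above. The helper names cities_to_rows and rows_to_csv are hypothetical, and a SimpleNamespace stand-in replaces the live suds result so the example runs without network access.

```python
import csv
import io
from types import SimpleNamespace

# Field names as they appear in the printed result above.
FIELDS = ["city_name", "state_name", "zip", "pop",
          "ambassador_count", "latitude", "longitude"]


def cities_to_rows(result):
    """Turn an ArrayOfCities-like object into a list of plain dicts."""
    return [{f: getattr(city, f, None) for f in FIELDS} for city in result.Cities]


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Stand-in for client.service.getCities(...), using the values shown above.
sample = SimpleNamespace(Cities=[
    SimpleNamespace(zip="43606", pop=337362, ambassador_count=455261,
                    city_name="toledo", state_name="oh",
                    latitude=41.6743, longitude=-83.6029),
])

rows = cities_to_rows(sample)
print(rows_to_csv(rows))
```

With a real client, passing the value returned by client.service.getCities(...) in place of sample should work the same way, provided the attribute names match.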
Source: https://stackoverflow.com/questions/14319462/using-r-soap-ssoap-to-retrieve-data-scrape