Question
Currently I have this URL:
http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending
I want to replace this part:
pageNumber=1
with a sequence of numbers such as 1, 2, 3, ..., n.
I know I need to use the paste function, but how do I locate this number and replace it?
Answer 1:
You can use the parseQueryString function from the shiny package, or parse_url and build_url from the httr package.
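library(shiny)
testURL <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"
# parseQueryString() expects only the query string, so split the URL at "?" first
parts <- strsplit(testURL, "?", fixed = TRUE)[[1]]
parseURL <- parseQueryString(parts[2])
parseURL$pageNumber <- 4
# rebuild the query string and re-attach it to the part before "?"
newURL <- paste0(parts[1], "?", paste(names(parseURL), parseURL, sep = "=", collapse = "&"))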
# alternatively, with httr:
library(httr)
testURL <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"
parseURL <- parse_url(testURL)       # split the URL into its components
parseURL$query$pageNumber <- 4       # change one query parameter
newURL <- build_url(parseURL)        # reassemble the URL
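Since the question asks for a whole sequence of page numbers rather than a single one, the same parse, modify, rebuild idea can be applied once per page. The following is a minimal base-R sketch of that idea, not the shiny or httr implementation. The helper name set_query_param is hypothetical, and it assumes the URL contains exactly one "?" and no fragment.

```r
# Hypothetical helper: set one query parameter in a URL using base R only.
# Assumes exactly one "?" in the URL and no "#fragment" part.
set_query_param <- function(url, key, value) {
  parts <- strsplit(url, "?", fixed = TRUE)[[1]]        # c(base, query)
  pairs <- strsplit(parts[2], "&", fixed = TRUE)[[1]]   # "key=value" strings
  keys  <- sub("=.*$", "", pairs)                       # text before the first "="
  pairs[keys == key] <- paste0(key, "=", value)         # overwrite the matching pair
  paste0(parts[1], "?", paste(pairs, collapse = "&"))   # reassemble the URL
}

testURL <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"

# One URL per page number; 1:3 is just an illustrative range.
urls <- vapply(1:3, function(p) set_query_param(testURL, "pageNumber", p), character(1))
print(urls)
```

Matching on the parameter name rather than on its current value means the helper keeps working if the starting URL is already on a page other than 1.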
Answer 2:
Try this:
# inputs
URL1 <- "...whatever...&pageNumber=1"
i <- 2
URL2 <- sub("pageNumber=1", paste0("pageNumber=", i), URL1)
Or, using a Perl zero-width lookbehind regex:
URL2 <- sub("(?<=pageNumber=)1", i, URL1, perl = TRUE)
If we know that there is no 1 prior to pageNumber, as is the case here, then it simplifies to just:
URL2 <- sub(1, i, URL1)
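Note that sub() uses only the first element of its replacement argument, so producing the whole sequence needs one call per page. A minimal sketch of that, assuming five pages (n is an illustrative value) and matching any run of digits after pageNumber= instead of the literal 1:

```r
URL1 <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"
n <- 5  # assumed number of pages, for illustration only

# Call sub() once per page number. The lookbehind (?<=pageNumber=) anchors the
# match, and \\d+ matches whatever page number is currently in the URL.
URLs <- vapply(
  seq_len(n),
  function(i) sub("(?<=pageNumber=)\\d+", as.character(i), URL1, perl = TRUE),
  character(1)
)
print(URLs)
```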
Answer 3:
Another very simple approach is to use sprintf:
sprintf('http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=%s&showViewpoints=0&sortBy=bySubmissionDateDescending',
1:10)
In the above code, the %s in the string provided as the first argument is replaced by each element of the vector provided in the second argument, in turn.
See ?sprintf for more details about this very handy string manipulation function.
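As a small illustration of how sprintf treats its arguments, the sketch below uses %d for the integer page number and a second %s placeholder for the sort order. The example.com URL is a hypothetical stand-in, not the address from the question. Arguments shorter than the longest one are recycled, so the single sort value is reused for every page.

```r
pages <- 1:3  # illustrative page range

# %d formats each integer page number; the length-1 sort order is recycled.
urls <- sprintf(
  "http://www.example.com/reviews?pageNumber=%d&sortBy=%s",  # hypothetical URL
  pages,
  "bySubmissionDateDescending"
)
print(urls)
```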
Answer 4:
The simplest approach would be to split the string into parts and concatenate them (JavaScript-style syntax):
var part1 = "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=";
var number = 1;
var part2 = "&showViewpoints=0&sortBy=bySubmissionDateDescending";
var link = part1 + number + part2;
Another approach would be to use string.replace("pageNumber=1", "pageNumber=2");
Another option would be to use a regex, but I'm not good with those, so you'll have to do some googling.
Answer 5:
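I figured it out now. The code is here:
# paste0() joins without a separator, so no spaces end up inside the URLs
listurl <- paste0("http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=", 1:218)
ipadlisturl <- paste0(listurl, "&showViewpoints=0&sortBy=bySubmissionDateDescending")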
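A note on paste() versus paste0() when building URLs: paste() joins its arguments with a single space by default, which would leave a space inside each URL, whereas paste0() (or paste(..., sep = "")) joins them with nothing in between. A minimal sketch of the difference, using a hypothetical example.com prefix:

```r
prefix <- "http://www.example.com/reviews?pageNumber="  # hypothetical URL prefix

with_space <- paste(prefix, 1:3)   # default sep = " " puts a space before the number
no_space   <- paste0(prefix, 1:3)  # no separator

print(with_space)
print(no_space)
```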
Source: https://stackoverflow.com/questions/22393694/how-to-replace-one-part-in-url-by-using-r