How to replace one part of a URL using R

别等时光非礼了梦想. Submitted on 2019-12-13 09:39:30

Question


Currently I have the website

http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending

I want to replace this part

pageNumber=1

with a sequence of numbers such as 1, 2, 3, ..., n.

I know I need to use the paste function, but how do I locate this number and replace it?


Answer 1:


You can use the parseQueryString function from the shiny package, or parse_url and build_url from the httr package.

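library(shiny)
testURL <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"
# parseQueryString() expects only the query string, so split the URL at "?"
parts <- strsplit(testURL, "?", fixed = TRUE)[[1]]
parseURL <- parseQueryString(parts[2])
parseURL$pageNumber <- 4
# Rebuild the query string and reattach it to the base URL
newURL <- paste0(parts[1], "?", paste(names(parseURL), parseURL, sep = "=", collapse = "&"))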

library(httr)
testURL <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=1&showViewpoints=0&sortBy=bySubmissionDateDescending"
# parse_url() splits the URL into components; the query becomes a named list
parseURL <- parse_url(testURL)
parseURL$query$pageNumber <- 4
newURL <- build_url(parseURL)
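
For readers without shiny or httr installed, the same parse, modify, rebuild idea can be sketched in base R. This is only an illustration: set_query_param is a hypothetical helper (not part of any package), the example.com URL is a placeholder, and the sketch assumes the URL contains exactly one "?" and no fragment, and it does no URL encoding or decoding.

```r
# Hypothetical helper: replace the value of one query parameter in a URL.
# Assumes exactly one "?" in the URL and "key=value" pairs joined by "&".
set_query_param <- function(url, name, value) {
  parts <- strsplit(url, "?", fixed = TRUE)[[1]]       # base URL and query string
  pairs <- strsplit(parts[2], "&", fixed = TRUE)[[1]]  # "key=value" pairs
  keys  <- sub("=.*$", "", pairs)                      # text before the first "="
  pairs[keys == name] <- paste0(name, "=", value)      # overwrite the matching pair
  paste0(parts[1], "?", paste(pairs, collapse = "&"))
}

# Placeholder URL, used only for illustration
url <- "http://example.com/reviews?ie=UTF8&pageNumber=1&sortBy=x"
new_url <- set_query_param(url, "pageNumber", 4)
```

Matching on the parameter name rather than on the literal text "pageNumber=1" means the helper still works when the current page number is something other than 1.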



Answer 2:


Try this:

# inputs
URL1 <- "...whatever...&pageNumber=1"
i <- 2

URL2 <- sub("pageNumber=1", paste0("pageNumber=", i), URL1)

Or, using a Perl zero-width lookbehind regex:

URL2 <- sub("(?<=pageNumber=)1", i, URL1, perl = TRUE)

If we know that there is no 1 before pageNumber in the URL, as is the case here, then it simplifies to just replacing the first 1:

URL2 <- sub("1", i, URL1, fixed = TRUE)
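
To produce the whole sequence of pages asked for in the question, the lookbehind version can be wrapped in a loop. The sketch below is one possible way to do that, not the answerer's code: set_page is a hypothetical helper, the example.com URL is a placeholder, and the pattern is generalized to match any run of digits after pageNumber=. A loop is needed because sub() uses only the first element of its replacement argument.

```r
# Hypothetical helper: set the pageNumber parameter to `page`,
# whatever number is currently there.
set_page <- function(url, page) {
  # The lookbehind requires "?pageNumber=" or "&pageNumber=" just before the digits
  sub("(?<=[?&]pageNumber=)[0-9]+", as.character(page), url, perl = TRUE)
}

# Placeholder URL, used only for illustration
url <- "http://example.com/reviews?ie=UTF8&pageNumber=1&showViewpoints=0"

# One URL per page number
urls <- vapply(1:3, function(p) set_page(url, p), character(1))
```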



Answer 3:


Another very simple approach is to use sprintf:

sprintf('http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=%s&showViewpoints=0&sortBy=bySubmissionDateDescending', 
        1:10)

In the above code, the %s in the string provided as the first argument is replaced by each element of the vector provided in the second argument, in turn.

See ?sprintf for more details about this very handy string manipulation function.
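
A smaller sketch of the same idea, with two details worth knowing: %d can be used instead of %s when the page numbers are integers, and a literal percent sign in the template (for example in a percent-encoded URL) has to be written as %%. The example.com URL and its q parameter are placeholders invented for this illustration.

```r
# Placeholder template: "%%20" yields a literal "%20", "%d" is the page slot
template <- "http://example.com/search?q=a%%20b&pageNumber=%d"

# sprintf() is vectorized over its arguments, so this gives one URL per page
urls <- sprintf(template, 1:3)
```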




Answer 4:


The simplest approach is to split the string into parts:

var part1 = "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=";

var number = 1;

var part2 = "&showViewpoints=0&sortBy=bySubmissionDateDescending";

var link = part1 + number + part2;

Another approach is to use string.replace("pageNumber=1", "pageNumber=2").

A third option is to use a regex, but I'm not good with those, so you'll have to do some googling.
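
Since the question asks about R, the split-and-concatenate idea and the plain string replacement could be written roughly as follows. This is a sketch of an R equivalent, not the answerer's code; part1 and part2 are the two pieces of the URL from the question.

```r
# The URL from the question, split around the page number
part1 <- "http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber="
part2 <- "&showViewpoints=0&sortBy=bySubmissionDateDescending"

# paste0() joins without a separator and is vectorized over the page numbers
links <- paste0(part1, 1:3, part2)

# Plain (non-regex) replacement, the counterpart of string.replace()
link2 <- sub("pageNumber=1", "pageNumber=2", links[1], fixed = TRUE)
```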




Answer 5:


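I have figured it out now; here is the code.

# paste0() joins without the space that paste() inserts by default
listurl <- paste0("http://www.amazon.com/Apple-generation-Tablet-processor-White/product-reviews/B0047DVWLW/ref=cm_cr_pr_btm_link_2?ie=UTF8&pageNumber=", 1:218)
ipadlisturl <- paste0(listurl, "&showViewpoints=0&sortBy=bySubmissionDateDescending")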


Source: https://stackoverflow.com/questions/22393694/how-to-replace-one-part-in-url-by-using-r
