I'm trying to convert, for example, '9¼"' to '9.25' but cannot seem to read the fraction correctly.
Here's the data I'm working with:
library(XM
You can try transliterating the Unicode characters to ASCII directly while reading the table, by passing a custom element function (the elFun argument) to readHTMLTable:
library(stringi)
readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
              elFun = function(node) {
                val <- xmlValue(node)
                stri_trans_general(val, "latin-ascii")
              })
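To see what the transliteration does on its own, here is a minimal sketch on a single hard-coded string rather than the scraped table. The sample value is made up for illustration, and the fraction is written as the escape \u00BC to sidestep source-file encoding problems. With the ICU Latin-ASCII rules that stringi uses, the ¼ glyph should come back as the plain characters 1/4, which is what the digit extraction relies on.

```r
library(stringi)

# Hypothetical sample cell: 9, VULGAR FRACTION ONE QUARTER (U+00BC), inch mark
raw <- "9\u00BC\""

# Transliterate Latin-script characters to their closest ASCII equivalents
ascii <- stri_trans_general(raw, "latin-ascii")

print(ascii)

# Confirm nothing outside ASCII is left in the result
print(stri_enc_isascii(ascii))
```

If the second line printed is not TRUE, the input was probably decoded with the wrong encoding before it reached stri_trans_general.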
You can then use @Metrics' suggestion to convert it to numbers.
For example, you could use @G. Grothendieck's function from this post to clean up the Arms data:
library(XML)
library(stringi)
library(gsubfn)
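# calc() is by @G. Grothendieck. It takes the digit groups of one measurement
# (whole number, numerator, denominator) and returns the decimal value.
calc <- function(s) {
  # Prepend 0 for a bare fraction; the trailing 0:1 makes a bare whole number add 0/1
  x <- c(if (length(s) == 2) 0, as.numeric(s), 0:1)
  x[1] + x[2] / x[3]
}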
url <- paste("http://mockdraftable.com/players/2014/", sep = "")
combine <- readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
                         elFun = function(node) {
                           val <- xmlValue(node)
                           stri_trans_general(val, "latin-ascii")
                         })
names(combine) <- c("Name", "Pos", "Hght", "Wght", "Arms", "Hands",
"Dash40yd", "Dash20yd", "Dash10yd", "Bench", "Vert", "Broad",
"Cone3", "ShortShuttle20")
sapply(strapplyc(gsub('\"',"",combine$Arms), "\\d+"), calc)
#[1] 30.000 31.500 30.000 31.750 31.875 29.875 31.000 31.000 30.250 33.000 32.500 31.625 32.875
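If you want to check the conversion step without hitting the website, the following sketch runs calc on a few hand-written values. The sample strings are hypothetical and assume the transliteration has already produced ASCII text such as 31 1/2". It also swaps gsubfn's strapplyc for base R's regmatches/gregexpr, purely so the snippet needs no extra packages.

```r
# calc is reproduced from the answer above (credited there to G. Grothendieck)
calc <- function(s) {
  x <- c(if (length(s) == 2) 0, as.numeric(s), 0:1)
  x[1] + x[2] / x[3]
}

# Hypothetical sample values, already transliterated to ASCII:
# a whole number, two mixed numbers, and a bare fraction
arms <- c('30"', '31 1/2"', '31 7/8"', '1/2"')

# Base-R stand-in for strapplyc: extract every run of digits per element
digits <- regmatches(arms, gregexpr("[0-9]+", arms))

# One numeric value per measurement
result <- sapply(digits, calc)
print(result)
# 30.000 31.500 31.875 0.500
```

The three cases calc handles are all visible here: one digit group is treated as a whole number, two groups as a bare fraction, and three groups as whole plus numerator over denominator.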
You may run into encoding issues depending on your machine (see the comments).