I am trying to run a simple program to extract tables from HTML code. However, there seems to be a memory issue with readHTMLTable in the XML package. Is there any way I c
As of XML 3.98-1.4 and R 3.1 on Win7, this problem can be solved perfectly by using the function free(). But it does not work with readHTMLTable(). The following code works perfectly.
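library(XML)
# Download the page once; the loop below re-parses the same text.
a <- readLines("http://en.wikipedia.org/wiki/2014_FIFA_World_Cup")
# Endless loop used only to show that memory use stays flat.
while (TRUE) {
  b <- xmlParse(paste(a, collapse = ""))
  # do something with b
  free(b)  # release the C-level document explicitly
}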
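One way to still get tables out while keeping free() in play might be to parse the document yourself, hand the parsed object to readHTMLTable(), and free it afterwards. The sketch below is only an illustration of that idea: read_tables_and_free is a hypothetical helper name, not part of the XML package, and the inline HTML string is made-up sample input so the example runs without network access.

```r
library(XML)

# Hypothetical helper (not from the XML package): parse, extract, then free.
read_tables_and_free <- function(html_text) {
  # Parse explicitly so we hold a reference to the document we can free later.
  doc <- htmlParse(html_text, asText = TRUE)
  # Free the C-level document when the function exits, even on error.
  on.exit(free(doc), add = TRUE)
  # readHTMLTable() accepts an already parsed document and returns
  # ordinary R data frames, which remain valid after free(doc).
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Made-up sample input standing in for a downloaded page.
html <- paste0(
  "<html><body><table>",
  "<tr><th>team</th><th>goals</th></tr>",
  "<tr><td>Germany</td><td>18</td></tr>",
  "</table></body></html>"
)

tables <- read_tables_and_free(html)
print(tables[[1]])
```

Whether this fully avoids the growth seen with readHTMLTable() on a URL is not something the sketch proves; it only makes sure the one document you parsed yourself is released.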
The xml2 package has similar issues, and the memory can be released by using the function xml_remove() followed by gc().