Using the following documentation, I have been trying to scrape a series of tables from marketwatch.com.
Here is the one represented by the code below:
That website doesn't use an HTML table, so html_table() can't find anything. It actually uses divs with the classes "column" and "data lastcolumn".
So you can do something like:
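library(rvest)  # provides read_html(), html_nodes() and the %>% pipe
url <- "http://www.marketwatch.com/investing/stock/IRS/profile"
# Download and parse the page once, then reuse it for both queries
page <- read_html(url)
# Nodes holding the row labels
valuation_col <- page %>%
html_nodes(xpath='//*[@class="column"]')
# Nodes holding the corresponding values
valuation_data <- page %>%
html_nodes(xpath='//*[@class="data lastcolumn"]')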
Or even:
url %>%
read_html() %>%
html_nodes(xpath='//*[@class="section"]')
That should get you most of the way there.
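For the remaining step, here is a minimal sketch of how the label and value nodes might be turned into a data frame. It runs against a small inline HTML snippet rather than the live site; the snippet's markup, the metric names and values, and the names `sample_html`, `labels`, `values` and `valuation` are all assumptions made for illustration, so the real page may need different selectors.

```r
library(rvest)

# Hypothetical markup that mimics the classes mentioned above;
# the real page structure may differ.
sample_html <- '<div class="section">
  <p class="column">Metric A</p><p class="data lastcolumn">1.23</p>
  <p class="column">Metric B</p><p class="data lastcolumn">4.56</p>
</div>'

page <- read_html(sample_html)

# Pull the text out of the label nodes and the value nodes
labels <- page %>%
  html_nodes(xpath = '//*[@class="column"]') %>%
  html_text(trim = TRUE)
values <- page %>%
  html_nodes(xpath = '//*[@class="data lastcolumn"]') %>%
  html_text(trim = TRUE)

# Pairing by position only makes sense if both vectors line up
stopifnot(length(labels) == length(values))

valuation <- data.frame(metric = labels, value = values,
                        stringsAsFactors = FALSE)
print(valuation)
```

Against the live page you would pass the URL to read_html() instead of the inline string. The length check is worth keeping, because pairing by position silently misaligns rows if a label has no matching value.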
Please also read their terms of use - particularly 3.4.