getFinancials (quantmod) and tq_get (tidy quant) not working?

前端未结

关注

 3  1920

I\'m getting the same error in both quantmod and tinyquant for financials data. Can anyone see if this is reproducable? Is this a google finance server issue? None of the bel

相关标签:

3条回答

刺人心

2021-01-23 02:14

I tweaked the scrapy_stocks function to accommodate the Yahoo page update. I haven't thoroughly vetted this solution, but it seems to work well in all my trials thus far. Please be aware of two things:

I don't think this would work if you have Yahoo Premium. I don't have it, so I can't test it. But if you do, it shouldn't be too difficult to update.
I don't have a lot of experience with rvest, but because of the nature of the page, it had to set the function such that if there is one value that is missing, the entire row is missing.

Try this:

scrapy_stocks2 <- function(stock){
  if ("rvest" %in% installed.packages()) {
    library(rvest)
  }else{
    install.packages("rvest")
    library(rvest)
  }
  if ("xml2" %in% installed.packages()) {
    library(xml2)
  }else{
    install.packages("xml2")
    library(xml2)
  }
  for (stocknum in 1:length(stock)) {
    tryCatch(
      {
        # Income Statement
        url <- "https://finance.yahoo.com/quote/"
        url <- paste0(url,stock[stocknum],"/financials?p=",stock[stocknum])
        wahis.session <- html_session(url)  

        nodes <- wahis.session %>%
          html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]//span')

        yh_data <- nodes %>% 
          xml_text() %>% 
          gsub(pattern = ',', replacement = '')
        colnums <- 1:6
        col_nms <- yh_data[colnums]
        yh_data <- yh_data[-colnums]

        lab_inds <- nodes %>% 
          html_attr(name = 'class') == "Va(m)"
        lab_inds[is.na(lab_inds)] <- FALSE

        lab_inds <- lab_inds[-colnums]
        data <- matrix(NA, nrow = sum(lab_inds), ncol = 5, dimnames = list(yh_data[lab_inds], col_nms[-1]))
        row_num <- 1
        for (i in 2:(length(lab_inds)-4)) {
          t_ind <- !lab_inds[i:(i+4)]
          if (sum(t_ind) == 5) {
            data[row_num, 1:5] <- as.numeric(yh_data[i:(i+4)])
          }
          if (lab_inds[i]) {
            row_num <- row_num+1
          }
        }

        temp1 <- as.data.frame(data)
        print(paste(stock[stocknum],'   Income Statement Success'))

        # Balance Sheet
        url <- "https://finance.yahoo.com/quote/"
        url <- paste0(url,stock[stocknum],"/balance-sheet?p=",stock[stocknum])
        wahis.session <- html_session(url)  

        nodes <- wahis.session %>%
          html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]/div[1]/div[1]//span')

        yh_data <- nodes %>% 
          xml_text() %>% 
          gsub(pattern = ',', replacement = '')

        colnums <- 1:5
        col_nms <- yh_data[colnums]
        yh_data <- yh_data[-colnums]

        lab_inds <- nodes %>% 
          html_attr(name = 'class') == "Va(m)"

        lab_inds[is.na(lab_inds)] <- FALSE

        lab_inds <- lab_inds[-colnums]
        data <- matrix(NA, nrow = sum(lab_inds), ncol = 4, dimnames = list(yh_data[lab_inds], col_nms[-1]))
        row_num <- 1
        for (i in 2:(length(lab_inds)-3)) {
          t_ind <- !lab_inds[i:(i+3)]
          if (sum(t_ind) == 4) {
            data[row_num, 1:4] <- as.numeric(yh_data[i:(i+3)])
          }
          if (lab_inds[i]) {
            row_num <- row_num+1
          }
        }

        temp2 <- as.data.frame(data)

        print(paste(stock[stocknum],'   Balance Sheet Success'))

        # Cash Flow
        url <- "https://finance.yahoo.com/quote/"
        url <- paste0(url,stock[stocknum],"/cash-flow?p=",stock[stocknum])
        wahis.session <- html_session(url)
        nodes <- wahis.session %>%
          html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]/div[1]/div[1]//span')

        yh_data <- nodes %>% 
          xml_text() %>% 
          gsub(pattern = ',', replacement = '')
        colnums <- 1:6
        col_nms <- yh_data[colnums]
        yh_data <- yh_data[-colnums]

        lab_inds <- nodes %>% 
          html_attr(name = 'class') == "Va(m)"
        lab_inds[is.na(lab_inds)] <- FALSE

        lab_inds <- lab_inds[-colnums]
        data <- matrix(NA, nrow = sum(lab_inds), ncol = 5, dimnames = list(yh_data[lab_inds], col_nms[-1]))
        row_num <- 1
        for (i in 2:(length(lab_inds)-4)) {
          t_ind <- !lab_inds[i:(i+4)]
          if (sum(t_ind) == 5) {
            data[row_num, 1:5] <- as.numeric(yh_data[i:(i+4)])
          }
          if (lab_inds[i]) {
            row_num <- row_num+1
          }
        }

        temp3 <- as.data.frame(data)

        print(paste(stock[stocknum],'   Cash Flow Statement Success'))

        assign(paste0(stock[stocknum],'.f'),value = list(IS = temp1,BS = temp2,CF = temp3),envir = parent.frame())

      },
      error = function(cond){
        message(stock[stocknum], "Give error ",cond)
      }
    )
  }
}

0 讨论(0)

情深已故

2021-01-23 02:28

Hi @Joe I faced the same problem, because google change its page, so I wrote a function to get data from Yahoo Finance. Its output is similar to getFin. I hope it can help you.

scrapy_stocks <- function(stock){
    if ("rvest" %in% installed.packages()) {
            library(rvest)
    }else{
            install.packages("rvest")
            library(rvest)
    }
    for (i in 1:length(stock)) {
            tryCatch(
                    {
                            url <- "https://finance.yahoo.com/quote/"
                            url <- paste0(url,stock[i],"/financials?p=",stock[i])
                            wahis.session <- html_session(url)                                
                            p <-    wahis.session %>%
                                    html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                    html_table(fill = TRUE)
                            IS <- p[[1]]
                            colnames(IS) <- paste(IS[1,])
                            IS <- IS[-c(1,5,12,20,25),]
                            names_row <- paste(IS[,1])
                            IS <- IS[,-1]
                            IS <- apply(IS,2,function(x){gsub(",","",x)})
                            IS <- as.data.frame(apply(IS,2,as.numeric))
                            rownames(IS) <- paste(names_row)
                            temp1 <- IS
                            url <- "https://finance.yahoo.com/quote/"
                            url <- paste0(url,stock[i],"/balance-sheet?p=",stock[i])
                            wahis.session <- html_session(url)
                            p <-    wahis.session %>%
                                    html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                    html_table(fill = TRUE)
                            BS <- p[[1]]
                            colnames(BS) <- BS[1,]
                            BS <- BS[-c(1,2,17,28),]
                            names_row <- BS[,1]
                            BS <- BS[,-1] 
                            BS <- apply(BS,2,function(x){gsub(",","",x)})
                            BS <- as.data.frame(apply(BS,2,as.numeric))
                            rownames(BS) <- paste(names_row)
                            temp2 <- BS
                            url <- "https://finance.yahoo.com/quote/"
                            url <- paste0(url,stock[i],"/cash-flow?p=",stock[i])
                            wahis.session <- html_session(url)
                            p <-    wahis.session %>%
                                    html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                    html_table(fill = TRUE)
                            CF <- p[[1]]
                            colnames(CF) <- CF[1,]
                            CF <- CF[-c(1,3,11,16),]
                            names_row <- CF[,1]
                            CF <- CF[,-1] 
                            CF <- apply(CF,2,function(x){gsub(",","",x)})
                            CF <- as.data.frame(apply(CF,2,as.numeric))
                            rownames(CF) <- paste(names_row)
                            temp3 <- CF
                            assign(paste0(stock[i],'.f'),value = list(IS = temp1,BS = temp2,CF = temp3),envir = parent.frame())

                    },
                    error = function(cond){
                            message(stock[i], "Give error ",cond)
                    }
            )
    }
}

You can call it as scrapy_stocks(c("AAPL","GOOGL")) and access its data as AAPL.f$IS,AAPL.f$BS or AAPL.f$CF.

0 讨论(0)

余生分开走

2021-01-23 02:29

Yes I get the same issue for the past couple of days as well. I think it may have to do with a change on the part of Google Finance. The site is now different and url as well.

0 讨论(0)
发布评论:

提交评论
- 加载中...