getFinancials (quantmod) and tq_get (tidyquant) not working?

青春惊慌失措 2021-01-23 01:32

I'm getting the same error in both quantmod and tidyquant for financials data. Can anyone check whether this is reproducible? Is this a Google Finance server issue? None of the bel

3 Answers
  • 2021-01-23 02:14

    I tweaked the scrapy_stocks function to accommodate the Yahoo page update. I haven't thoroughly vetted this solution, but it seems to work well in all my trials thus far. Please be aware of two things:

    1. I don't think this would work if you have Yahoo Premium. I don't have it, so I can't test it. But if you do, it shouldn't be too difficult to update.
    2. I don't have a lot of experience with rvest, but because of the nature of the page, I had to write the function so that if any one value in a row is missing, the entire row is set to missing (NA).

    Try this:

    scrapy_stocks2 <- function(stock){
      # Install rvest and xml2 if they are missing, then attach them
      for (pkg in c("rvest", "xml2")) {
        if (!requireNamespace(pkg, quietly = TRUE)) {
          install.packages(pkg)
        }
        library(pkg, character.only = TRUE)
      }
      for (stocknum in seq_along(stock)) {
        tryCatch(
          {
            # Income Statement
            url <- "https://finance.yahoo.com/quote/"
            url <- paste0(url,stock[stocknum],"/financials?p=",stock[stocknum])
            wahis.session <- html_session(url)  
    
            nodes <- wahis.session %>%
              html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]//span')
    
            yh_data <- nodes %>% 
              xml_text() %>% 
              gsub(pattern = ',', replacement = '')
            colnums <- 1:6
            col_nms <- yh_data[colnums]
            yh_data <- yh_data[-colnums]
    
            lab_inds <- nodes %>% 
              html_attr(name = 'class') == "Va(m)"
            lab_inds[is.na(lab_inds)] <- FALSE
    
            lab_inds <- lab_inds[-colnums]
            data <- matrix(NA, nrow = sum(lab_inds), ncol = 5, dimnames = list(yh_data[lab_inds], col_nms[-1]))
            row_num <- 1
            for (i in 2:(length(lab_inds)-4)) {
              t_ind <- !lab_inds[i:(i+4)]
              if (sum(t_ind) == 5) {
                data[row_num, 1:5] <- as.numeric(yh_data[i:(i+4)])
              }
              if (lab_inds[i]) {
                row_num <- row_num+1
              }
            }
    
            temp1 <- as.data.frame(data)
            print(paste(stock[stocknum],'   Income Statement Success'))
    
            # Balance Sheet
            url <- "https://finance.yahoo.com/quote/"
            url <- paste0(url,stock[stocknum],"/balance-sheet?p=",stock[stocknum])
            wahis.session <- html_session(url)  
    
            nodes <- wahis.session %>%
              html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]/div[1]/div[1]//span')
    
            yh_data <- nodes %>% 
              xml_text() %>% 
              gsub(pattern = ',', replacement = '')
    
            colnums <- 1:5
            col_nms <- yh_data[colnums]
            yh_data <- yh_data[-colnums]
    
            lab_inds <- nodes %>% 
              html_attr(name = 'class') == "Va(m)"
    
            lab_inds[is.na(lab_inds)] <- FALSE
    
            lab_inds <- lab_inds[-colnums]
            data <- matrix(NA, nrow = sum(lab_inds), ncol = 4, dimnames = list(yh_data[lab_inds], col_nms[-1]))
            row_num <- 1
            for (i in 2:(length(lab_inds)-3)) {
              t_ind <- !lab_inds[i:(i+3)]
              if (sum(t_ind) == 4) {
                data[row_num, 1:4] <- as.numeric(yh_data[i:(i+3)])
              }
              if (lab_inds[i]) {
                row_num <- row_num+1
              }
            }
    
            temp2 <- as.data.frame(data)
    
            print(paste(stock[stocknum],'   Balance Sheet Success'))
    
            # Cash Flow
            url <- "https://finance.yahoo.com/quote/"
            url <- paste0(url,stock[stocknum],"/cash-flow?p=",stock[stocknum])
            wahis.session <- html_session(url)
            nodes <- wahis.session %>%
              html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[4]/div[1]/div[1]//span')
    
            yh_data <- nodes %>% 
              xml_text() %>% 
              gsub(pattern = ',', replacement = '')
            colnums <- 1:6
            col_nms <- yh_data[colnums]
            yh_data <- yh_data[-colnums]
    
            lab_inds <- nodes %>% 
              html_attr(name = 'class') == "Va(m)"
            lab_inds[is.na(lab_inds)] <- FALSE
    
            lab_inds <- lab_inds[-colnums]
            data <- matrix(NA, nrow = sum(lab_inds), ncol = 5, dimnames = list(yh_data[lab_inds], col_nms[-1]))
            row_num <- 1
            for (i in 2:(length(lab_inds)-4)) {
              t_ind <- !lab_inds[i:(i+4)]
              if (sum(t_ind) == 5) {
                data[row_num, 1:5] <- as.numeric(yh_data[i:(i+4)])
              }
              if (lab_inds[i]) {
                row_num <- row_num+1
              }
            }
    
            temp3 <- as.data.frame(data)
    
            print(paste(stock[stocknum],'   Cash Flow Statement Success'))
    
            assign(paste0(stock[stocknum],'.f'),value = list(IS = temp1,BS = temp2,CF = temp3),envir = parent.frame())
    
          },
          error = function(cond){
            # Report the ticker together with the text of the error
            message(stock[stocknum], " gave an error: ", conditionMessage(cond))
          }
        )
      }
    }
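
    The row rule mentioned in point 2 can be illustrated offline. The following is a minimal sketch on made-up data, not the exact loop used in scrapy_stocks2: the names cells, is_label, n_cols and fill_rows are hypothetical, and the values are invented. A row is filled only when exactly n_cols values sit between its label and the next label; otherwise the whole row stays NA.

    ```r
    # Synthetic stand-in for the scraped span texts: each label is followed by its values.
    # "Gross Profit" has only two values, one short of the three expected.
    cells    <- c("Total Revenue", "100", "90", "80",
                  "Gross Profit",  "40",  "35",
                  "Net Income",    "10",  "9",  "8")
    is_label <- c(TRUE, FALSE, FALSE, FALSE,
                  TRUE, FALSE, FALSE,
                  TRUE, FALSE, FALSE, FALSE)
    n_cols <- 3

    fill_rows <- function(cells, is_label, n_cols) {
      label_pos <- which(is_label)
      out <- matrix(NA_real_, nrow = length(label_pos), ncol = n_cols,
                    dimnames = list(cells[label_pos], NULL))
      # A row's values run from just after its label to just before the next label
      row_end <- c(label_pos[-1] - 1, length(cells))
      for (r in seq_along(label_pos)) {
        n_vals <- row_end[r] - label_pos[r]
        if (n_vals == n_cols) {
          out[r, ] <- as.numeric(cells[(label_pos[r] + 1):row_end[r]])
        }
        # Otherwise the row is left as NA: one missing value blanks the whole row
      }
      out
    }

    res <- fill_rows(cells, is_label, n_cols)
    print(res)
    ```

    Here "Total Revenue" and "Net Income" are filled, while "Gross Profit" stays NA across all columns because one of its values is absent.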
    
    
    
  • 2021-01-23 02:28

    Hi @Joe, I faced the same problem because Google changed its page, so I wrote a function to get the data from Yahoo Finance instead. Its output is similar to that of getFin. I hope it helps.

    scrapy_stocks <- function(stock){
        # Install rvest if it is missing, then attach it
        if (!requireNamespace("rvest", quietly = TRUE)) {
                install.packages("rvest")
        }
        library(rvest)
        for (i in seq_along(stock)) {
                tryCatch(
                        {
                                url <- "https://finance.yahoo.com/quote/"
                                url <- paste0(url,stock[i],"/financials?p=",stock[i])
                                wahis.session <- html_session(url)                                
                                p <-    wahis.session %>%
                                        html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                        html_table(fill = TRUE)
                                IS <- p[[1]]
                                colnames(IS) <- paste(IS[1,])
                                IS <- IS[-c(1,5,12,20,25),]
                                names_row <- paste(IS[,1])
                                IS <- IS[,-1]
                                IS <- apply(IS,2,function(x){gsub(",","",x)})
                                IS <- as.data.frame(apply(IS,2,as.numeric))
                                rownames(IS) <- paste(names_row)
                                temp1 <- IS
                                url <- "https://finance.yahoo.com/quote/"
                                url <- paste0(url,stock[i],"/balance-sheet?p=",stock[i])
                                wahis.session <- html_session(url)
                                p <-    wahis.session %>%
                                        html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                        html_table(fill = TRUE)
                                BS <- p[[1]]
                                colnames(BS) <- BS[1,]
                                BS <- BS[-c(1,2,17,28),]
                                names_row <- BS[,1]
                                BS <- BS[,-1] 
                                BS <- apply(BS,2,function(x){gsub(",","",x)})
                                BS <- as.data.frame(apply(BS,2,as.numeric))
                                rownames(BS) <- paste(names_row)
                                temp2 <- BS
                                url <- "https://finance.yahoo.com/quote/"
                                url <- paste0(url,stock[i],"/cash-flow?p=",stock[i])
                                wahis.session <- html_session(url)
                                p <-    wahis.session %>%
                                        html_nodes(xpath = '//*[@id="Col1-1-Financials-Proxy"]/section/div[3]/table')%>%
                                        html_table(fill = TRUE)
                                CF <- p[[1]]
                                colnames(CF) <- CF[1,]
                                CF <- CF[-c(1,3,11,16),]
                                names_row <- CF[,1]
                                CF <- CF[,-1] 
                                CF <- apply(CF,2,function(x){gsub(",","",x)})
                                CF <- as.data.frame(apply(CF,2,as.numeric))
                                rownames(CF) <- paste(names_row)
                                temp3 <- CF
                                assign(paste0(stock[i],'.f'),value = list(IS = temp1,BS = temp2,CF = temp3),envir = parent.frame())
    
                        },
                        error = function(cond){
                                # Report the ticker together with the text of the error
                                message(stock[i], " gave an error: ", conditionMessage(cond))
                        }
                )
        }
    }
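
    The number cleanup inside the function (strip thousands separators, then convert) can be tried on its own. This is a small sketch on invented strings; the "-" entry is an assumed placeholder for a missing cell, not something confirmed from the Yahoo page.

    ```r
    # Invented sample cells; "-" is an assumed placeholder for a missing value
    raw <- c("1,234,567", "-", "89", "-4,500")

    # Remove the commas, then convert; anything non-numeric becomes NA.
    # suppressWarnings() hides the "NAs introduced by coercion" warning.
    clean <- suppressWarnings(as.numeric(gsub(",", "", raw, fixed = TRUE)))
    print(clean)
    ```

    Without suppressWarnings(), as.numeric() would emit a coercion warning for each column that contains such a placeholder.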
    

    You can call it as scrapy_stocks(c("AAPL","GOOGL")) and access its data as AAPL.f$IS,AAPL.f$BS or AAPL.f$CF.
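
    The objects AAPL.f and GOOGL.f appear in your workspace because the function calls assign() with envir = parent.frame(). A minimal offline sketch of that mechanism follows; fake_scrape and fake_scrape_list are hypothetical stand-ins filled with dummy values, not part of the function above. The second variant returns a named list instead, which some people prefer because nothing is created in the caller's environment behind their back.

    ```r
    # Hypothetical stand-in for scrapy_stocks(): no network access, dummy values only
    fake_scrape <- function(stock) {
      for (s in stock) {
        dummy <- list(IS = data.frame(x = 1), BS = data.frame(x = 2), CF = data.frame(x = 3))
        # Same side effect as in the answer: create "<ticker>.f" in the caller's environment
        assign(paste0(s, ".f"), value = dummy, envir = parent.frame())
      }
      invisible(NULL)
    }

    fake_scrape(c("AAPL", "GOOGL"))
    print(exists("AAPL.f"))
    print(names(GOOGL.f))

    # Alternative: return a named list and let the caller decide where to keep it
    fake_scrape_list <- function(stock) {
      out <- lapply(stock, function(s) {
        list(IS = data.frame(x = 1), BS = data.frame(x = 2), CF = data.frame(x = 3))
      })
      setNames(out, stock)
    }

    fin <- fake_scrape_list(c("AAPL", "GOOGL"))
    print(fin$AAPL$IS)
    ```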

  • 2021-01-23 02:29

    Yes, I have had the same issue for the past couple of days as well. I think it may have to do with a change on Google Finance's side: the site is now different, and so is the URL.
