问题
I am interested in analyzing balance sheets and income statements using R. I have seen that there are R packages that pull information from Yahoo and Google Finance, but all the examples I have seen concern historical stock price information. Is there a way I can pull historical information from balance sheets and income statements using R?
回答1:
You are making the common mistake of confusing 'access to Yahoo or Google data' with 'everything I see on Yahoo or Google Finance can be downloaded'.
When R functions download historical stock price data, they almost always access an interface explicitly designed for this purpose as e.g. a cgi handler providing csv files given a stock symbol and start and end date. So this easy as all we need to do is form the appropriate query, hit the webserver, fetch the csv file an dparse it.
Now balance sheet information is (as far as I know) not available in such an interface. So you will need to 'screen scrape' and parse the html directly.
It is not clear that R is the best tool for this. I am aware of some Perl modules for the purpose of getting non-time-series data off Yahoo Finance but have not used them.
回答2:
I have found on the net only a partial solution to your issue, for I managed to retrieve only the Balance sheet info and financial statement for one year. I don't know how to do it for more years. There is a package in R, called quantmod, which you can install from CRAN
install.packages('quantmod')
Then you can do the following: Suppose you want to get the financial info from a company listed at NYSE : General Electric. ticker: GE
library(quantmod)
getFinancials('GE')
viewFinancials(GE.f)
To get only the income statement, reported anually, as a data frame use this:
viewFinancials(GE.f, "IS", "A")
Please let me know if you find out how to do this for multiple years.
回答3:
The question you want to ask, and get an answer to!, is where can I get free XBRL data for analysing corporate balance sheets, and is there a library for consuming such data in R?
XBRL (Extensible Business Reporting Language - http://en.wikipedia.org/wiki/XBRL) is a standard for marking up accounting statments (income statements, balance sheets, profit & loss statements) in XML format such that they can easily be parsed by computer and put into a spreadsheet.
As far as I know, a lot of corporate regulators (e.g. the SEC in the US, ASIC in Australia) are encouraging the companies under their jurisdiction to report using such a format, or running pilots, but I don't believe it has been mandated at this point. If you limited your investment universe (I am assuming you want this data in electronic format for investment purposes) to firms that have made their quarterly reports freely available in XBRL form, I expect you will have a pretty short list of firms to invest in!
Bloomberg, Reuters et al all have pricey feeds for obtaining corporate fundamental data. There may also be someone out there running a tidy business publishing balance sheets in XBRL format. Cheaper, but still paid for, are XIgnite's xFundamentals and xGlobalFundamentals web services, but you aren't getting full balance sheet data from them.
回答4:
to read-in the financial information try this function ( I picked it up several months ago and made some small adjustments)
require(XML)
require(plyr)
getKeyStats_xpath <- function(symbol) {
yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")
#search for <td> nodes anywhere that have class 'yfnc_tablehead1'
nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")
if(length(nodes) > 0 ) {
measures <- sapply(nodes, xmlValue)
#Clean up the column name
measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))
#Remove dups
dups <- which(duplicated(measures))
#print(dups)
for(i in 1:length(dups))
measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")
#use siblings function to get value
values <- sapply(nodes, function(x) xmlValue(getSibling(x)))
df <- data.frame(t(values))
colnames(df) <- measures
return(df)
} else {
break
}
}
to use it, compare for example 3 companies and write the data into a csv-file do the following:
tickers <- c("AAPL","GOOG","F")
stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)
Just tried it. Still working.
UPDATE as Yahoo changed it’s web site layout:
The function above does not work anymore as Yahoo again changed its web site layout. Fortunately its still easy to get the financial infos as the tags for getting fundamental data have not been changed. example for downloading a file with eps and P/E ratio for MSFT, AAPL and Ford insert the following into your browser:
http://finance.yahoo.com/d/quotes.csv?s=MSFT+AAPL+F&f=ser
and after entering the above URL into your browser’s address bar and hitting return/enter. The CSV will be automatically downloaded to your computer and you should get the cvs file as shown below (data as 7/22/2016):
some yahoo tags for fundamental data:
回答5:
Taking the last two comments into consideration, you may be able to acquire corporate financial statements economically using EdgardOnline. It isn't free, but is less expensive than Bloomberg and Reuters. Another thing to consider is financial reporting normalization/standardized. Just because two companies are in the same industry and sell similar products does not necessarily mean that if you laid the two companies' income statements or balance sheets side by side, that reporting items would necessarily line up. Compustat has normalized/standardized financial reports.
回答6:
I don't know anything about R, but assuming that it can call a REST API and consume data in XML form, you can try the Mergent Company Fundamentals API at http://www.mergent.com/servius/ - there's lots of very detailed financial statement data (balance sheets / income statements / cashflow statements / ratios), standardized across companies, going back more than 20 years
回答7:
I have written a C# program that I think does what you want. It parses the html from nasdaq.com pages. It parses html and creates 1 csv file per stock that includes income statement, cash flow, and balance sheet values going back 5 - 10 years depending on the age of the stock. I am now working to add some analysis calculations (mostly historic ratios at this point). I'm interested in learning about R and it's applications to fundamental analysis. Maybe we can help each other.
回答8:
I recently found this R package on CRAN. Which does exactly what you are asking I believe.
XBRL: Extraction of business financial information from XBRL documents
回答9:
You can get all three types of financial statements from Intrinio in R for free. Additionally, you can get as reported statements and standardized statements. The problem with pulling XBRL filings from the SEC is that there is no standardized option, which means you have to manually map financial statement items if you want to do cross equity comparisons. Here is an example:
#Install httr, which you need to request data via API
install.packages("httr")
require("httr")
#Install jsonlite which parses JSON
install.packages("jsonlite")
require("jsonlite")
#Create variables for your usename and password, get those at intrinio.com/login
username <- "Your_API_Username"
password <- "Your_API_Password"
#Making an api call for roic. This puts together the different parts of the API call
base <- "https://api.intrinio.com/"
endpoint <- "financials/"
type <- "standardized"
stock <- "YUM"
statement <- "income_statement"
fiscal_period <- "Q2"
fiscal_year <- "2015"
#Pasting them together to make the API call
call1 <- paste(base,endpoint,type,"?","identifier","=", stock, "&","statement","=",statement,"&","fiscal_period",
"=", fiscal_period, "&", "fiscal_year", "=", fiscal_year, sep="")
# call1 Looks like this "https://api.intrinio.com/financials/standardized?identifier=YUM&statement=income_statement&fiscal_period=Q2&fiscal_year=2015"
#Now we use the API call to request the data from Intrinio's database
YUM_Income <- GET(call1, authenticate(username,password, type = "basic"))
#That gives us the ROIC value, but it isn't in a good format so we parse it
test1 <- unlist(content(YUM_Income, "text"))
#Convert from JSON to flattened list
parsed_statement <- fromJSON(test1)
#Then make your data frame:
df1 <- data.frame(parsed_statement)
I wrote this script to make it easy to change out the ticker, dates, and statement type so you can get the financial statement for any US company for any period.
回答10:
I actually do this in Google Sheets. I thought it to be the easiest way to do it as well and because it can pull real live data was another bonus point. Lastly it doesn't consume any of my space to save these statements.
=importhtml("http://investing.money.msn.com/investments/stock-income-statement/?symbol=US%3A"&B1&"&stmtView=Ann", "table",0)
where B1 cell contains the ticker.
You can do the same thing for balance sheet, and cash flow as well.
来源:https://stackoverflow.com/questions/2614767/using-r-to-analyze-balance-sheets-and-income-statements