Question
I am trying to extract financial statement information based on the type of statement.
Let me explain in a little more detail.
I want to extract the income statement, balance sheet and cash flow statement from an XBRL instance – especially US GAAP.
For me, the perfect solution would be to have tags in the XML file such that I can extract the income statement with the tag <incomestatement>, the balance sheet with <balancesheet>, and the cash flow statement with <cashflow>.
Please help me here. I am a novice and do not possess much background in XBRL.
Answer 1:
Fortunately, it is not that difficult to extract financial statements. Here is how I was able to extract income statement info:
- Use the Arelle web server to get the complete fact table, as shown below: http://localhost:8080/rest/xbrl/view?file=c:/Python/SEC-EDGAR/sec/2017/01/0001530425-0001477932-17-000505-xbrl.zip&view=factTable&media=xml
Replace the value of the file parameter with your own path. You can also supply a URL instead of a local path.
- Once you have the fact table in XML format, extract the role nodes. For the income statement, look for roles containing "StatementsOfOperations". There are a few variations of the income statement role ID, but not many.
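The two steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the Arelle web server is running on localhost:8080 as in the URL above, and because the exact layout of the fact table XML is not described here, the filter simply scans every element's text and attribute values for the role name. The function names are hypothetical, and the pattern only covers the "StatementsOfOperations" case mentioned above.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: Arelle's web server runs locally on port 8080, as in the answer.
BASE_URL = "http://localhost:8080/rest/xbrl/view"

# Only the role name mentioned in the answer; extend it for other variations.
INCOME_ROLE = re.compile(r"Statements?OfOperations", re.IGNORECASE)


def fact_table_url(filing, base_url=BASE_URL):
    """Build the fact-table request URL for a filing path or URL."""
    params = {"file": filing, "view": "factTable", "media": "xml"}
    # Keep ':' and '/' readable so the result looks like the URL in the answer.
    return base_url + "?" + urlencode(params, safe=":/")


def fetch_fact_table(filing, base_url=BASE_URL):
    """Download the fact table as text (requires the web server to be running)."""
    with urllib.request.urlopen(fact_table_url(filing, base_url)) as response:
        return response.read().decode("utf-8")


def find_matching_elements(xml_text, pattern=INCOME_ROLE):
    """Return every element whose text or any attribute value matches pattern.

    The scan is deliberately generic because the element and attribute names
    of the fact table are not assumed here.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        values = list(elem.attrib.values())
        if elem.text:
            values.append(elem.text)
        if any(pattern.search(value) for value in values):
            hits.append(elem)
    return hits
```

A typical use would be `find_matching_elements(fetch_fact_table(path))`, followed by inspecting the returned elements to see which attribute actually carries the role in your output; the pattern can then be tightened or extended for other statement types.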
Answer 2:
As far as I recall, the right place to look is the user-friendly labels associated with these roles.
The SEC places restrictions on what these labels look like (e.g., paragraph 6.7.12 of the Edgar Filing Manual), for example: 02 - Statement - Balance Sheet. The income statement, cash flow statement and balance sheet are commonly found in labels with Statement (as opposed to Disclosure, Document, or Schedule) between the two dashes.
The third part of the label tells you which role holds the income statement, cash flow statement or balance sheet, but the exact labels vary between filers. There are also several kinds of each statement (consolidated vs. unconsolidated, classified vs. unclassified, etc.), and the same filing may contain several versions (for example, both consolidated and unconsolidated), so you need some domain expertise to decide which one you need.
In a nutshell, you will need to do some trial and error on real filings in order to find the right algorithm to filter these labels.
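As a starting point for that trial and error, here is a minimal Python sketch that splits labels of the form "02 - Statement - Balance Sheet" into their three parts and keeps only those whose middle part is Statement. The function names and the keyword list in `guess_statement_kind` are hypothetical assumptions for illustration, not part of any SEC rule, and will need tuning against real filings.

```python
import re

# Label layout described above: "<sort code> - <type> - <title>".
ROLE_LABEL = re.compile(
    r"^\s*(?P<code>\S+)\s+-\s+(?P<type>[^-]+?)\s+-\s+(?P<title>.+?)\s*$"
)

# Hypothetical keywords, checked in order; real filers use many variations.
KEYWORDS = [
    ("cash_flow", ("cash flow",)),
    ("balance_sheet", ("balance sheet", "financial position")),
    ("income_statement", ("operations", "income", "earnings")),
]


def parse_role_label(label):
    """Split a label into (code, type, title), or return None if it does not fit."""
    match = ROLE_LABEL.match(label)
    if match is None:
        return None
    return match.group("code"), match.group("type"), match.group("title")


def statement_roles(labels):
    """Keep only labels whose middle part is 'Statement' (case-insensitive)."""
    parsed = (parse_role_label(label) for label in labels)
    return [p for p in parsed if p is not None and p[1].lower() == "statement"]


def guess_statement_kind(title):
    """Rough keyword guess at the statement kind; returns None when unsure."""
    lowered = title.lower()
    for kind, words in KEYWORDS:
        if any(word in lowered for word in words):
            return kind
    return None
```

The keyword guess is intentionally crude: a title such as "Statements of Comprehensive Income" would be tagged as an income statement, and parenthetical or unconsolidated variants are not distinguished, which is exactly where the domain judgment mentioned above comes in.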
What should help you, though, is that Charles Hoffman has done some research on this, which can be found, for example, here (section 1.5).
Source: https://stackoverflow.com/questions/43543151/arelle-webserver-how-to-extract-the-income-statement-from-an-xbrl-filing