Reading xbrl with python

南楼画角 提交于 2019-12-03 21:49:23

Try it with a $ dollar sign at the end to indicate not to match anything else following the dollar sign:

liabilities = xbrl.find_all(name=re.compile("(us-gaap:Liabilities$)",
                          re.IGNORECASE | re.MULTILINE))

I'd be very wary about using this approach to parsing XBRL (or indeed, any XML with namespaces in it). "us-gaap:Liabilities" is a QName, consisting of a prefix ("us-gaap") and a local name ("Liabilities"). The prefix is just a shorthand for a full namespace URI such as "http://fasb.org/us-gaap/2015-01-31", which is defined by a namespace declaration, usually at the top of the document. If you look at the top of the document you'll see something like:

xmlns:us-gaap="http://fasb.org/us-gaap/2015-01-31"

This means that within the scope of this document, "us-gaap" is taken to mean that full namespace URI.

XML creators are free to use whatever prefixes they want, so there is no guarantee that the element will actually be called "us-gaap:Liabilities" across all documents that you encounter.

beautifulsoup4 has very limited support for namespaces, so I wouldn't recommend it as a starting point for building an XBRL processor. It may be worth taking a look at the Arelle project, which is a full XBRL processor, and will make it easier to do other tasks such as finding the labels and other information associated with facts in the taxonomy.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!