Data seems to be missing in BigQuery SEC Filing Dataset

Submitted by 不羁岁月 on 2020-07-23 06:18:11

Question


I was pleased recently to discover that BigQuery hosts a dataset of SEC filings. However, I am unable to find the actual text of the filings in the dataset. This seems so obvious that I must be missing something.

As an example, the 2018 Microsoft 10-K filing on the SEC website contains the full 10-K text, in which Item 7 includes the phrase "Management’s Discussion and Analysis of Financial Condition and Results". I searched for this phrase in the dataset.

First, the following query should pull all the text from this filing:

SELECT *
FROM `bigquery-public-data.sec_quarterly_financials.txt`
WHERE submission_number="0001564590-18-019062"

However, searching the results of this query for the above phrase finds nothing.
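A manual search of query results like the one described above can miss a phrase for mundane reasons, such as a typographic apostrophe (’) versus a straight one, or differences in case and spacing. The following is a minimal sketch of a more forgiving search, assuming the rows have already been fetched as plain dictionaries. The helper names `normalize` and `rows_containing`, the `value` column, and the sample rows are hypothetical and do not reflect real dataset content.

```python
import json
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, fold typographic apostrophes to ASCII, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    return " ".join(text.lower().split())


def rows_containing(rows, phrase):
    """Return the rows whose serialized content contains the phrase."""
    needle = normalize(phrase)
    return [
        row
        for row in rows
        # Serialize the whole row so every column is searched at once.
        if needle in normalize(json.dumps(row, ensure_ascii=False, default=str))
    ]


# Hypothetical rows standing in for query results (not real dataset content).
sample_rows = [
    {"submission_number": "0001564590-18-019062",
     "value": "Management's  Discussion and Analysis"},
    {"submission_number": "0001564590-18-019062", "value": "Revenue"},
]

matches = rows_containing(sample_rows, "Management\u2019s Discussion and Analysis")
print(len(matches))
```

If a search like this still finds nothing in the real rows, the phrase is most likely absent from the table rather than hidden by formatting differences.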

A second attempt, based on another StackOverflow answer, gave me the following query, which searches every table in the dataset for that phrase in case it is stored in a different table:

SELECT *
FROM `bigquery-public-data.sec_quarterly_financials.*` t
-- RE2 pattern: no slash delimiters, and no ^/$ anchors, so the phrase can match as a substring
WHERE REGEXP_CONTAINS(LOWER(TO_JSON_STRING(t)), r'discussion and analysis of financial condition')

No result!
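One thing worth ruling out with a query of this kind is the regular expression itself. BigQuery's `REGEXP_CONTAINS` uses RE2 syntax, in which slashes are ordinary characters rather than delimiters, and `^` and `$` anchor to the start and end of the string being searched. The sketch below uses Python's `re` module as a stand-in for RE2 (an assumption, although the two behave the same for patterns this simple) and a made-up sample string to show that a slash-delimited, anchored pattern cannot match, while the bare phrase matches as a substring.

```python
import re

# Made-up sample text, lowercased the way LOWER(TO_JSON_STRING(t)) would be.
text = (
    "item 7. management's discussion and analysis of financial "
    "condition and results of operations"
)

# With slashes and anchors kept: "/" is a literal character, and "^" can
# never hold immediately after a "/" has been consumed, so nothing matches.
delimited = re.compile(r"/^discussion and analysis of financial condition$/")

# Bare phrase: matches anywhere inside the searched string.
bare = re.compile(r"discussion and analysis of financial condition")

print(bool(delimited.search(text)))
print(bool(bare.search(text)))
```

An empty result from the bare pattern would be meaningful evidence that the phrase is not in the dataset. An empty result from the delimited pattern says nothing either way.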

I can clearly find the same SEC filing, and yet the content within it seems to be missing. I've searched for other phrases and sections too, and the text does not seem to be there. Yet, based on all the Google documentation, I think it should be. What am I missing?

Alternatively, does anyone know of another source for parsing sections of SEC 10-K filings or similar documents? That would be useful too, and you could answer this question with it.

Source: https://stackoverflow.com/questions/62706179/data-seems-to-be-missing-in-bigquery-sec-filing-dataset
