Question
I was pleased to discover recently that BigQuery hosts a dataset of SEC filings. However, I cannot find the actual text of the filings in the dataset. This seems so obvious that I must be missing something.
As an example, the 2018 Microsoft 10-K filing on the SEC website contains the full 10-K text, in which Item 7 includes the phrase "Management’s Discussion and Analysis of Financial Condition and Results". I searched for this phrase in the dataset.
First, the following query should pull all the text from this filing:
SELECT *
FROM `bigquery-public-data.sec_quarterly_financials.txt`
WHERE submission_number="0001564590-18-019062"
However, searching the results of this query for the above phrase finds nothing.
For a second attempt, based on another Stack Overflow answer, I tried searching the entire dataset for that phrase in case it is stored in a different table:
SELECT *
FROM `bigquery-public-data.sec_quarterly_financials.*` t
WHERE REGEXP_CONTAINS(LOWER(TO_JSON_STRING(t)), r'/^discussion and analysis of financial condition$/')
No result!
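One thing I have not ruled out is the pattern in the second query itself. As far as I understand, the surrounding slashes are treated as literal characters rather than delimiters, and `^` and `$` anchor to the start and end of the whole string, so the pattern may never match a phrase embedded in a longer value. Below is a minimal sketch of that check, using Python's `re` module as a stand-in for BigQuery's regex engine (an assumption; the two treat these particular constructs the same way as far as I know). The sample row and the unanchored alternative pattern are made up for illustration. This only tests the pattern; it says nothing about whether the dataset actually contains the text.

```python
import re

# Pattern copied from the second query: the slashes are literal characters,
# and ^ / $ anchor to the start and end of the whole string.
ORIGINAL_PATTERN = r'/^discussion and analysis of financial condition$/'

# Hypothetical alternative: plain substring pattern, no delimiters or anchors.
SUBSTRING_PATTERN = r'discussion and analysis of financial condition'

# Made-up stand-in for one row after LOWER(TO_JSON_STRING(t)).
sample_row = (
    '{"value":"item 7. management\u2019s discussion and analysis '
    'of financial condition and results"}'
)


def contains(pattern: str, text: str) -> bool:
    """Partial-match search, analogous in spirit to REGEXP_CONTAINS."""
    return re.search(pattern, text) is not None


original_matches = contains(ORIGINAL_PATTERN, sample_row)
substring_matches = contains(SUBSTRING_PATTERN, sample_row)

print(original_matches)   # False: a literal "/" can never precede start-of-string
print(substring_matches)  # True: the phrase is found inside the longer value
```

If this holds for BigQuery as well, the empty result of the second query would not by itself show that the phrase is absent from the dataset.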
I can clearly find the same SEC filing, and yet content within it seems to be missing. I've searched for other phrases and sections too, and the text does not seem to be there. Yet, based on all the Google documentation, I think it should be. What am I missing?
Alternatively, does anyone know of another source for parsing sections of SEC 10-K filings or the like? That would be useful too, and you could also answer this question with it.
Source: https://stackoverflow.com/questions/62706179/data-seems-to-be-missing-in-bigquery-sec-filing-dataset