Raw SEC filings are delivered as a single SGML file. This function parses that master submission into its component documents, with each document's content lines stored in the list column 'TEXT'.
parse_submission(x, include.binary = T, include.content = T)
x | Input submission to parse. May be one of several forms (for example, the URL of a complete submission text file, as in the example below). |
---|---|
include.binary | Default TRUE. Determines whether the content of binary documents is returned. |
include.content | Default TRUE. Determines whether the content of documents is returned. |
A data frame with one row per document. Note that the metadata (TYPE, DESCRIPTION, FILENAME) is provided by the filer and has little standardization or enforcement.
SEQUENCE: Sequence number of the file.
TYPE: The type of document, e.g. 10-K, EX-99, GRAPHIC.
DESCRIPTION: The filer-provided description of the document, e.g. 8-K, EXHIBIT 99. May be empty.
FILENAME: The document's filename.
TEXT: The text representation of the document. For text-based documents (txt, html) this is the actual file contents. For binary files (graphics, PDFs) this contains the uuencoded contents.
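As a minimal sketch of working with this return value, the snippet below builds a small stand-in data frame by hand instead of calling parse_submission, so it runs without network access. The filenames and contents are made up, and it assumes each TEXT entry is a character vector of content lines, as the description above suggests.

```r
# Hypothetical stand-in for a parse_submission() result; all values are invented.
docs <- data.frame(
  SEQUENCE = 1:3,
  TYPE = c("8-K", "EX-99", "GRAPHIC"),
  DESCRIPTION = c("8-K", "EXHIBIT 99", ""),
  FILENAME = c("main.htm", "exhibit99.htm", "image001.jpg"),
  stringsAsFactors = FALSE
)

# Assumption: TEXT is a list column holding one character vector of lines per document.
docs$TEXT <- list(
  c("<html>", "<body>Main document</body>", "</html>"),
  c("<html>", "<body>Exhibit</body>", "</html>"),
  c("begin 644 image001.jpg", "#0V%T", "`", "end")
)

# Keep only the text-based documents. TYPE is filer-provided, so real filings
# may need a looser filter than this exact comparison.
text_docs <- docs[docs$TYPE != "GRAPHIC", ]

# Collapse the content lines of the first document into a single string.
main_html <- paste(text_docs$TEXT[[1]], collapse = "\n")
```

Because the metadata columns are not standardized, filtering on FILENAME extensions may be a useful fallback when TYPE is unreliable.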
Most of the time, the information you need, along with the specific files, is available through filing_documents. There are, however, scenarios where you may want to access the full contents of the master submission:
Older submissions are not parsed into component documents by the SEC, so access requires parsing the main filing.
The SEC only provides what it considers the relevant documents, but filings often include many more ancillary files.
If you're fetching many documents per filing across many filings, downloading a single file per filing can be more efficient.
NOTE: non-text documents are uuencoded and need a separate decoder to be viewed.
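This page does not name a decoder, so the following is only a sketch of one in base R. `uudecode_lines` is a hypothetical helper, not part of this package. It assumes a TEXT entry holds the uuencoded payload as a character vector of lines (or as a single newline-separated string), including its `begin`/`end` markers.

```r
# Hypothetical helper (not part of this package): decode one uuencoded payload.
uudecode_lines <- function(text) {
  # Accept either a vector of lines or a single newline-separated string.
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  lines <- sub("\r$", "", lines)

  # Locate the "begin <mode> <name>" header and the first "end" after it.
  start <- grep("^begin [0-7]{3,4} ", lines)[1]
  if (is.na(start)) stop("no uuencode 'begin' line found")
  end <- which(lines == "end")
  end <- end[end > start][1]
  if (is.na(end)) stop("no uuencode 'end' line found")
  body <- lines[seq(start + 1, length.out = end - start - 1)]

  decode_line <- function(line) {
    if (!nzchar(line)) return(raw(0))
    # Each character carries 6 bits: (code - 32) mod 64, so '`' and ' ' are both 0.
    codes <- (utf8ToInt(line) - 32L) %% 64L
    n <- codes[1]                      # number of decoded bytes on this line
    vals <- codes[-1]
    if (n == 0 || length(vals) == 0) return(raw(0))
    vals <- c(vals, rep(0L, (-length(vals)) %% 4))  # pad to whole 4-char groups
    m <- matrix(vals, nrow = 4)
    # Four 6-bit values become three 8-bit bytes.
    b1 <- m[1, ] * 4 + m[2, ] %/% 16
    b2 <- (m[2, ] %% 16) * 16 + m[3, ] %/% 4
    b3 <- (m[3, ] %% 4) * 64 + m[4, ]
    bytes <- as.vector(rbind(b1, b2, b3))
    as.raw(bytes[seq_len(min(n, length(bytes)))])
  }

  unlist(lapply(body, decode_line))
}

# Usage with a tiny hand-made payload that encodes the three bytes of "Cat".
payload <- c("begin 644 cat.txt", "#0V%T", "`", "end")
decoded <- uudecode_lines(payload)
```

The resulting raw vector could then be written to disk with `writeBin(decoded, "cat.txt")`. Real submissions may wrap the payload in extra markup lines, which this sketch skips by searching for the `begin` and `end` markers.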
# \donttest{
try(
  parse_submission(paste0('https://www.sec.gov/Archives/edgar/data/',
                          '37996/000003799617000084/0000037996-17-000084.txt'))[
    , c('SEQUENCE', 'TYPE', 'DESCRIPTION', 'FILENAME')]
)
#>   SEQUENCE    TYPE DESCRIPTION                        FILENAME
#> 1        1     8-K         8-K       ceostrategicupdate8-k.htm
#> 2        2   EX-99  EXHIBIT 99    exhibit99ceostrategicupd.htm
#> 3        3 GRAPHIC             exhibit99ceostrategicupd001.jpg
#> 4        4 GRAPHIC             exhibit99ceostrategicupd002.jpg
#> 5        5 GRAPHIC             exhibit99ceostrategicupd003.jpg
#> 6        6 GRAPHIC             exhibit99ceostrategicupd004.jpg
#> 7        7 GRAPHIC             exhibit99ceostrategicupd005.jpg
# }