Raw SEC filings are delivered as a single SGML file. This function parses that master submission into its component documents, with each document's content lines stored in the list column 'TEXT'.

parse_submission(x, include.binary = T, include.content = T)

## Arguments

x - Input submission to parse. May be one of the following: a URI (URL to an SEC complete submission text file), text (string with the full submission), or a file path (path to a local file containing the submission).

include.binary - Default TRUE, determines if the content of binary documents is returned.

include.content - Default TRUE, determines if the content of documents is returned.

## Value

A data frame with one row per document. Note that the metadata (TYPE, DESCRIPTION, FILENAME) is provided by the filer and has little standardization or enforcement.

SEQUENCE

Sequence number of the file

TYPE

The type of document, e.g. 10-K, EX-99, GRAPHIC

DESCRIPTION

The filer-provided description of the document, e.g. 8-K, EXHIBIT 99. May be empty.

FILENAME

The document's filename

TEXT

The text representation of the document. For text-based documents (txt, html) this is the actual file contents. For binary files (graphics, pdfs) this contains the uuencoded contents.
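As a rough illustration of the shape described above, the sketch below builds a small stand-in data frame by hand (the rows and file names are made up, not real filing data) and collapses one document's content lines into a single string. It assumes TEXT is a list column holding one character vector of lines per document; `doc_text` is a hypothetical helper, not part of the package.

```r
# Hypothetical stand-in for a parsed submission: one row per document,
# with TEXT as a list column of content lines (assumed layout).
docs <- data.frame(
  SEQUENCE = c(1, 2),
  TYPE = c("8-K", "EX-99"),
  DESCRIPTION = c("8-K", "EXHIBIT 99"),
  FILENAME = c("main.htm", "exhibit.htm"),
  stringsAsFactors = FALSE
)
docs$TEXT <- list(
  c("<html>", "<body>Main document</body>", "</html>"),
  c("<html>", "<body>Exhibit</body>", "</html>")
)

# Hypothetical helper: join the content lines of document i into one string.
doc_text <- function(docs, i) {
  paste(docs$TEXT[[i]], collapse = "\n")
}

main_html <- doc_text(docs, 1)
```

Keeping the lines as a vector is convenient for line-oriented work such as `grep`, while the collapsed string is easier to hand to an HTML parser.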

## Details

Most of the time, the information you need, along with the specific files, will be available through filing_documents, but there are scenarios where you may want to access the full contents of the master submission:

Old Submissions

Older submissions are not parsed into component documents by the SEC, so accessing them requires parsing the main filing.

Full Document List

The SEC only provides what it considers the relevant documents, but filings often include many more ancillary files.

If you are fetching many documents per filing across many filings, downloading the single submission file for each filing can be more efficient than requesting each document separately.
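One way this could look in practice: parse the submission once, then write out whichever documents you need from the resulting data frame. The sketch below uses a hand-built stand-in for the parsed result so that it runs without network access. `save_documents` is a hypothetical helper, and it assumes TEXT holds one character vector of lines per document.

```r
# Hypothetical helper: write selected documents from a parsed submission
# to `dir`, one file per row, and return the paths written.
save_documents <- function(docs, dir, types = NULL) {
  if (!is.null(types)) {
    docs <- docs[docs$TYPE %in% types, , drop = FALSE]
  }
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  # basename() guards against filer-provided names containing path parts.
  paths <- file.path(dir, basename(docs$FILENAME))
  for (i in seq_len(nrow(docs))) {
    writeLines(docs$TEXT[[i]], paths[i])
  }
  paths
}

# Stand-in for the result of a single parse_submission() call (made-up rows).
docs <- data.frame(
  TYPE = c("8-K", "EX-99", "GRAPHIC"),
  FILENAME = c("main.htm", "exhibit.htm", "chart.jpg"),
  stringsAsFactors = FALSE
)
docs$TEXT <- list(
  c("<html>", "main", "</html>"),
  c("<html>", "exhibit", "</html>"),
  c("begin 644 chart.jpg", "`", "end")
)

out_dir <- file.path(tempdir(), "submission_docs")
written <- save_documents(docs, out_dir, types = c("8-K", "EX-99"))
```

Filtering on TYPE here skips the binary row, since uuencoded content written with `writeLines` would still need decoding before it is usable.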

NOTE: non-text documents are uuencoded and need a separate decoder to be viewed.
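Base R does not ship a uudecoder, so the following is a minimal sketch of one in plain R, not a function provided by this package. `uudecode_lines` is a hypothetical name; it assumes the document's lines contain a standard `begin <mode> <name>` header and a closing `end` line, and it does not validate the payload beyond that.

```r
# Hypothetical helper: decode the first uuencoded payload found in a
# character vector of lines and return it as a raw vector.
uudecode_lines <- function(lines) {
  start <- grep("^begin [0-7]+ ", lines)[1]
  if (is.na(start)) stop("no 'begin' line found")
  end <- which(lines == "end")
  end <- end[end > start][1]
  if (is.na(end)) stop("no 'end' line found")
  body <- lines[start + seq_len(end - start - 1)]

  decoded <- lapply(body, function(line) {
    # Each character carries 6 bits: (code - 32) masked to 0..63.
    vals <- bitwAnd(utf8ToInt(line) - 32L, 63L)
    n <- vals[1]                      # first character is the byte count
    if (is.na(n) || n == 0L) return(raw(0))
    vals <- vals[-1]
    vals <- c(vals, rep(0L, (-length(vals)) %% 4))  # pad to groups of 4
    m <- matrix(vals, nrow = 4)
    # Every 4 six-bit values become 3 bytes.
    bytes <- as.vector(rbind(
      bitwOr(bitwShiftL(m[1, ], 2L), bitwShiftR(m[2, ], 4L)),
      bitwOr(bitwAnd(bitwShiftL(m[2, ], 4L), 255L), bitwShiftR(m[3, ], 2L)),
      bitwOr(bitwAnd(bitwShiftL(m[3, ], 6L), 255L), m[4, ])
    ))
    as.raw(bytes[seq_len(min(n, length(bytes)))])
  })
  do.call(c, decoded)
}

# Small self-check using the classic uuencoding of the text "Cat".
sample_lines <- c("begin 644 cat.txt", "#0V%T", "`", "end")
rawToChar(uudecode_lines(sample_lines))
# "Cat"
```

For a real graphic or PDF, the raw vector could then be saved with `writeBin(uudecode_lines(lines), "file.jpg")`, where `lines` is that document's TEXT entry.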

## Examples

parse_submission(paste0('https://www.sec.gov/Archives/edgar/data/',
                        '37996/000003799617000084/0000037996-17-000084.txt'))[ ,
  c('SEQUENCE', 'TYPE', 'DESCRIPTION', 'FILENAME')]
#>   SEQUENCE    TYPE DESCRIPTION                        FILENAME
#> 1        1     8-K         8-K       ceostrategicupdate8-k.htm
#> 2        2   EX-99  EXHIBIT 99    exhibit99ceostrategicupd.htm
#> 3        3 GRAPHIC             exhibit99ceostrategicupd001.jpg
#> 4        4 GRAPHIC             exhibit99ceostrategicupd002.jpg
#> 5        5 GRAPHIC             exhibit99ceostrategicupd003.jpg
#> 6        6 GRAPHIC             exhibit99ceostrategicupd004.jpg
#> 7        7 GRAPHIC             exhibit99ceostrategicupd005.jpg