arthrod/new3_merged-ex-10-2022_part0_9204.59mb
Source: Hugging Face. Updated 2024-12-17; indexed 2024-12-21.
Download link:
https://hf-mirror.com/datasets/arthrod/new3_merged-ex-10-2022_part0_9204.59mb
Resource description:
The dataset comprises several parts, including submission, header, document_from_text, and document_metadata, each with detailed field descriptions. The submission part includes fields such as cik, company_name, form_type, date_filed, master_file, submission_filename, filing_url, and accession_number. The header part includes fields such as sec_document, acceptance_datetime, description, filing_form_type, submission_type, conformed_submission_type, period_of_report, conformed_period_of_report, standard_industrial_classification, classification_number, accession_number, public_document_count, company_name, sec_header, filing_date, and sec-header-complete. The document_metadata part includes fields such as document_type, sequence, document_filename, description, and title. The dataset also contains fields such as _id, timestamp_collection, doc_url, and raw_document_content. According to the split information, the train split contains 25783 examples totalling 9428452783 bytes; the download size is 2224322328, and the total dataset size is 9428452783.
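The field groups listed above can be written down as a small lookup table, which makes it easy to check whether a loaded record has the expected shape. The following is a minimal sketch under assumptions: it treats each part as a nested dict keyed by the part name, which the listing does not confirm, and `missing_fields` is a hypothetical helper, not part of the dataset or any library. The field lists are copied from the description and may be incomplete; document_from_text is left out because its fields are not listed.

```python
from __future__ import annotations

# Field names copied from the description above. Grouping them as nested
# dicts (one dict per part) is an assumption about the record layout.
EXPECTED_FIELDS = {
    "submission": [
        "cik", "company_name", "form_type", "date_filed", "master_file",
        "submission_filename", "filing_url", "accession_number",
    ],
    "header": [
        "sec_document", "acceptance_datetime", "description",
        "filing_form_type", "submission_type", "conformed_submission_type",
        "period_of_report", "conformed_period_of_report",
        "standard_industrial_classification", "classification_number",
        "accession_number", "public_document_count", "company_name",
        "sec_header", "filing_date", "sec-header-complete",
    ],
    "document_metadata": [
        "document_type", "sequence", "document_filename",
        "description", "title",
    ],
}

# Fields the description lists outside of any named part.
TOP_LEVEL_FIELDS = ["_id", "timestamp_collection", "doc_url", "raw_document_content"]


def missing_fields(record: dict) -> list[str]:
    """Return the expected field names that are absent from `record`.

    Nested fields are reported as "part.field"; a part that is missing
    entirely (or is not a dict) is reported by its part name alone.
    """
    missing = [name for name in TOP_LEVEL_FIELDS if name not in record]
    for part, names in EXPECTED_FIELDS.items():
        section = record.get(part)
        if not isinstance(section, dict):
            missing.append(part)
            continue
        missing.extend(f"{part}.{name}" for name in names if name not in section)
    return missing
```

Calling `missing_fields(record)` on a row returns an empty list when every listed field is present. If the actual records are flat or use lists of structs, the lookup table would need to be adapted accordingly.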
In summary, the dataset's features are intended primarily for analyzing company filing documents. They cover submission information (company identifier, company name, form type, filing date, and so on), document header information (document type, submission type, reporting period, and so on), the document text content, and document metadata. The only split is train, which contains 25783 examples with a total size of 9428452783 bytes.
Provider:
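The split figures in this listing are easier to reason about as per-example and ratio values. The sketch below is plain arithmetic on the numbers quoted in the listing (25783 examples, 9428452783 bytes in total, download size 2224322328). It assumes the download size is also in bytes, which the listing does not state, and `split_stats` is a hypothetical helper name.

```python
# Figures quoted in the listing for the train split.
NUM_EXAMPLES = 25_783
DATASET_BYTES = 9_428_452_783
DOWNLOAD_BYTES = 2_224_322_328  # assumed to be bytes; the listing gives no unit


def split_stats(num_examples: int, dataset_bytes: int, download_bytes: int) -> dict:
    """Derive simple size statistics from the split metadata."""
    return {
        # Mean size of one example in bytes.
        "avg_bytes_per_example": dataset_bytes / num_examples,
        # How much larger the prepared dataset is than the download.
        "size_to_download_ratio": dataset_bytes / download_bytes,
        # Total prepared size in gibibytes (2**30 bytes).
        "dataset_gib": dataset_bytes / 2**30,
    }


stats = split_stats(NUM_EXAMPLES, DATASET_BYTES, DOWNLOAD_BYTES)
for key, value in stats.items():
    print(f"{key}: {value:,.2f}")
```

On these figures each example averages a few hundred kilobytes and the prepared data is roughly four times the download size, so streaming or processing in batches may be preferable to loading everything into memory at once.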
arthrod



