Generative AI rewritten SEC filings

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://data.mendeley.com/datasets/xxpjm7pxbg
Description:
The rewritten filings in this dataset were generated with two OpenAI large language models, GPT-4o and GPT-4o-mini. The rewriting targeted the Management's Discussion and Analysis (MD&A) part of each filing, which is section 7 of a 10-K and section 2 of a 10-Q, with the goal of preserving the original content while making the sentiment more positive. The dataset contains 11,266 10-K and 32,620 10-Q filings.

The sample was selected to be neutral with respect to both year and sector. This gives a balanced representation across time periods and industries and avoids biases tied to specific sectors or years in the rewritten filings.

The rewritten filings are saved as text files named according to the format:

cik + '_' + accession number of the filing + '_section' + section number + '_' + model + '.txt'

The following query was used:

Please rewrite the provided MD&A section of a 10-K filing. Your goal is to create a new version that maintains the original meaning, key details, and financial information, but with more positive wording and phrasing. Ensure the rewritten text is coherent, professionally written, and retains the appropriate tone for a financial report. Also, enhance the positive sentiment by highlighting achievements, growth, and opportunities, while preserving all factual content. \n\nOriginal Text:
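
The file naming format described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (build_filename, parse_filename, FilingName) are hypothetical, the CIK and accession number in the usage note are made-up placeholders, and the exact model strings used in the real file names (assumed here to be "gpt-4o" and "gpt-4o-mini") are not confirmed by the description. The parser also assumes that the CIK, accession number, and section number contain no underscores.

```python
import re
from typing import NamedTuple


class FilingName(NamedTuple):
    """Components of a rewritten-filing file name (hypothetical helper type)."""
    cik: str
    accession: str
    section: str
    model: str


# Assumption: cik, accession number, and section number contain no underscores,
# so the underscores in the name act as unambiguous separators.
_NAME_PATTERN = re.compile(
    r"^(?P<cik>[^_]+)_(?P<accession>[^_]+)_section(?P<section>[^_]+)_(?P<model>.+)\.txt$"
)


def build_filename(cik: str, accession: str, section: str, model: str) -> str:
    """Assemble a name as cik + '_' + accession + '_section' + section + '_' + model + '.txt'."""
    return f"{cik}_{accession}_section{section}_{model}.txt"


def parse_filename(name: str) -> FilingName:
    """Split a file name back into its components; raise ValueError if it does not match."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return FilingName(**match.groupdict())
```

For example, build_filename("1234567", "0001234567-24-000001", "7", "gpt-4o") yields "1234567_0001234567-24-000001_section7_gpt-4o.txt", and parse_filename recovers the four components from that string. Whether the accession numbers in the dataset keep their hyphens is not stated in the description, so the pattern accepts either form.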
Created:
2024-10-22