A Multimodal Dataset of Financial Disclosures, MD&A, and Audit Opinions with Next-Year Bankruptcy Labels

Source: DataCite Commons | Updated: 2025-10-08 | Indexed: 2026-02-09
Download link:
https://figshare.com/articles/dataset/A_Multimodal_Dataset_of_Financial_Disclosures_MD_A_and_Audit_Opinions_with_Next-Year_Bankruptcy_Labels/30305341
Description:
We publicly release a multimodal dataset derived from 10-K annual reports to support research on next-year bankruptcy prediction. For each report, we collected all reported financial figures, the corresponding Management Discussion & Analysis (MD&A), and the Audit Opinion text, along with a bankruptcy label indicating whether the company filed for bankruptcy in the year following the report's release.

This dataset is designed to encourage research in this challenging area by presenting several key difficulties, including:

- Extreme class imbalance
- Multimodality: integration of both tabular and textual data
- Multisource heterogeneity: signals from the three sources may align with or even contradict one another
- NLP-related challenges: long documents with substantial portions of text that may be irrelevant to the bankruptcy outcome

This dataset accompanies our paper published at the 6th ACM International Conference on AI in Finance, titled "A Multimodal Alignment-Based Anomaly Detection Method for Bankruptcy Prediction".

Authors: Andreas Sideras, Konstantinos Bougiatiotis, Elias Zavitsanos, Georgios Paliouras, George Vouros
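To make the class-imbalance difficulty concrete, the following is a minimal sketch rather than anything taken from the dataset itself. The `FilingRecord` type, its field names, and the 1-in-100 positive rate are all hypothetical placeholders; the actual file layout and label distribution should be checked on the figshare page. It shows why plain accuracy can look strong while no bankruptcy is detected.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FilingRecord:
    # Hypothetical schema: these field names are illustrative only,
    # not the dataset's actual column names.
    financials: Dict[str, float]   # tabular modality (reported figures)
    mdna_text: str                 # MD&A text
    audit_opinion_text: str        # Audit Opinion text
    bankrupt_next_year: int        # 1 if a bankruptcy filing followed, else 0


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of true bankruptcies that were predicted as such."""
    positives = sum(y_true)
    if positives == 0:
        return 0.0
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives


# Toy data with a made-up imbalance: 1 bankrupt firm out of 100 filings.
records = [
    FilingRecord({"total_assets": 1.0}, "", "", 1 if i == 0 else 0)
    for i in range(100)
]
y_true = [r.bankrupt_next_year for r in records]

# A trivial baseline that always predicts "no bankruptcy".
y_pred = [0] * len(y_true)

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"recall   = {recall(y_true, y_pred):.2f}")
```

Under these assumed numbers the trivial baseline is right on 99 of 100 filings yet finds none of the bankruptcies, which is why recall-oriented or ranking-based metrics are usually preferred for data of this kind.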
Provider:
figshare
Created:
2025-10-08