
Open4Business (O4B)

Source: arXiv. Updated 2020-11-30. Indexed 2024-06-21.
Download link:
https://github.com/amanpreet692/Open4Business
Overview:
Open4Business (O4B) is a dataset of 17,458 open-access business articles paired with their reference summaries, created by the Department of Computer Science at Stony Brook University. The articles were selected from open-access business journals, their PDFs were converted to structured XML with GROBID, and each article's abstract was extracted to serve as the reference summary. O4B calls for highly abstractive, more concise summaries, and it targets automatic summarization of business documents. The motivating use cases are in business and corporate finance, such as due diligence for mergers and acquisitions, where summarization can save the time and effort of working through large volumes of complex documents.
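The abstract-extraction step described above can be illustrated with a minimal sketch. It assumes the XML is TEI as GROBID typically emits it, with the abstract under `teiHeader/profileDesc/abstract` and the article text under `text/body`. The function name `split_article`, the helper `_text`, and the sample document are hypothetical, and this is not the dataset authors' actual pipeline.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID output (assumed layout, see lead-in).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def _text(element):
    """Return the whitespace-normalized text of an element, or "" if missing."""
    if element is None:
        return ""
    return " ".join(" ".join(element.itertext()).split())


def split_article(tei_xml):
    """Split one TEI document into (body, abstract) strings.

    Hypothetical helper: the body would be the summarization input and
    the abstract the reference summary.
    """
    root = ET.fromstring(tei_xml)
    abstract = _text(root.find(".//tei:profileDesc/tei:abstract", TEI_NS))
    body = _text(root.find(".//tei:text/tei:body", TEI_NS))
    return body, abstract


# Tiny made-up TEI document, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><abstract><p>Short summary.</p></abstract></profileDesc></teiHeader>
  <text><body><div><head>Introduction</head><p>Full article text.</p></div></body></text>
</TEI>"""

body, abstract = split_article(SAMPLE)
print(abstract)
print(body)
```

Joining text fragments with spaces keeps headings and paragraphs from running together, at the cost of occasional extra spaces around inline markup; a real pipeline would likely handle those cases more carefully.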
Provider:
Department of Computer Science, Stony Brook University
Created:
2020-11-16