FiLM
Source: arXiv. Updated: 2023-10-20. Indexed: 2024-06-21.
Download link:
https://github.com/deep-over/FiLM
Description:
The FiLM dataset was created by the Department of Applied Artificial Intelligence at Hanyang University. It comprises 10 sub-datasets totaling 2.4 billion data entries and covers a range of financial document types, including news articles, SEC filings, earnings call transcripts, and academic papers. The research team collected the data from multiple sources and preprocessed it with cleaning and deduplication. FiLM is intended primarily for training and evaluating financial pre-trained language models, with the aim of improving their generalization and performance in the financial domain and addressing key problems in financial data analysis.
Provider:
Department of Applied Artificial Intelligence, Hanyang University
Created:
2023-10-20



