Pirá 2.0
Source: arXiv. Updated 2023-09-20. Indexed 2024-06-21.
Download link:
https://github.com/C4AI/Pira
Resource description:
Pirá 2.0 is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, created at the University of São Paulo. It contains 2,258 question-answer pairs built from abstracts of scientific literature and from reports. The research team constructed it in four stages: collecting texts, generating questions and answers, editing, and evaluation. The dataset is intended for testing how well machine learning models acquire specialized scientific knowledge, with applications in natural language processing and question answering, particularly information retrieval and comprehension in complex scientific domains.
Provider:
University of São Paulo
Created:
2023-09-20



