InstrucatQA

Name: InstrucatQA
Creator: maas
Published: 2025-12-04 10:23:44
License: No description available

ModelScope community. Updated: 2025-12-04. Indexed: 2025-02-01.

Download link:

https://modelscope.cn/datasets/BSC-LT/InstrucatQA

Resource description:

# Dataset Card for InstrucatQA

Instructional dataset to finetune models used for RAG applications.

## Dataset Details

### Dataset Description

This dataset merges QA instructions from InstruCAT (ca), SQUAC (es), and SQUAD (en) with the general-purpose CA and ES MENTOR datasets, which are included to provide a cognitive background for generating responses. It contains two splits: 66139 training instructions and 11674 validation instructions.

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** ca, es, en
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

Experiments with Catalan RAG applications.

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information is needed for further recommendations.

## Citation [optional]

**BibTeX:** [More Information Needed]

**APA:** [More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]
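The split sizes stated in the dataset description (66139 train, 11674 validation) can be turned into a total and a train/validation ratio with a few lines of arithmetic. The following is a minimal sketch only: the `SPLIT_SIZES` dictionary and the `split_shares` helper are illustrative names that do not come from the dataset card, and the counts are copied from the description rather than read from the dataset files.

```python
# Split sizes as stated in the dataset description (not read from the files).
SPLIT_SIZES = {"train": 66139, "validation": 11674}


def split_shares(sizes):
    """Return each split's share of the total as a fraction between 0 and 1."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(SPLIT_SIZES.values())
shares = split_shares(SPLIT_SIZES)

print(f"total instructions: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under these figures the dataset holds 77813 instructions in total, split roughly 85% train to 15% validation.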

Provider:

maas

Created:

2025-01-26
