waraml/multisv_dataset
Source: Hugging Face. Updated 2024-10-11. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/waraml/multisv_dataset
Description:
This dataset mixes RAG fine-tuning data with general instruction fine-tuning data for Swedish and Vietnamese question answering. The RAG data is generated by applying a proposed pipeline to the training sets of ScandiQA, ViCoQA, ViNewsQA, ViQuadQA, and ViWikiQA. The general instruction data comes from open-source English datasets and from the ViGPT paper. Each record has five features: prompt, context, response, type, and language.
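Since the two data sources are distinguished by the type and language features, a typical first step is to slice records on those fields. The following is a minimal sketch of that, not the dataset authors' tooling. It works on plain dictionaries so it runs without downloading anything. The sample rows and the label values "rag", "general", "sv", and "vi" are assumptions made for illustration; the actual values stored in type and language should be checked against the dataset itself. The helper names `select` and `count_by` are likewise hypothetical.

```python
from collections import Counter

# Feature names as listed in the dataset description.
FIELDS = ("prompt", "context", "response", "type", "language")


def select(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    unknown = set(criteria) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown feature(s): {sorted(unknown)}")
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


def count_by(records, field):
    """Count records per distinct value of one feature."""
    return Counter(r[field] for r in records)


# Hypothetical rows: the label values below are assumed, not taken
# from the real dataset.
sample = [
    {"prompt": "Vad är huvudstaden i Sverige?",
     "context": "Stockholm är Sveriges huvudstad.",
     "response": "Stockholm", "type": "rag", "language": "sv"},
    {"prompt": "Thủ đô của Việt Nam là gì?",
     "context": "Hà Nội là thủ đô của Việt Nam.",
     "response": "Hà Nội", "type": "rag", "language": "vi"},
    {"prompt": "Skriv en kort hälsning.",
     "context": "",
     "response": "Hej!", "type": "general", "language": "sv"},
]

swedish = select(sample, language="sv")
vietnamese_rag = select(sample, type="rag", language="vi")
per_type = count_by(sample, "type")
```

The same helpers accept any iterable of dict-like rows, so they should also work on rows loaded from the download link above, provided the rows expose the five features under these names.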
Provider:
waraml



