RuQTopics
Source: arXiv; indexed 2025-09-30
Download link:
https://huggingface.co/datasets/its5Q/yandex-q/blob/main/full.jsonl.gz
Description:
RuQTopics is a Russian-language dataset designed for topic classification. It contains single-label and multi-label question-answer pairs spanning 76 categories. Most questions are short, while the corresponding answers are comparatively long, a structure that makes the dataset well suited to real-world conversational tasks. In terms of scale, it comprises 361,650 single-label and 170,930 multi-label question-answer pairs.
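As a rough illustration of how the single-label and multi-label split could be checked, here is a minimal Python sketch. It assumes the file from the download link has already been saved locally and that, as the `.jsonl.gz` extension suggests, it is gzip-compressed with one JSON object per line. The record schema is not documented on this page, so the helper names (`iter_jsonl_gz`, `count_by_label_arity`) and the `get_labels` accessor are hypothetical and must be adapted to the real field names.

```python
import gzip
import json
from typing import Callable, Iterable, Iterator, Sequence, Tuple


def iter_jsonl_gz(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def count_by_label_arity(
    records: Iterable[dict],
    get_labels: Callable[[dict], Sequence],
) -> Tuple[int, int]:
    """Return (single_label_count, multi_label_count).

    `get_labels` is a caller-supplied function that extracts the topic labels
    from a record; the actual field name is an assumption left to the caller.
    Records with no labels are counted in neither bucket.
    """
    single = multi = 0
    for rec in records:
        n = len(get_labels(rec))
        if n == 1:
            single += 1
        elif n > 1:
            multi += 1
    return single, multi
```

For example, if each record happened to store its labels under a key such as `"topics"` (a guess, not confirmed by the source), the counts could be obtained with `count_by_label_arity(iter_jsonl_gz("full.jsonl.gz"), lambda r: r["topics"])`.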
Provided by:
Yandex



