
yassiracharki/Yahoo_Answers_10_categories_for_NLP

Source: Hugging Face. Updated: 2024-07-26. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/yassiracharki/Yahoo_Answers_10_categories_for_NLP
Description:
The Yahoo! Answers topic classification dataset is built from the 10 largest main categories of Yahoo! Answers. Each class contains 140,000 training samples and 6,000 testing samples, for a total of 1,400,000 training samples and 60,000 testing samples. Only the best answer content and the main category information are used. The dataset consists of three files: classes.txt lists the class labels, and train.csv and test.csv contain the training and testing samples respectively. Both CSV files are comma-separated with four columns: class index, question title, question content, and best answer. The dataset is primarily used for fine-grained text classification tasks.
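The file layout described above (a class index, a question title, the question content, and the best answer, with the labels listed in classes.txt) can be read with a few lines of Python. The following is a minimal sketch, not the dataset author's loader. It assumes the CSV files have no header row and that the class index is 1-based, neither of which the description confirms. The helper names `read_classes` and `read_samples`, the class names, and the sample rows are hypothetical stand-ins used only for illustration.

```python
import csv
import io
from collections import Counter


def read_classes(lines):
    """Return class names, one per non-empty line of classes.txt."""
    return [line.strip() for line in lines if line.strip()]


def read_samples(csv_file, class_names, index_base=1):
    """Yield (label, title, content, best_answer) for each CSV row.

    index_base=1 is an assumption; pass 0 if the class index is zero-based.
    Rows are assumed to have exactly four columns and no header line.
    """
    for row in csv.reader(csv_file):
        class_index, title, content, answer = row
        label = class_names[int(class_index) - index_base]
        yield label, title, content, answer


# In-memory stand-ins for classes.txt and train.csv (made-up content).
classes_txt = io.StringIO("category_a\ncategory_b\n")
train_csv = io.StringIO(
    '"2","Why is the sky blue?","My kid keeps asking.","Rayleigh scattering."\n'
    '"1","What is a potluck?","","A meal where each guest brings a dish."\n'
)

class_names = read_classes(classes_txt)
samples = list(read_samples(train_csv, class_names))

# Count samples per label, e.g. to check that the classes are balanced.
label_counts = Counter(label for label, _, _, _ in samples)
print(label_counts)
```

To use this on the real files, replace the `io.StringIO` objects with `open("classes.txt", encoding="utf-8")` and `open("train.csv", newline="", encoding="utf-8")`. Counting labels on the full training file is a quick way to check the stated 140,000 samples per class.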
Provider:
yassiracharki