tanaos/synthetic-topic-classification-dataset-v1
Source: Hugging Face. Updated: 2025-12-26. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/tanaos/synthetic-topic-classification-dataset-v1
Description:
This synthetic dataset was created by Tanaos with the Artifex Python library to train and evaluate topic classification models. Each sample is a sentence or paragraph paired with a label indicating its topic category. The topics are politics, health, technology, entertainment, money_finance, relationships_dating, education_learning, work_careers, science, society_culture, gaming, lifestyle_hobbies, sports, automotive, and other. The dataset is intended for training, fine-tuning, and evaluating topic classification models. Common use cases include developing models that classify text into predefined topics, benchmarking topic classification systems, and researching techniques to improve text classification accuracy.
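A common first step when training a classifier on a fixed label set like this one is to map each topic name to an integer id. The sketch below is illustrative only: the 15 label strings come from the description above, but the names `TOPIC_LABELS`, `LABEL2ID`, `ID2LABEL`, and `encode_labels` are hypothetical, the id order is an assumption (it simply follows the order in which the topics are listed), and the example rows are made up rather than taken from the dataset. The description does not state the dataset's column names or whether labels are stored as strings or ids, so the sketch works on plain Python values.

```python
# The 15 topic labels listed in the dataset description.
TOPIC_LABELS = [
    "politics", "health", "technology", "entertainment", "money_finance",
    "relationships_dating", "education_learning", "work_careers", "science",
    "society_culture", "gaming", "lifestyle_hobbies", "sports", "automotive",
    "other",
]

# Hypothetical mappings; the id order is an assumption (list order above).
LABEL2ID = {label: i for i, label in enumerate(TOPIC_LABELS)}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}


def encode_labels(labels):
    """Map topic names to integer ids, rejecting names outside the label set."""
    unknown = sorted(set(labels) - set(LABEL2ID))
    if unknown:
        raise ValueError(f"unknown topic labels: {unknown}")
    return [LABEL2ID[label] for label in labels]


# Made-up rows for illustration only; they are not taken from the dataset.
example_rows = [
    ("The central bank raised interest rates again.", "money_finance"),
    ("The new console launches next month.", "gaming"),
]
example_ids = encode_labels([topic for _, topic in example_rows])
```

Rejecting unknown names early, instead of silently mapping them to `other`, makes label typos visible before training starts.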
Provider:
tanaos



