davanstrien/imdb-classify-augment-v6-trackio
Source: Hugging Face. Updated 2026-04-28. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/davanstrien/imdb-classify-augment-v6-trackio
Description:
An LLM-annotated IMDB movie review classification dataset generated with the classify-and-augment tool. Reviews were labeled by the HuggingFaceTB/SmolLM3-3B model with one of two sentiment labels, positive or negative. The 50 input rows yield 60 output rows, 10 of which are synthetic. Labels are balanced: the real data has 26 positive and 24 negative rows, and the synthetic data has 4 positive and 6 negative rows. Synthetic rows were validated by model self-consistency, with a 50% acceptance rate.
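The counts in the description can be cross-checked with a few lines of arithmetic. The sketch below only tallies the figures quoted above and does not read the dataset itself. The last step rests on an assumption the description does not confirm: that the 50% acceptance rate means accepted synthetic rows divided by generated candidates.

```python
from collections import Counter

# Row counts copied from the description; nothing here is read from the dataset.
COUNTS = {
    ("real", "positive"): 26,
    ("real", "negative"): 24,
    ("synthetic", "positive"): 4,
    ("synthetic", "negative"): 6,
}

by_source = Counter()
by_label = Counter()
for (source, label), n in COUNTS.items():
    by_source[source] += n
    by_label[label] += n

total_rows = sum(COUNTS.values())

# ASSUMPTION (not stated in the description): the acceptance rate is
# accepted synthetic rows / generated candidates.
ACCEPTANCE_RATE = 0.5
implied_candidates = by_source["synthetic"] / ACCEPTANCE_RATE

print("rows by source:", dict(by_source))
print("rows by label:", dict(by_label))
print("total rows:", total_rows)
print("implied synthetic candidates (assumption):", implied_candidates)
```

Under that reading, the real and synthetic rows sum to the stated 50 and 10, and the combined output splits evenly between the two labels.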
Provider:
davanstrien



