davanstrien/imdb-classify-augment-v4-prompt-fix
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/davanstrien/imdb-classify-augment-v4-prompt-fix
Description:
An LLM-annotated IMDB sentiment classification dataset, augmented with synthetic examples and produced by the classify-and-augment tool. Labels (positive and negative) were assigned by the HuggingFaceTB/SmolLM3-3B model. The 20 original input rows were augmented to 30 output rows. The label distribution is balanced at 15 rows per label: 11 negative and 9 positive among the real rows, plus 4 negative and 6 positive among the synthetic rows. Synthetic rows passed through a self-consistency check, with a 50% acceptance rate.
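The counts quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the published numbers and does not load the dataset. The variable names are illustrative, and the interpretation of the 50% acceptance rate as accepted rows divided by generated candidates is an assumption, not something the description states.

```python
# Counts copied from the dataset description.
real = {"negative": 11, "positive": 9}
synthetic = {"negative": 4, "positive": 6}

input_rows = sum(real.values())             # original input rows
synthetic_rows = sum(synthetic.values())    # rows added by augmentation
output_rows = input_rows + synthetic_rows   # total rows in the output

# Per-label totals across real and synthetic rows.
per_label = {label: real[label] + synthetic[label] for label in real}

# Assumption: acceptance rate = accepted synthetic rows / generated candidates.
acceptance_rate = 0.5
implied_candidates = round(synthetic_rows / acceptance_rate)

print("input rows:", input_rows)
print("synthetic rows:", synthetic_rows)
print("output rows:", output_rows)
print("per-label totals:", per_label)
print("implied candidates (under the assumption above):", implied_candidates)
```

Under that reading of the acceptance rate, the 10 accepted synthetic rows would correspond to roughly 20 generated candidates; the description itself does not give the candidate count.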
Provider:
davanstrien



