plosone_minimal_dataset.csv
Source: Figshare | Updated: 2025-06-17 | Indexed: 2026-04-08
Download link:
https://figshare.com/articles/dataset/plosone_minimal_dataset_csv/29335268/1
Description:
This dataset contains the results of a novel, LLM-driven annotation process applied to expert coffee reviews, as presented in the manuscript "Automated Multi-Label Coffee Flavor Classification: A Comparative Study of BERT and TF-IDF using LLM-Driven Data Annotation." This "Minimal Dataset" is provided to ensure the reproducibility of our findings. It includes:
1. A unique <code>review_id</code> for each entry.
2. The original <code>blind_assessment</code> text used as input for the LLM.
3. The original quantitative sensory scores (<code>Final Score</code>, <code>Aroma</code>, <code>Acidity/Structure</code>, etc.) provided by human experts, which were used for the quantitative validation of the LLM's annotations.
4. The final 17 columns of binary (0/1) flavor labels generated by the LLM.
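The column layout described above can be explored with a short script. The following is a minimal sketch, not the authors' code: it uses only the Python standard library and treats any column whose non-empty values are all 0 or 1 as a flavor label. The listing does not give the 17 label names, so the names <code>fruity</code> and <code>floral</code> in the in-memory sample are hypothetical placeholders, as are the sample rows themselves.

```python
import csv
import io


def load_rows(source):
    """Read CSV rows as dictionaries from an open text file or file-like object."""
    return list(csv.DictReader(source))


def binary_label_columns(rows, exclude=("review_id",)):
    """Return the names of columns whose non-empty values are all '0' or '1'.

    This is a heuristic for locating the 0/1 flavor labels; it is an
    assumption of this sketch, not a rule stated by the dataset authors.
    """
    if not rows:
        return []
    labels = []
    for name in rows[0].keys():
        if name in exclude:
            continue
        values = [(row[name] or "").strip() for row in rows]
        values = [v for v in values if v]
        if values and all(v in ("0", "1") for v in values):
            labels.append(name)
    return labels


def label_counts(rows, labels):
    """Count how many reviews carry each label (sum of the 0/1 values)."""
    counts = {}
    for name in labels:
        cells = [(row[name] or "").strip() for row in rows]
        counts[name] = sum(int(v) for v in cells if v)
    return counts


# Tiny in-memory sample. The label names "fruity" and "floral" and both
# rows are hypothetical; they only mimic the described column layout.
SAMPLE = """review_id,blind_assessment,Final Score,Aroma,fruity,floral
1,"Bright, juicy, berry-toned.",93,9,1,0
2,"Jasmine and citrus zest.",94,9,1,1
"""

rows = load_rows(io.StringIO(SAMPLE))
labels = binary_label_columns(rows)
counts = label_counts(rows, labels)
print(labels)
print(counts)
```

To run the same check on the downloaded file, pass an open file handle instead of the sample, for example <code>open("plosone_minimal_dataset.csv", newline="", encoding="utf-8")</code>; the encoding is an assumption. If the heuristic holds for the real data, <code>binary_label_columns</code> should return 17 names, which is worth verifying before relying on it.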
Provider:
Jakkaew, Prasara
Created:
2025-06-17



