Intel/misinformation-guard
Hugging Face | Updated 2025-07-31 | Indexed 2025-08-09
Download link:
https://hf-mirror.com/datasets/Intel/misinformation-guard
Description:
MisInformation Guard is a synthetic text classification dataset for training and evaluating models that detect misinformation. It was generated by a custom pipeline using two LLMs, Llama 3.1 8B and Mixtral 8x7B. The data is split into a training and validation set (approximately 33,000 samples) and a test set (approximately 8,000 samples). Each sample contains the generated text, the reasoning behind its generation, the classification label, and an identifier for the model that generated it. There are four classification labels: `false` (completely untrue or fabricated information), `partially true` (contains some truth but is misleading or lacks important context), `mostly true` (largely accurate but may have minor inaccuracies or omissions), and `true` (entirely accurate and factual information).
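As a rough illustration of how the four labels might be prepared for a classifier, the sketch below maps each label string to an integer id. The label strings come from the description above; the id ordering, the `encode_label` helper, and the field names in the sample record are assumptions made for illustration and should be checked against the dataset card.

```python
# Label strings are taken from the dataset description.
# The integer ids and their ordering are an illustrative assumption.
LABELS = ["false", "partially true", "mostly true", "true"]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}


def encode_label(label: str) -> int:
    """Map a label string to its integer id, ignoring case and outer whitespace."""
    key = label.strip().lower()
    if key not in LABEL2ID:
        raise ValueError(f"unknown label: {label!r}")
    return LABEL2ID[key]


# Hypothetical record: the field names here are placeholders, not the
# dataset's confirmed column names.
sample = {
    "text": "placeholder generated text",
    "reasoning": "placeholder reasoning",
    "label": "mostly true",
    "model": "placeholder model identifier",
}

label_id = encode_label(sample["label"])
```

Ordering the ids from `false` to `true` keeps them monotone in truthfulness, which can be convenient if the labels are later treated as ordinal, though nothing in the description requires it.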
Provider:
Intel



