MatthiasPicard/Frugal-AI-Train-Data-88k
Source: Hugging Face. Updated: 2025-01-31. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/MatthiasPicard/Frugal-AI-Train-Data-88k
Description:
This dataset combines synthetic and real data for training models submitted to the text task of the Frugal AI Challenge. It contains around 75k synthetic quotes covering all labels and around 10k additional examples relabelled from real sources such as DESMOG, the CARDS dataset, and the Skeptical Science website. The synthetic data was generated with several models, including Gemini 2.0 Flash Thinking, Claude 3.5 Sonnet, and GPT-4o. The dataset has two aims: to support open research on climate-related topics by combining real and synthetic data, and to support research on detecting artificially generated data.
Provided by:
MatthiasPicard



