
MatthiasPicard/Frugal-AI-Train-Data-88k

Source: Hugging Face. Updated 2025-01-31. Indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/MatthiasPicard/Frugal-AI-Train-Data-88k
Overview:
This dataset combines synthetic and real data for training models submitted to the text task of the Frugal AI Challenge. It contains around 75k synthetic quotes covering all labels and around 10k additional examples relabelled from real sources such as DESMOG, the CARDS dataset, and the Skeptical Science website. The synthetic data was generated with several models, including Gemini 2.0 Flash Thinking, Claude 3.5 Sonnet, and GPT-4o. The dataset aims to advance open research on climate-related topics by combining real and synthetic data, and to support research on detecting artificially generated data.
Provided by:
MatthiasPicard