gpt4o-arena-brevity-dpo | Dialogue generation dataset | Natural language processing dataset
Dataset overview
Language
- English (en)
Dataset information
Features
- question-id: string
- prompt: string
- chosen: string
- rejected: string
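The four string columns above follow the usual layout of a preference (DPO-style) dataset: one prompt with a preferred and a dispreferred response. A minimal sketch of a row check is shown below; the helper name `validate_row` and the sample row are hypothetical and are not taken from the dataset itself.

```python
# Column names exactly as listed in the feature table; every column is a string.
EXPECTED_COLUMNS = ("question-id", "prompt", "chosen", "rejected")


def validate_row(row: dict) -> bool:
    """Return True if the row has exactly the expected columns, all of type str."""
    if set(row) != set(EXPECTED_COLUMNS):
        return False
    return all(isinstance(row[name], str) for name in EXPECTED_COLUMNS)


# Hypothetical row, made up purely for illustration.
example_row = {
    "question-id": "example-0001",
    "prompt": "What is the capital of France?",
    "chosen": "Paris.",
    "rejected": "The capital of France, a country in Western Europe, is Paris.",
}

print(validate_row(example_row))
```

Because `question-id` contains a hyphen, it cannot be used as a Python attribute name, which is why the sketch keeps rows as plain dictionaries.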
Data splits
- train:
  - Bytes: 18627876.9
  - Examples: 22941
- test:
  - Bytes: 2069764.1
  - Examples: 2549
Data size
- Download size: 14670524 bytes
- Dataset size: 20697641.0 bytes
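The split and size figures can be cross-checked with a few lines of arithmetic. The sketch below copies the numbers from the listing; the dictionary layout and variable names are illustrative only.

```python
import math

# Figures copied from the listing above.
splits = {
    "train": {"num_bytes": 18627876.9, "num_examples": 22941},
    "test": {"num_bytes": 2069764.1, "num_examples": 2549},
}
download_size = 14670524
dataset_size = 20697641.0

total_examples = sum(s["num_examples"] for s in splits.values())
total_bytes = sum(s["num_bytes"] for s in splits.values())

# Share of examples held out for testing.
test_fraction = splits["test"]["num_examples"] / total_examples

# Ratio of the download size to the stated dataset size.
download_ratio = download_size / dataset_size

print(total_examples)
print(math.isclose(total_bytes, dataset_size))
print(round(test_fraction, 3))
print(round(download_ratio, 2))
```

The two splits add up to 25490 examples, the test split holds 10% of them, and the per-split byte counts sum to the stated dataset size. The fractional byte counts are consistent with sizes having been pro-rated across the splits, although the listing does not say how they were computed.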
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - test: data/test-*
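The `data_files` entries are wildcard patterns that map each split to one or more files. The sketch below illustrates the wildcard semantics with Python's standard `fnmatch` module; the shard file names are hypothetical, and an actual dataset loader may resolve patterns with its own rules.

```python
from fnmatch import fnmatchcase

# Patterns copied from the configuration above.
data_files = {"train": "data/train-*", "test": "data/test-*"}

# Hypothetical file names, made up for illustration.
candidate_files = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]


def files_for_split(split: str) -> list[str]:
    """Return the candidate files whose names match the split's pattern."""
    pattern = data_files[split]
    return [name for name in candidate_files if fnmatchcase(name, pattern)]


print(files_for_split("train"))
print(files_for_split("test"))
```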
Dataset source
- The dataset was generated with OpenAI's GPT-4o model, based on the lmsys/chatbot_arena_conversations dataset.
Generation tool
- Generated with the ShortGPT project.

poi
This project collects points of interest (POI) in China. The current version's data comes from OpenStreetMap.
Listed on GitHub
FER2013
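FER2013 is a dataset widely used in facial expression recognition, containing 28,709 training samples and 7,178 test samples. Images are 48x48 pixels, and the labels are angry, disgust, fear, happy, sad, surprise, and neutral.
Listed on GitHub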
Tropicos
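Tropicos is a global database of plant names, containing names, classification information, distribution data, images, and references for more than 1.3 million plants. The database is maintained by the Missouri Botanical Garden and aims to provide comprehensive plant information to botanists, ecologists, and researchers in related fields.
Listed on www.tropicos.org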
Breast Ultrasound Images (BUSI)
Small (about 500×500 pixels) ultrasound images, suitable for classification and segmentation of benign and malignant lesions.
Listed on GitHub
National 1∶200 000 Digital Geological Map (Public Version) Spatial Database
As the only one of its kind, the China National Digital Geological Map (Public Version at 1∶200 000 scale) Spatial Database (CNDGM-PVSD) is based on the measured results of China's former nationwide regional geological survey at 1∶200 000 scale, and is one of the nationwide basic geoscience spatial databases jointly completed by multiple Chinese organizations. Spatially, it comprises 1 163 geological map sheets (at 1∶200 000 scale) in both MapGIS and ArcGIS formats, covering 72% of China's territory with a total data volume of 90 GB. Its main sources are 1∶200 000 regional geological survey reports, geological maps, and mineral resources maps, with an original time span from the mid-1950s to the early 1990s. Approved by the relevant state agencies, it meets the technical qualification requirements and standards issued by the China Geological Survey for data integrity, logical consistency, location accuracy, attribute detail, and collation precision, and is therefore of excellent and reliable quality. The CNDGM-PVSD is an important component of China's national spatial database categories. It serves as a spatial digital platform for the informatization of the national economy and provides an information backbone for national and provincial economic planning, geohazard monitoring, geological survey, mineral resources exploration, and macro decision-making.
Listed on DataCite Commons