Open Mind Common Sense (OMCS) corpus

Name: Open Mind Common Sense (OMCS) corpus
Creator: Crowd-sourced contributors
License: No description available

Source: arXiv; indexed 2025-09-30

Download link:

https://github.com/commonsense/conceptnet5/wiki/Downloads


Overview:

OMCS is a crowdsourced knowledge base of commonsense statements: a million-scale collection of sentences from more than 15,000 contributors. It has been used to pre-train textual evidence generators before fine-tuning them on the CosmosQA dataset.

Provider:

Crowd-sourced contributors


Background overview

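The Open Mind Common Sense (OMCS) corpus is a commonsense knowledge dataset used to build the ConceptNet knowledge graph. It contains raw sentences collected through crowdsourcing (both free text and templated responses) as well as structured assertion data describing relations between concepts (for example, antonyms). The data comes from diverse sources and is not guaranteed to be accurate or fit for any particular purpose; it is used mainly in natural language processing and artificial intelligence research.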
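
To make the "structured assertion" idea concrete, here is a minimal Python sketch of reading one assertion record. It assumes a tab-separated layout of five fields (edge URI, relation, start concept, end concept, JSON metadata); check the downloaded file against this assumption before relying on it. The `Assertion` class, the `parse_assertion` helper, and the sample line are hypothetical and written for illustration only; the sample is not copied from the real data.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One relation between two concepts (hypothetical container)."""
    relation: str  # e.g. "/r/Antonym"
    start: str     # e.g. "/c/en/hot"
    end: str       # e.g. "/c/en/cold"
    weight: float  # confidence-like score taken from the metadata


def parse_assertion(line: str) -> Assertion:
    """Parse one line, assuming: uri, relation, start, end, JSON metadata."""
    _uri, relation, start, end, meta = line.rstrip("\n").split("\t")
    info = json.loads(meta)
    # Fall back to 1.0 when the metadata carries no weight (an assumption).
    return Assertion(relation, start, end, float(info.get("weight", 1.0)))


# Hand-written illustrative line, not taken from the actual corpus.
SAMPLE = "\t".join([
    "/a/[/r/Antonym/,/c/en/hot/,/c/en/cold/]",
    "/r/Antonym",
    "/c/en/hot",
    "/c/en/cold",
    json.dumps({"weight": 1.0}),
])

assertion = parse_assertion(SAMPLE)
print(assertion)
```

Keeping the raw crowdsourced sentence separate from the parsed relation mirrors the split described above: the sentence is the evidence, and the assertion is the structured claim derived from it.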

The content above was collected and summarized by 遇见数据集.
