ConceptNet 5.x Raw Data
Source: Mendeley Data. Updated 2024-03-27; indexed 2024-06-27.
Download link:
https://zenodo.org/record/3739540
Description:
This archive contains the raw data that ConceptNet 5 is built from. More information about ConceptNet is available at http://conceptnet.io.
If you use ConceptNet as part of another work, you must attribute ConceptNet and you must not restrict its license terms. For more license information: https://creativecommons.org/licenses/by-sa/4.0/
ConceptNet has been developed by:
* The MIT Media Lab, through various groups at different times:
- Commonsense Computing
- Software Agents
- Digital Intuition
* The Commonsense Computing Initiative, a worldwide collaboration with contributions from:
- National Taiwan University
- Universidade Federal de São Carlos
- Hokkaido University
- Tilburg University
- Nihon Unisys Labs
- Dentsu Inc.
- Kyoto University
- Yahoo Research Japan
* Luminoso Technologies, Inc.
Significant amounts of data were imported from:
* WordNet, a project of Princeton University
* Wikipedia and Wiktionary, collaborative projects of the Wikimedia Foundation
* Luis von Ahn's "Games with a Purpose"
* DBpedia
* OpenCyc
* JMDict, by Jim Breen
ConceptNet also takes input from these sources of pre-computed distributional word embeddings:
- GloVe: Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. https://nlp.stanford.edu/projects/glove/
- word2vec: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In Computing Research Repository. http://dblp.org/rec/bib/journals/corr/abs-1301-3781
- fastText: Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching Word Vectors with Subword Information. http://fasttext.cc
Here is a short, incomplete list of people who have made significant contributions to the development of ConceptNet as a data resource, roughly in order of appearance:
* Push Singh
* Catherine Havasi
* Hugo Liu
* Hyemin Chung
* Robyn Speer
* Ken Arnold
* Yen-Ling Kuo
* Naoki Otani
Created: 2023-06-28



