Professional Encyclopedia Corpus
北京国际大数据交易所 (Beijing International Big Data Exchange), indexed 2024-06-15
Download link:
https://webs.bjidex.com/sys-bsc-home/#/bscConsole/tradingMarket/detail?id=2070
Resource overview:
This corpus brings together the encyclopedia publishing house's (百科社) content across scientific disciplines. It contains more than ten million knowledge points and nearly one million specialist encyclopedia entries drawn from the *Encyclopedia of China* (《中国大百科全书》, 1st, 2nd and 3rd editions), the *Chinese Military Encyclopedia* (《中国军事百科全书》), and nearly one hundred regional and industry encyclopedias. It also includes specialist material from the Chinese Social Sciences entry database (中国社会科学词条库), various character and word dictionaries, student papers, and popular science articles. After fine-grained processing and annotation, the material forms a high-quality corpus collection intended to support the training and optimization of large language models (LLMs).
Provider:
北京百科在线网络出版有限公司 (Beijing Baike Online Network Publishing Co., Ltd.)
The content above was collected and summarized by 遇见数据集.



