Wisdom Ancient Books Platform Collected Works (总集) Data
Source: Zhejiang Data Intellectual Property Registration Platform (浙江省数据知识产权登记平台). Updated 2025-04-09; indexed 2025-04-10.
Download link:
https://www.zjip.org.cn/home/announce/trends/122669
Resource description:
The dataset draws on the collected works (总集) held on the self-developed "Wisdom Ancient Books" (智慧古籍) platform. Following a knowledge-graph approach, the project applies big-data techniques, including quantitative statistics, location query, clustering query, spatial analysis, data association, network analysis, machine indexing, and crowdsourcing, to turn Chinese classical literature and the research on it into graph-structured, intelligent form. The aim is an intelligent big-data platform for ancient books that supports browsing, querying, research, and appreciation in one place and combines aesthetic reading, knowledge learning, and scenario-based experience. The platform is meant to remove barriers to reading ancient documents, make ancient book reading more widely accessible, and establish a new paradigm for reading, collating, and researching ancient books. It also aims to bring scholars' research into wider use, carrying current scholarship beyond academic circles and turning it into a cultural resource shared by the general public.
1. Apply big-data techniques to the collected works data, including quantitative statistics, location query, clustering query, spatial analysis, data association, network analysis, machine indexing, and crowdsourcing, to turn Chinese classical literature and research results into graph-structured, intelligent form.
2. Use OCR (optical character recognition) to convert text in images into text format: extant copies of the Siku Quanshu (四库全书, Complete Library in the Four Branches of Literature) are collected, organized, and converted into text data.
3. Intelligent indexing: structured databases of place names, personal names, official titles, dictionaries, and the like are used to index uploaded texts in batches. Using non-repeating numbered codes and the four identifiers X, Y, ID1, and ID2, the text data are encoded so that the text is linked to the back-end data. Readers can click an indexed word or phrase on the platform to view its definition.
4. Spatial analysis: using geographic information system software such as ArcGIS and QGIS together with online geographic information systems, the project records author name, dynasty, region, provincial capital, and related information so that the geographic information preserved in ancient books can be visualized.
5. The "Wisdom Ancient Books" platform is connected to the academic map publishing platform and records data including division (部), collection (集), title, abstract, source, and person in charge. Clicking an author's details displays the linked map of that person's travels. The same technology supports the place-name definition feature for ancient book texts.
Documents uploaded to the "Wisdom Ancient Books" platform go through OCR recognition and proofreading, table-of-contents extraction, machine punctuation, punctuation proofreading and expert review, machine indexing, and indexing proofreading and expert review before being published on the front end, which completes the platform's database construction.
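The intelligent-indexing step in item 3 can be illustrated with a minimal sketch. It is not the platform's implementation: the description names the four identifiers X, Y, ID1, and ID2 but does not define them, so the code assumes X and Y are the start and end offsets of a match, ID1 identifies the database, and ID2 identifies the entry within it. The tiny databases, their numbering, and the greedy longest-match strategy are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical miniature databases; the platform's real ones are not described.
# Outer key = assumed ID1 (which database), inner value = assumed ID2 (entry).
DATABASES = {
    1: {"杭州": 101, "長安": 102},  # place names
    2: {"蘇軾": 201, "李白": 202},  # personal names
    3: {"刺史": 301},               # official titles
}


@dataclass(frozen=True)
class Annotation:
    x: int    # assumed meaning: start offset of the match in the text
    y: int    # assumed meaning: end offset of the match (exclusive)
    id1: int  # assumed meaning: database the term was found in
    id2: int  # assumed meaning: entry number inside that database


def build_lexicon(databases):
    """Flatten the databases into one term -> (id1, id2) lookup table."""
    lexicon = {}
    for db_id, entries in databases.items():
        for term, entry_id in entries.items():
            # If a term appears in several databases, keep the first one seen.
            lexicon.setdefault(term, (db_id, entry_id))
    return lexicon


def index_text(text, lexicon):
    """Tag known terms in text, preferring the longest match at each position."""
    max_len = max(map(len, lexicon), default=0)
    annotations = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            hit = lexicon.get(text[i:i + length])
            if hit:
                annotations.append(Annotation(i, i + length, *hit))
                i += length
                break
        else:
            i += 1  # no known term starts here; move on one character
    return annotations


LEXICON = build_lexicon(DATABASES)
sample = "蘇軾知杭州"
annotations = index_text(sample, LEXICON)
for a in annotations:
    print(sample[a.x:a.y], a)
```

Under these assumptions, each annotation carries enough information for a front end to highlight the span from X to Y and fetch the matching definition by (ID1, ID2) when a reader clicks it.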
Provider:
杭州云四库科技有限公司 (Hangzhou Yunsiku Technology Co., Ltd.)
Created:
2025-01-09
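Collected summary
Dataset introduction

Background and challenges
Background overview
The Wisdom Ancient Books platform collected works data is enterprise data produced in-house by 杭州云四库科技有限公司 (Hangzhou Yunsiku Technology Co., Ltd.). It contains 593 records in xlsx format. Built on the Wisdom Ancient Books platform, the dataset uses big-data and knowledge-graph techniques to turn Chinese classical literature and research results into graph-structured, intelligent form, with the aim of making ancient book reading more widely accessible and establishing a new research paradigm.
The above content was collected and summarized by 遇见数据集.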



