Siku Quanshu Individual Collections (Bieji) Category: Smart Ancient Books Platform Data (四库全书别集类智慧古籍平台数据)

Source: Zhejiang Provincial Data Intellectual Property Registration Platform | Updated 2024-12-05 | Indexed 2024-12-06
Download link:
https://www.zjip.org.cn/home/announce/trends/96627
Resource description:
Built on knowledge-graph principles, the project applies big data techniques (quantitative statistics, location queries, cluster queries, spatial analysis, data association, network analysis, machine indexing, and crowdsourcing) to turn classical Chinese texts and related research into graph-structured, machine-assisted resources. The result is a smart big data platform for ancient books that brings together browsing, querying, research, and appreciation, and combines aesthetic reading, knowledge learning, and scenario-based experience. It aims to remove obstacles to reading ancient texts, make the reading of ancient books more widespread, and establish a new paradigm for reading, collating, and researching them. It also puts scholars' research output to work, breaking through the barriers of academic circles and turning cutting-edge research into cultural resources shared by the general public.

1. Big data techniques: the techniques listed above are applied together to build knowledge graphs from classical Chinese texts and research results.
2. OCR: optical character recognition (OCR) converts the text in page images into text format. The content of extant editions of the Siku Quanshu (Complete Library of the Four Treasuries) is collected, organized, and converted into text data.
3. Intelligent indexing: structured databases of place names, personal names, official titles, dictionary entries, and similar resources are used to index uploaded texts in batches. Each indexed item receives a unique code, and the text data is encoded with four identifiers (X, Y, ID1, ID2) that link the text to the backend data. Readers can click an indexed word or phrase on the platform to view its explanation.
4. Spatial analysis: geographic information system (GIS) software such as ArcGIS and QGIS, combined with online GIS, is used to record author names, dynasties, regions, provincial capitals, and related information, so that the geographic information preserved in ancient books can be visualized.
5. Academic map link: the Smart Ancient Books Platform is connected to the Academic Map Publishing Platform and records data including category (部), collection (集), title, abstract, source, and person in charge. Clicking an author's details shows the linked map of that person's travels. The same technology is used for the place-name explanation feature in ancient book texts.

Materials uploaded to the "Smart Ancient Books Platform" go through OCR recognition and proofreading, table-of-contents extraction, machine punctuation, punctuation proofreading and expert review, machine indexing, and indexing proofreading and expert review before being published on the front end. This completes the database construction of the Smart Ancient Books Platform.
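
The intelligent indexing step (item 3) can be illustrated with a minimal sketch. The listing does not define what X, Y, ID1, and ID2 mean, so everything here is an assumption made for illustration: X is taken as the line number, Y as the character offset within the line, ID1 as a category code, and ID2 as an entry number in a hypothetical backend dictionary. The `GAZETTEER` contents, the `Annotation` class, and the `index_text` function are all invented names and are not part of the platform.

```python
from dataclasses import dataclass

# Hypothetical backend dictionary: term -> (ID1, ID2).
# Assumption: ID1 is a category code (1 = place name, 2 = personal name,
# 3 = official title) and ID2 is a unique entry number within that category.
GAZETTEER = {
    "杭州": (1, 1001),
    "蘇軾": (2, 2001),
    "太守": (3, 3001),
}


@dataclass(frozen=True)
class Annotation:
    x: int    # assumed meaning: 0-based line number
    y: int    # assumed meaning: 0-based character offset within the line
    id1: int  # assumed meaning: category code
    id2: int  # assumed meaning: entry number in the backend dictionary
    term: str


def index_text(lines, gazetteer=GAZETTEER):
    """Batch-index lines by greedy longest-match dictionary lookup."""
    max_len = max(len(term) for term in gazetteer)
    annotations = []
    for x, line in enumerate(lines):
        y = 0
        while y < len(line):
            # Try the longest candidate first, then shorter ones.
            for size in range(min(max_len, len(line) - y), 0, -1):
                term = line[y:y + size]
                if term in gazetteer:
                    id1, id2 = gazetteer[term]
                    annotations.append(Annotation(x, y, id1, id2, term))
                    y += size
                    break
            else:
                # No dictionary term starts here; move one character on.
                y += 1
    return annotations


sample = ["蘇軾知杭州", "太守宴客"]
result = index_text(sample)
for annotation in result:
    print(annotation)
```

In a sketch like this, a front end could use the (X, Y) pair to locate the clickable span and the (ID1, ID2) pair to fetch the explanation from the backend. How the platform itself resolves the four identifiers is not stated in the listing.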
Provided by:
杭州云四库科技有限公司 (Hangzhou Yun Siku Technology Co., Ltd.)
Created:
2024-10-08