Most Comprehensive Chinese Classical Poetry and Literature Database | Chinese Poetry Dataset | Pre-training Dataset
库帕思 | Updated 2025-12-22 | Indexed 2025-12-27
Download link:
https://www.kupasai.com/corpus/detail?id=622&type=1
Resource description:
This dataset is presented as the most comprehensive database of Chinese classical literature currently available. It contains 55,000 Tang poems, 260,000 Song ci (lyric poems), and 21,000 other classical Chinese poems, covering nearly 14,000 poets from the Tang and Song dynasties and 1,500 ci poets of the Northern and Southern Song. The corpus is large, carries full author information, and draws on authoritative text sources, which gives it a high degree of completeness and academic value. It is suited to classical literary studies, poetry generation, natural language processing, and cultural data analysis, and it supports language model training and computational humanities applications.
Provider:
库帕思
Created:
2025-12-18



