kotobase-dict
Source: Hugging Face. Updated 2026-03-23; indexed 2026-03-24.
Download link:
https://huggingface.co/datasets/4pplet/kotobase-dict
Resource summary:
Kotobase Dictionary is a pre-built SQLite dictionary database made for the Kotobase application, an offline-first Korean, Japanese, and Chinese dictionary and vocabulary learning tool. It contains over 4 million dictionary entries across Korean, Japanese, and Chinese, together with more than 4.8 million definitions written in English, Korean, Japanese, and other languages. It also includes example sentences from Tatoeba, TOPIK level tags (A/B/C) for Korean vocabulary, frequency rankings derived from Leipzig Corpora data, and full-text search through FTS5. The content is drawn from several open-source and open-data dictionaries, including KRDict, JMDict, CEDICT, Wiktionary/Kaikki, Tatoeba, and OpenDict, and all data is licensed under CC-BY-SA-4.0. The dataset is suited to language learning, vocabulary lookup, and building multilingual dictionaries.
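Since the summary mentions full-text search via FTS5, the following is a minimal sketch of what such a lookup might look like from Python. The listing does not document the actual schema or file name, so the table `entries_fts`, its columns (`headword`, `lang`, `gloss`), and the sample rows are hypothetical stand-ins built in memory. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a HYPOTHETICAL FTS5 table.

    The real kotobase-dict schema is not described in the listing;
    table and column names here are illustrative assumptions only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE entries_fts USING fts5(headword, lang, gloss)"
    )
    conn.executemany(
        "INSERT INTO entries_fts (headword, lang, gloss) VALUES (?, ?, ?)",
        [
            ("사전", "ko", "dictionary"),
            ("辞書", "ja", "dictionary; lexicon"),
            ("词典", "zh", "dictionary"),
            ("学校", "ja", "school"),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 10):
    """Run an FTS5 MATCH query and return rows ordered by relevance."""
    cursor = conn.execute(
        "SELECT headword, lang, gloss FROM entries_fts "
        "WHERE entries_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for headword, lang, gloss in search(demo, "dictionary"):
        print(lang, headword, gloss)
```

To try the same idea against the downloaded database, one would open the real file with `sqlite3.connect(...)` and inspect `sqlite_master` first to find the actual table names, rather than relying on the names assumed here.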
Created:
2026-03-20



