primordic/WordDefinitionsLite
Source: Hugging Face | Updated: 2025-07-23 | Indexed: 2025-11-29
Download link:
https://hf-mirror.com/datasets/primordic/WordDefinitionsLite
Description:
WordDefinitionsLite is a large dataset containing definitions for 200K of the most commonly used words on the internet. It combines entries from open dictionaries with synthetic definitions generated by LLMs, which are especially useful for words not found in traditional dictionaries. An extensive manual validation process is used to ensure the quality of the synthetic results. The dataset can be used for purposes such as text embeddings, educational content, and word puzzles or crosswords.
Provider:
primordic



