Reubencf/multilingual-single-sentences
Source: Hugging Face · Updated 2026-04-23 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/Reubencf/multilingual-single-sentences
Description:
This dataset is a remastered version consisting of single-sentence completions in a range of languages, including Russian, German, Spanish, Korean, Chinese, English, French, and Japanese. Topics vary widely, from historical restoration and travel logistics to proverbs and daily observations. Each entry is an isolated text completion with no additional context and no paired prompt. The dataset contains 10,274 data points, with a quality grade of A and a relative quality improvement of 55.0%. The domain distribution includes personal-growth (12%), architecture-design (10%), and science (8%). The language distribution is led by Japanese (26%), Russian (16%), and Korean (12%). The most frequent tones are informative (28%), helpful (10%), and philosophical (8%).
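As a rough illustration of what the language percentages mean in absolute terms, the sketch below converts the quoted shares into approximate entry counts. It uses only the figures stated in the description; the names `TOTAL_ENTRIES`, `LANGUAGE_SHARE`, and `approx_counts` are hypothetical, and because the published percentages are rounded, the resulting counts are estimates rather than exact values.

```python
# Figures taken from the dataset description; everything else is illustrative.
TOTAL_ENTRIES = 10_274

# Rounded percentage shares quoted for the three most common languages.
LANGUAGE_SHARE = {"Japanese": 26, "Russian": 16, "Korean": 12}


def approx_counts(total, shares):
    """Convert rounded percentage shares into approximate entry counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


language_counts = approx_counts(TOTAL_ENTRIES, LANGUAGE_SHARE)

# Share left over for all languages that the description does not quantify.
other_share = 100 - sum(LANGUAGE_SHARE.values())

for name, count in language_counts.items():
    print(f"{name}: about {count} entries")
print(f"Other languages combined: about {other_share}% of entries")
```

The same helper could be applied to the domain and tone percentages, with the same caveat about rounding.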
Provider:
Reubencf



