mjbommar/opengloss-v1.1-drafting
Source: Hugging Face · Updated 2025-12-13 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/mjbommar/opengloss-v1.1-drafting
Description:
OpenGloss Drafting is a synthetic dataset of educational content drafts generated from vocabulary terms and encyclopedia entries. Each draft is a self-contained piece of writing (article, story, memo, essay, etc.) that naturally incorporates specific vocabulary terms with their definitions and context. This dataset supports curriculum-aligned content generation, vocabulary-in-context learning, and educational text synthesis. It contains 27,635 draft records, 48.7 million total words, 37,504 unique vocabulary terms, 23 artifact types, 6 complexity levels (elementary through professional), and 431 unique target audiences. The dataset includes original drafts, audience-adapted rewrites, and reasoning-augmented drafts.
Provider:
mjbommar



