admin-strator/adaption-mixed-domain-text-snippets
Source: Hugging Face. Updated: 2026-04-30. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/admin-strator/adaption-mixed-domain-text-snippets
Resource description:
This dataset is a heterogeneous collection of short text excerpts spanning diverse domains, including personal blogs, software documentation, news articles, product descriptions, military reports, and legal notices. The samples vary significantly in style, tone, and subject matter, ranging from informal narratives to formal regulatory updates. Each entry is an independent text segment suitable for tasks such as domain classification, style transfer, or general language modeling. The dataset contains 124 data points, with a quality grade of B and a relative quality improvement of 335.0%. The listed domain shares are Science (24%), Governance (12%), and Music (6%). The language is English (100%), and the listed tone shares are Informative (35%), Analytical (18%), and Engaging (6%).
Provider:
admin-strator



