rntc/pubmed-quality-annotations-Llama-3.3-70B-Instruct-2
Source: Hugging Face. Updated: 2025-10-09. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/rntc/pubmed-quality-annotations-Llama-3.3-70B-Instruct-2
Description:
The dataset contains text content with accompanying explanations, plus scores for educational relevance, writing quality, content richness, and terminology precision. It also includes boolean flags indicating whether a text needs rewriting, whether it is suitable for pre-training, and whether it contains bias, along with categorical labels such as writing style, content type, and medical subfield. Metadata fields include age, gender, text type, ID, article ID, and path. The dataset has a single training split of 630,000 samples and a default configuration that specifies the file paths of the training data.
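As a rough illustration of how the boolean flags and scores described above might be used, the sketch below filters rows down to those marked suitable for pre-training, free of bias, and above a score threshold. The column names (`pretraining_suitable`, `contains_bias`, `educational_score`, `id`) and the threshold are assumptions made for this example; the listing does not give the actual column identifiers or the score scale, so check the dataset's schema before relying on them. The loader uses `load_dataset` from the Hugging Face `datasets` library and needs network access, so it is kept separate from the filter.

```python
from typing import Any, Dict, Iterable, Iterator

DATASET_ID = "rntc/pubmed-quality-annotations-Llama-3.3-70B-Instruct-2"


def select_pretraining_rows(
    rows: Iterable[Dict[str, Any]],
    min_educational_score: int = 3,  # assumed threshold; the score scale is not stated
) -> Iterator[Dict[str, Any]]:
    """Yield rows flagged as pre-training suitable, unbiased, and scored high enough.

    All column names used here are hypothetical placeholders.
    """
    for row in rows:
        if not row.get("pretraining_suitable", False):
            continue
        if row.get("contains_bias", False):
            continue
        score = row.get("educational_score")
        if score is None or score < min_educational_score:
            continue
        yield row


def load_train_rows():
    """Stream the training split (requires `pip install datasets` and network access)."""
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train", streaming=True)


if __name__ == "__main__":
    # Tiny in-memory sample with made-up values, standing in for real rows.
    sample = [
        {"id": "a", "pretraining_suitable": True, "contains_bias": False, "educational_score": 4},
        {"id": "b", "pretraining_suitable": False, "contains_bias": False, "educational_score": 5},
        {"id": "c", "pretraining_suitable": True, "contains_bias": True, "educational_score": 5},
        {"id": "d", "pretraining_suitable": True, "contains_bias": False, "educational_score": 2},
    ]
    print([row["id"] for row in select_pretraining_rows(sample)])
```

To apply the same filter to the real data, pass the result of `load_train_rows()` to `select_pretraining_rows` after replacing the placeholder column names with the ones in the dataset's schema.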
Provider:
rntc



