
Composition Data (作文数据)

Source: Beijing International Big Data Exchange (北京国际大数据交易所); indexed 2024-12-31
Download link:
https://webs.bjidex.com/sys-bsc-home/#/bscConsole/tradingMarket/detail?id=4021
Resource description:
This K-12 (primary and secondary school) composition data product is designed for training educational large language models (LLMs) and for a range of educational applications. It covers composition data for junior high school Chinese, junior high school English, and upper primary school (grades 4-6) Chinese.

The junior high school Chinese section provides a varied set of composition prompts covering common genres such as narrative and argumentative essays. Each prompt comes with a large number of student responses collected class by class from large-scale joint exams. The responses span every tier of the local prefecture-level grading standards for the senior high school entrance examination, for example Henan's, from Category 1 (45-50 points) to Category 6 (0-10 points), so that samples are available across the full score range for teaching analysis. Prompts and responses are selected for legible handwriting and tidy answer sheets to ensure data quality.

The junior high school English section likewise has a sizable number of composition prompts, each with many student responses collected class by class. The responses cover every tier of the local senior high school entrance examination grading standards, from Category 1 (1-5 points) to Category 4 (16-20 points). The data come from large-scale joint exams, with legible, tidy responses selected preferentially, and are intended to support English composition teaching research and model training.

The upper primary school Chinese section includes a relatively large number of composition prompts, a share of which are letter-writing tasks. Each prompt has many student responses collected class by class, covering every tier of local grading standards, such as those used in the Pearl River Delta region, from Category A (27-30 points) to Category D (below 18 points). The data come from large-scale joint exams, with priority given to legible, tidy responses, and are intended to support upper primary school composition teaching and research.

1. The composition data product is derived from answer sheets submitted by students through the "Error Notebook" (错题本) service. The sheets are anonymized during processing and contain no personal information. The test question product is derived from test questions uploaded by teachers through the same service.
2. The attached privacy policy is accepted by users with a click when they register and log in to the app, and minors are prompted to read it accompanied by an adult. The data authorization policy is shown when a user applies to use the "Error Notebook" feature, and the processing and use of the test question and answer sheet data are fully authorized.
Provider:
安徽七天网络科技有限公司
Collected Summary
Dataset Introduction
Background and Challenges
Background Overview
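This dataset is a K-12 composition data product designed for training educational large models and for teaching applications. It covers composition prompts and student responses for junior high school Chinese, junior high school English, and upper primary school Chinese. The data come from large-scale joint exams, span every grading tier from high scores to low scores, and have been anonymized. Responses are selected for legible handwriting and tidy answer sheets, providing ample samples for teaching analysis and research.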
The content above was collected and summarized by 遇见数据集.