
Small Corpus of Colombian English as a Second Language Essays (SCoCESLE)

Source: DataCite Commons. Updated 2025-05-01; indexed 2025-05-17.
Download link:
https://data.mendeley.com/datasets/wfcbfy29wm
Description:
The Small Corpus of Colombian English as a Second Language Essays (SCoCESLE) is a small learner corpus. It comprises 272 argumentative essays written by Colombian English as a Second Language (ESL) learners, with a total of 81,994 tokens, 6,057 types, and 5,161 lemmas. Essays average about 270 words each.

Essay topics include gender-related issues, education, information technology, environmental problems, personality traits, poverty, genetic engineering, globalisation, pet ownership, compulsory vaccination, transportation, compulsory military conscription, immigration, job satisfaction, the economy, foreign language learning, and employees' working conditions.

The texts were written by male (n=157), female (n=114), and gender-fluid (n=1) adult learners (aged 18+) whose first language is Colombian Spanish. The corpus is unannotated and is divided into a lower proficiency sub-corpus (n=133) and a higher proficiency sub-corpus (n=139).

The data presented here include:
1. The corpus manual (.pdf)
2. The corpus metadata (.xls)
3. The corpus of 272 unannotated plain texts (.txt)
4. The sub-corpus of 133 lower proficiency texts (.txt)
5. The sub-corpus of 139 higher proficiency texts (.txt)
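
As a rough illustration of how token, type, and average-length figures like those above can be derived from plain-text files, here is a minimal Python sketch. The tokenizer (a lowercase regular expression) and the function name `corpus_stats` are assumptions made for this example; the corpus manual may define tokens differently, so counts produced this way will not necessarily match the published figures.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase alphabetic strings, keeping internal
# apostrophes (e.g. "don't"). The corpus authors' tokenizer is not known.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def corpus_stats(texts):
    """Return token count, type count, type-token ratio, and mean text length."""
    counts = Counter()
    for text in texts:
        counts.update(TOKEN_RE.findall(text.lower()))
    tokens = sum(counts.values())   # running words
    types = len(counts)             # distinct word forms
    return {
        "tokens": tokens,
        "types": types,
        "ttr": types / tokens if tokens else 0.0,
        "mean_tokens_per_text": tokens / len(texts) if texts else 0.0,
    }


# Two made-up sentences standing in for essays; not taken from the corpus.
sample = ["Pets are good. Pets are fun.", "Education is important."]
stats = corpus_stats(sample)
print(stats)
```

To run this over the downloaded files, the texts could be loaded with something like `[p.read_text(encoding="utf-8") for p in Path("corpus_dir").glob("*.txt")]`, where `corpus_dir` is a placeholder folder name and the UTF-8 encoding is an assumption to check against the corpus manual.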
Provider:
Mendeley Data
Created:
2023-11-06