
crklih/turkish-cefr-phrases

Source: Hugging Face. Updated 2026-03-08. Indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/crklih/turkish-cefr-phrases
Resource description:
# Turkish CEFR Phrases

A dataset of Turkish phrases extracted from YouTube videos, labeled with CEFR proficiency levels (A1–C2).

## Dataset Details

- **Language:** Turkish
- **Size:** 174,526 phrases
- **Labels:** A1, A2, B1, B2, C1, C2

## Label Distribution

| Level | Count  | %     |
|-------|--------|-------|
| A2    | 70,626 | 40.5% |
| B1    | 57,758 | 33.1% |
| B2    | 34,728 | 19.9% |
| A1    | 7,016  | 4.0%  |
| C1    | 4,304  | 2.5%  |
| C2    | 94     | 0.05% |

## Data Fields

- `phrase_text`: Turkish phrase
- `cefr_level`: CEFR level label (A1 to C2)
- `cefr_confidence`: Model confidence score (avg: 0.895)

## Labeling Model

Labels were generated using [crklih/turkish-cefr-classifier](https://huggingface.co/crklih/turkish-cefr-classifier).

## Source

Phrases are short excerpts from publicly available Turkish video content on the internet. This dataset is intended for linguistic research and language learning purposes only.

## License

[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
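
## Example: Checking the Distribution and Filtering by Confidence

Because the labels are model-generated and heavily skewed toward A2 and B1, it may help to recompute the label shares and to drop low-confidence rows before use. The following is a minimal sketch, not part of the dataset's own tooling. It assumes rows are plain dictionaries keyed by the three documented fields. The helper names `label_shares` and `filter_phrases`, the 0.9 threshold, and the sample rows (including their levels and scores) are hypothetical and chosen only for illustration. The counts are copied from the Label Distribution table.

```python
# Counts copied from the Label Distribution table in this card.
LABEL_COUNTS = {
    "A1": 7_016,
    "A2": 70_626,
    "B1": 57_758,
    "B2": 34_728,
    "C1": 4_304,
    "C2": 94,
}


def label_shares(counts):
    """Return each label's share of the total, as a percentage."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def filter_phrases(rows, levels, min_confidence=0.9):
    """Keep rows whose level is in `levels` and whose confidence
    is at least `min_confidence` (threshold is an assumption)."""
    wanted = set(levels)
    return [
        row
        for row in rows
        if row["cefr_level"] in wanted
        and row["cefr_confidence"] >= min_confidence
    ]


# Hypothetical rows for illustration only; they are NOT taken from the dataset.
SAMPLE_ROWS = [
    {"phrase_text": "Merhaba, nasılsın?", "cefr_level": "A1", "cefr_confidence": 0.97},
    {"phrase_text": "Bugün hava çok güzel.", "cefr_level": "A2", "cefr_confidence": 0.91},
    {"phrase_text": "Bunu daha önce hiç düşünmemiştim.", "cefr_level": "B1", "cefr_confidence": 0.72},
]

if __name__ == "__main__":
    # The per-level counts should add up to the stated dataset size.
    print("total phrases:", sum(LABEL_COUNTS.values()))
    for label, share in sorted(label_shares(LABEL_COUNTS).items()):
        print(f"{label}: {share:.2f}%")

    # Keep only beginner-level phrases the classifier was confident about.
    for row in filter_phrases(SAMPLE_ROWS, ["A1", "A2"]):
        print(row["cefr_level"], row["phrase_text"])
```

The same filter could be applied to rows loaded with a library such as Hugging Face `datasets`, provided each row exposes the `phrase_text`, `cefr_level`, and `cefr_confidence` fields. With only 94 C2 phrases, any per-level sampling will likely need to handle that class separately.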
Provider: crklih