
lmvasque/coh-metrix-esp

Source: Hugging Face | Updated: 2022-11-11 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/lmvasque/coh-metrix-esp
Resource description:
---
license: cc-by-sa-4.0
---

## About this dataset

The dataset Coh-Metrix-Esp (Cuentos) [(Quispesaravia et al., 2016)](https://aclanthology.org/L16-1745/) is a collection of 100 documents consisting of 50 children's fables ("simple" texts) and 50 stories for adults ("complex" texts) scraped from the web.

If you use this data, please credit the original website and our work as well (see citations below).

## Citation

If you use our splits in your research, please cite our work: "[A Benchmark for Neural Readability Assessment of Texts in Spanish](https://drive.google.com/file/d/1KdwvqrjX8MWYRDGBKeHmiR1NCzDcVizo/view?usp=share_link)".

```
@inproceedings{vasquez-rodriguez-etal-2022-benchmarking,
    title = "A Benchmark for Neural Readability Assessment of Texts in Spanish",
    author = "V{\'a}squez-Rodr{\'\i}guez, Laura and
      Cuenca-Jim{\'e}nez, Pedro-Manuel and
      Morales-Esquivel, Sergio Esteban and
      Alva-Manchego, Fernando",
    booktitle = "Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), EMNLP 2022",
    month = dec,
    year = "2022",
}
```

#### Coh-Metrix-Esp (Cuentos)

```
@inproceedings{quispesaravia-etal-2016-coh,
    title = "{C}oh-{M}etrix-{E}sp: A Complexity Analysis Tool for Documents Written in {S}panish",
    author = "Quispesaravia, Andre and
      Perez, Walter and
      Sobrevilla Cabezudo, Marco and
      Alva-Manchego, Fernando",
    booktitle = "Proceedings of the Tenth International Conference on Language Resources and Evaluation ({LREC}'16)",
    month = may,
    year = "2016",
    address = "Portoro{\v{z}}, Slovenia",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://aclanthology.org/L16-1745",
    pages = "4694--4698",
}
```

You can also find more details about the project in our [GitHub](https://github.com/lmvasque/readability-es-benchmark).
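For readers who want to pull the data programmatically, the following is a minimal sketch rather than an official recipe. It assumes the Hugging Face `datasets` library is installed and that the repository loads without an explicit configuration name; neither is confirmed by this page. The helper `repo_id_from_url` and the constant `MIRROR_URL` are hypothetical names introduced only for this illustration. Split and column names are not documented here, so the sketch prints the loaded object instead of assuming them.

```python
from urllib.parse import urlparse

# Download link as listed on this page (a mirror of the Hugging Face Hub).
MIRROR_URL = "https://hf-mirror.com/datasets/lmvasque/coh-metrix-esp"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repository id from a dataset page URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Dataset pages are expected to look like /datasets/<owner>/<name>.
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def load_coh_metrix_esp():
    """Load the dataset by repository id (assumes no config name is required)."""
    # Imported lazily so repo_id_from_url works without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id_from_url(MIRROR_URL))


if __name__ == "__main__":
    ds = load_coh_metrix_esp()
    # Inspect the available splits and columns before relying on any of them.
    print(ds)
```

If the load step fails, the GitHub repository linked above is the place to check for the expected file layout.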
Provider:
lmvasque
Summary of original information

Dataset overview

Dataset name

Coh-Metrix-Esp (Cuentos)

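Dataset contents

  • Number of documents: 100
  • Document types
    • 50 children's fables ("simple" texts)
    • 50 stories for adults ("complex" texts)

Data source

Collected from the web

License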

CC-BY-SA-4.0

Citation information

  • References
    • Quispesaravia, Andre, et al. "Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish." Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), May 2016, Portorož, Slovenia, pp. 4694-4698.
    • Vásquez-Rodríguez, Laura, et al. "A Benchmark for Neural Readability Assessment of Texts in Spanish." Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), EMNLP 2022, Dec 2022.

Citation requirements

When using this dataset, please credit the original website and cite both works listed above.