vgaraujov/thesis-chile
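Thesis Chile Dataset
Dataset Summary
The Thesis Chile dataset was used in part to create the DiscoEval in Spanish benchmark. It was built by scraping the titles and abstracts of Chilean theses from the public repositories of the Pontificia Universidad Católica de Chile (repositorio.uc.cl), the Universidad de Chile (repositorio.uchile.cl), and the Universidad Técnica Federico Santa María (biblioteca.usm.cl).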
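Supported Tasks
The dataset suits both discriminative and generative tasks. For classification, the title-abstract pairs make it possible to evaluate semantic similarity or entailment. For generation, an abstract can serve as the model input for producing its title (summarization).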
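As a rough illustration of the two task framings, the sketch below turns title-abstract records into labeled pairs for classification and into input/target examples for title generation. The field names `title` and `abstract`, the helper names, and the toy records are assumptions made for illustration, not the dataset's confirmed schema or contents. Sampling a mismatched abstract as a negative example is likewise one possible setup and is not something the dataset description prescribes.

```python
import random


def build_classification_pairs(records, seed=0):
    """Return (title, abstract, label) triples.

    label 1: the abstract belongs to the title.
    label 0: the abstract was drawn from a different record (assumed negative).
    """
    rng = random.Random(seed)
    pairs = []
    for i, rec in enumerate(records):
        pairs.append((rec["title"], rec["abstract"], 1))
        # Candidate indices for a mismatched abstract (every record except this one).
        others = [j for j in range(len(records)) if j != i]
        if others:
            j = rng.choice(others)
            pairs.append((rec["title"], records[j]["abstract"], 0))
    return pairs


def build_generation_examples(records):
    """Frame title generation as summarization: abstract in, title out."""
    return [{"input": r["abstract"], "target": r["title"]} for r in records]


# Placeholder records; real entries would come from the dataset itself.
toy_records = [
    {"title": "Placeholder title A", "abstract": "Placeholder abstract A."},
    {"title": "Placeholder title B", "abstract": "Placeholder abstract B."},
    {"title": "Placeholder title C", "abstract": "Placeholder abstract C."},
]

pairs = build_classification_pairs(toy_records)
generation_examples = build_generation_examples(toy_records)
```

With real data, the same helpers would take the dataset's records in place of `toy_records`, after adjusting the field names to match the actual columns.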
Citation Information
@inproceedings{araujo-etal-2022-evaluation,
    title = "Evaluation Benchmarks for {S}panish Sentence Representations",
    author = "Araujo, Vladimir and Carvallo, Andr{\'e}s and Kundu, Souvik and Ca{\~n}ete, Jos{\'e} and Mendoza, Marcelo and Mercer, Robert E. and Bravo-Marquez, Felipe and Moens, Marie-Francine and Soto, Alvaro",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.648",
    pages = "6024--6034",
}



