Name: taln-ls2n/ACL-rlg
Creator: taln-ls2n
Published: 2026-02-17 13:38:21
License: CC BY-NC 4.0

Download link:

https://hf-mirror.com/datasets/taln-ls2n/ACL-rlg

Resource description:

# ACL-rlg: A Dataset for Reading List Generation

# About

ACL-rlg is the largest dataset of expert-crafted reading lists, containing 85 reading lists manually extracted from tutorial papers submitted to ACL-related conferences between 2020 and 2024. Data was sourced from the [ACL Anthology](https://aclanthology.org/) and cross-referenced with [Semantic Scholar](https://www.semanticscholar.org/), enabling the extraction of metadata for articles beyond the ACL collection.

# Content

The following data fields are available:

| **Field** | **Type** | **Description** |
| --- | --- | --- |
| `id` | `string` | Unique identifier of the tutorial paper in the ACL Anthology. |
| `title` | `string` | Title of the tutorial paper. |
| `abstract` | `string` | Abstract of the tutorial paper. |
| `year` | `int64` | Year of publication. |
| `url` | `string` | ACL Anthology link to the paper. |
| `venues` | `string` | Name of the venue(s) in which the tutorial paper is published. |
| `reading_list` | `list[object]` | Reading list provided by the authors of the paper. Each record includes:<br>• `corpusid` (`int64`): Semantic Scholar corpus ID.<br>• `paperId` (`string`): Semantic Scholar paper ID.<br>• `title` (`string`): Title of the referenced paper.<br>• `abstract` (`string`): Abstract of the referenced paper.<br>• `authors` (`list[object]`): Information about the referenced paper's authors.<br>• `venue` (`string`): Name of the venue in which the referenced paper is published.<br>• `year` (`int64`): Year of publication of the referenced paper.<br>• `in_acl` (`bool`): Whether the referenced paper is indexed in the ACL Anthology.<br>• `citationCount` (`int64`): Citation count of the referenced paper, retrieved from the Semantic Scholar API.<br>• `section` (`string`): Name of the reading-list section in which the referenced paper is listed.<br>• `subsection` (`string`): Name of the reading-list subsection in which the referenced paper is listed. |

## Licence

Dataset: CC BY-NC 4.0

You may use, share, and adapt the dataset for non-commercial research or educational purposes only.

## Citation

```
@inproceedings{aubert-beduchaud-etal-2025-acl,
    title = "{ACL}-rlg: A Dataset for Reading List Generation",
    author = "Aubert-B{\'e}duchaud, Julien and
      Boudin, Florian and
      Daille, B{\'e}atrice and
      Dufour, Richard",
    editor = "Rambow, Owen and
      Wanner, Leo and
      Apidianaki, Marianna and
      Al-Khalifa, Hend and
      Eugenio, Barbara Di and
      Schockaert, Steven",
    booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
    month = jan,
    year = "2025",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.coling-main.327/",
    pages = "4910--4919",
    abstract = "Familiarizing oneself with a new scientific field and its existing literature can be daunting due to the large amount of available articles. Curated lists of academic references, or reading lists, compiled by experts, offer a structured way to gain a comprehensive overview of a domain or a specific scientific challenge. In this work, we introduce ACL-rlg, the largest open expert-annotated reading list dataset. We also provide multiple baselines for evaluating reading list generation and formally define it as a retrieval task. Our qualitative study highlights that traditional scholarly search engines and indexing methods perform poorly on this task, and GPT-4o, despite showing better results, exhibits signs of potential data contamination."
}
```

Julien Aubert-Béduchaud, Florian Boudin, Béatrice Daille, and Richard Dufour. 2025. [ACL-rlg: A Dataset for Reading List Generation.](https://aclanthology.org/2025.coling-main.327/) In Proceedings of the 31st International Conference on Computational Linguistics, pages 4910–4919, Abu Dhabi, UAE. Association for Computational Linguistics.
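## Example: working with a record

The nested `reading_list` field described under Content can be explored with plain Python. The following is a minimal sketch, not part of the official dataset tooling: the helper names `group_by_section` and `acl_share` are hypothetical, and `sample_record` is an invented placeholder that only mirrors the documented field names (some fields are omitted for brevity). It assumes that `section` may be empty for flat reading lists.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_section(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each reading-list section name to the titles listed under it."""
    sections: Dict[str, List[str]] = defaultdict(list)
    for entry in record["reading_list"]:
        # Assumption: `section` can be missing or empty for flat reading lists.
        name = entry.get("section") or "(no section)"
        sections[name].append(entry["title"])
    return dict(sections)


def acl_share(record: Dict[str, Any]) -> float:
    """Fraction of referenced papers flagged as indexed in the ACL Anthology."""
    entries = record["reading_list"]
    if not entries:
        return 0.0
    return sum(1 for entry in entries if entry["in_acl"]) / len(entries)


# Invented placeholder record for illustration only; not real dataset content.
sample_record = {
    "id": "placeholder-id",
    "title": "Placeholder tutorial",
    "year": 2023,
    "reading_list": [
        {"title": "Paper A", "in_acl": True, "section": "Background", "subsection": ""},
        {"title": "Paper B", "in_acl": False, "section": "Background", "subsection": ""},
        {"title": "Paper C", "in_acl": True, "section": "Methods", "subsection": "Retrieval"},
        {"title": "Paper D", "in_acl": False, "section": "", "subsection": ""},
    ],
}

if __name__ == "__main__":
    print(group_by_section(sample_record))
    print(acl_share(sample_record))
```

With the Hugging Face `datasets` library installed, real records could presumably be obtained through `datasets.load_dataset("taln-ls2n/ACL-rlg")`. The available split names are not stated in this card, so inspect the returned object before indexing into it.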

Application scenarios: