
priorcomputers/openreview_raw

Source: Hugging Face. Updated 2026-03-13; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/priorcomputers/openreview_raw
Description:
---
dataset_info:
  features:
  - name: forum_id
    dtype: string
  - name: forum_title
    dtype: string
  - name: forum_authors
    sequence: string
  - name: forum_abstract
    dtype: string
  - name: forum_keywords
    sequence: string
  - name: forum_pdf_url
    dtype: string
  - name: forum_url
    dtype: string
  - name: note_id
    dtype: string
  - name: note_type
    dtype: string
  - name: note_created
    dtype: int64
  - name: note_replyto
    dtype: string
  - name: note_readers
    sequence: string
  - name: note_signatures
    sequence: string
  - name: venue
    dtype: string
  - name: year
    dtype: string
  - name: note_text
    dtype: string
  splits:
  - name: train
    num_bytes: 2565679898
    num_examples: 626430
  download_size: 758998779
  dataset_size: 2565679898
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: odc-by
language:
- en
tags:
- peer-review
- openreview
- scientific-reviews
size_categories:
- 100K<n<1M
---

# OpenReview Raw

Raw peer review data from OpenReview, covering major ML/AI venues (ICLR, NeurIPS, EMNLP, COLM, ACM MM, and more). Includes reviews, official comments, meta-reviews, and decisions for 49,023 unique papers.

Originally from [`sumukshashidhar-archive/openreview_raw`](https://huggingface.co/datasets/sumukshashidhar-archive/openreview_raw).

This dataset is a compilation of publicly available data from OpenReview. All original content and data rights belong to OpenReview. This compilation is made available under the Open Data Commons Attribution License (ODC-By). Users must attribute both this compilation and the original source (OpenReview) in any use of this dataset.

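The metadata above declares a single `default` config with one `train` split of about 2.5 GB, so streaming may be preferable to a full download. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the ID shown on this page; the helper names `is_official_review` and `stream_official_reviews` are hypothetical and not part of the dataset.

```python
# Repository ID as listed on this page.
REPO_ID = "priorcomputers/openreview_raw"


def is_official_review(row):
    """Return True for rows whose note_type is 'official_review'."""
    return row["note_type"] == "official_review"


def stream_official_reviews(limit=5):
    """Yield up to `limit` official reviews without downloading the full split."""
    # Imported lazily so the predicate above works without `datasets` installed.
    from datasets import load_dataset

    # streaming=True returns an IterableDataset that fetches rows on demand.
    notes = load_dataset(REPO_ID, split="train", streaming=True)
    yield from notes.filter(is_official_review).take(limit)


# Example use (requires network access):
#     for row in stream_official_reviews(3):
#         print(row["venue"], row["forum_title"])
```

Keeping the predicate separate from the loader makes it reusable on plain dictionaries, for example when working with rows that were already exported to JSON.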
## Dataset Statistics

| Statistic | Value |
|-----------|-------|
| Total rows | 626,430 |
| Unique papers | 49,023 |
| Unique venues | 349 |
| Year range | 2013–2025 |

### Note Types

| Type | Count | % |
|------|-------|---|
| official_comment | 349,653 | 55.8% |
| official_review | 186,462 | 29.8% |
| decision | 31,450 | 5.0% |
| review | 28,616 | 4.6% |
| comment | 16,753 | 2.7% |
| meta_review | 13,496 | 2.2% |

### Top Venues

| Venue | Count |
|-------|-------|
| ICLR 2025 | 198,960 |
| ICLR 2024 | 110,570 |
| NeurIPS 2024 | 75,555 |
| NeurIPS 2023 | 64,562 |
| EMNLP 2023 | 22,742 |
| NeurIPS 2022 | 16,278 |
| ICLR 2022 | 14,593 |
| NeurIPS 2021 | 13,605 |
| ICLR 2021 | 12,275 |
| ICLR 2019 | 11,916 |

### Year Distribution

| Year | Count |
|------|-------|
| 2013 | 373 |
| 2014 | 651 |
| 2016 | 295 |
| 2017 | 626 |
| 2018 | 1,158 |
| 2019 | 14,284 |
| 2020 | 12,979 |
| 2021 | 35,943 |
| 2022 | 44,621 |
| 2023 | 96,525 |
| 2024 | 219,635 |
| 2025 | 199,340 |

### Note Text Length (characters)

| Statistic | Value |
|-----------|-------|
| Mean | 2,268 |
| Median | 2,023 |
| Min | 10 |
| Max | 56,453 |

## Schema

- **forum_id**: OpenReview forum identifier (one per paper)
- **forum_title**: Paper title
- **forum_authors**: List of paper authors
- **forum_abstract**: Paper abstract
- **forum_keywords**: Paper keywords
- **forum_pdf_url**: Link to PDF on OpenReview
- **forum_url**: Link to forum on OpenReview
- **note_id**: Unique identifier for this note (review/comment/decision)
- **note_type**: One of: `official_review`, `official_comment`, `decision`, `review`, `comment`, `meta_review`
- **note_created**: Unix timestamp (milliseconds) of note creation
- **note_replyto**: ID of the note this is replying to
- **note_readers**: List of reader groups with access
- **note_signatures**: List of note author signatures
- **venue**: Conference/venue identifier
- **year**: Publication year
- **note_text**: Full text content of the note
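
Two schema fields usually need a little handling before use: `note_created` is in milliseconds rather than seconds, and `note_replyto` links each note to its parent. The sketch below shows one possible way to convert the timestamp and group direct replies. The sample rows are invented for illustration, the helper names are hypothetical, and it assumes that top-level notes carry the forum ID in `note_replyto`, which the schema above does not state.

```python
from collections import defaultdict
from datetime import datetime, timezone


def created_at(note):
    """Convert note_created (Unix milliseconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(note["note_created"] / 1000, tz=timezone.utc)


def replies_by_parent(notes):
    """Map each parent ID to the IDs of its direct replies, oldest first."""
    children = defaultdict(list)
    for note in sorted(notes, key=lambda n: n["note_created"]):
        children[note["note_replyto"]].append(note["note_id"])
    return dict(children)


# Hypothetical rows: IDs, timestamps, and types are invented for this example.
# Assumption: top-level notes reply to the forum ID ("F1").
sample_notes = [
    {"forum_id": "F1", "note_id": "N2", "note_type": "official_comment",
     "note_created": 1_700_000_600_000, "note_replyto": "N1"},
    {"forum_id": "F1", "note_id": "N1", "note_type": "official_review",
     "note_created": 1_700_000_000_000, "note_replyto": "F1"},
    {"forum_id": "F1", "note_id": "N3", "note_type": "decision",
     "note_created": 1_700_100_000_000, "note_replyto": "F1"},
]

threads = replies_by_parent(sample_notes)
```

Here `threads` maps the forum ID to its top-level notes and each review to the comments posted under it; walking the mapping recursively from the forum ID would recover the full discussion tree for a paper.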
Provider:
priorcomputers