THUIR/AEOLLM
Source: Hugging Face. Updated 2026-03-17. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/THUIR/AEOLLM
Resource description:
---
license: mit
task_categories:
- question-answering
- summarization
- text-generation
language:
- en
- zh
pretty_name: aeollm
configs:
- config_name: aeollm_1
  data_files:
  - split: train
    path: "aeollm-1-train/*.csv"
  - split: test
    path: "aeollm-1-test/*.csv"
- config_name: aeollm_2
  default: true
  data_files:
  - split: train
    path: "aeollm-2-train/*.csv"
  - split: test
    path: "aeollm-2-test/*.csv"
---
The repository maintains the datasets for the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task and the NTCIR-19 Automatic Evaluation of LLMs (AEOLLM) 2 Task.
The `aeollm_1` configuration corresponds to the NTCIR-18 AEOLLM Task, and the `aeollm_2` configuration corresponds to the NTCIR-19 AEOLLM 2 Task.
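The mapping from task to configuration and data files can be written down as a small lookup. This is a minimal sketch: the glob patterns are copied from the `configs` block of the card, while the task labels `"NTCIR-18"` and `"NTCIR-19"` and the helper `files_for_task` are names assumed here for illustration, not part of the dataset or of any library.

```python
# Glob patterns copied from the `configs` block of the dataset card.
CONFIG_FILES = {
    "aeollm_1": {"train": "aeollm-1-train/*.csv", "test": "aeollm-1-test/*.csv"},
    "aeollm_2": {"train": "aeollm-2-train/*.csv", "test": "aeollm-2-test/*.csv"},
}

# Task labels are shorthand chosen for this sketch, not official identifiers.
TASK_TO_CONFIG = {"NTCIR-18": "aeollm_1", "NTCIR-19": "aeollm_2"}

# `aeollm_2` is marked `default: true` in the card.
DEFAULT_CONFIG = "aeollm_2"


def files_for_task(task=None):
    """Return (config_name, {split: glob pattern}) for a task label.

    With no task given, fall back to the card's default configuration.
    """
    config = DEFAULT_CONFIG if task is None else TASK_TO_CONFIG[task]
    return config, CONFIG_FILES[config]
```

Calling `files_for_task("NTCIR-18")` returns the `aeollm_1` configuration name together with its train and test patterns; calling it with no argument falls back to `aeollm_2`.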
For AEOLLM 2, the document corresponding to each `answerId` is available in the following Google Drive folder: [https://drive.google.com/drive/folders/1ujR5Gj889Y8RbK2eBmA-fikBQ1qcjXDe?usp=sharing](https://drive.google.com/drive/folders/1ujR5Gj889Y8RbK2eBmA-fikBQ1qcjXDe?usp=sharing).
- The train set includes human annotations that participants can reference when designing their methods.
- The test set does not contain human annotations; it is used to generate the leaderboard at [https://huggingface.co/spaces/THUIR/AEOLLM](https://huggingface.co/spaces/THUIR/AEOLLM).
You can load the datasets as follows:
```python
from datasets import load_dataset

# NTCIR-18 AEOLLM Task
ds1 = load_dataset("THUIR/AEOLLM", "aeollm_1")
train_1 = ds1["train"]
test_1 = ds1["test"]

# NTCIR-19 AEOLLM 2 Task (the default configuration)
ds2 = load_dataset("THUIR/AEOLLM", "aeollm_2")
train_2 = ds2["train"]
test_2 = ds2["test"]
```
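Since the train set carries human annotations and the test set does not, one quick way to see what differs is to compare the column names of the two splits. The sketch below assumes the annotations appear as extra columns in the train CSVs; if the test files keep those columns but leave them empty, it returns an empty list. The helper `annotation_columns` and the column names in the example are hypothetical and are not taken from the dataset.

```python
def annotation_columns(train_columns, test_columns):
    """Return columns present in train but absent from test, in train order."""
    test_set = set(test_columns)
    return [name for name in train_columns if name not in test_set]


# Hypothetical column names, for illustration only.
example_train = ["id", "question", "answer", "score"]
example_test = ["id", "question", "answer"]
extra = annotation_columns(example_train, example_test)
print(extra)
```

With the splits loaded as above, the same helper can be called as `annotation_columns(train_2.column_names, test_2.column_names)`.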
More details about AEOLLM can be found at: [https://huggingface.co/spaces/THUIR/AEOLLM](https://huggingface.co/spaces/THUIR/AEOLLM)
Provided by:
THUIR



