LM Contamination Index
Source: arXiv. Indexed: 2025-09-30
Download link:
https://hitz-zentroa.github.io/lm-contamination/
Description:
The index is periodically updated and estimates the level of data contamination across a range of open and proprietary models. To assess contamination, a model is prompted zero-shot to generate instances of a specific dataset, with the prompt spelling out the required split and format.
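As a rough illustration of the zero-shot probing idea described here, the sketch below builds a prompt that names a dataset, split, and format, then checks how closely a model's output matches a real instance. The prompt wording, the function names, the similarity measure, and the 0.9 threshold are all assumptions made for this example; the index itself may phrase its prompts and judge its results differently.

```python
from difflib import SequenceMatcher


def build_prompt(dataset: str, split: str, fmt: str) -> str:
    """Build a zero-shot prompt naming dataset, split, and format.

    The wording is hypothetical, not the index's actual prompt.
    """
    return (
        f"Please generate the first instance of the {dataset} dataset, "
        f"{split} split, in {fmt} format."
    )


def overlap_ratio(generated: str, reference: str) -> float:
    """Return a similarity score in [0, 1] between model output and a real instance."""
    return SequenceMatcher(None, generated.strip(), reference.strip()).ratio()


def looks_contaminated(generated: str, reference: str, threshold: float = 0.9) -> bool:
    """Flag near-verbatim reproduction; the threshold is an arbitrary illustration."""
    return overlap_ratio(generated, reference) >= threshold
```

The generated text would come from whichever model is being probed; a near-verbatim match with the real instance suggests the model saw that split during training, while a low score is inconclusive on its own.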
Provided by:
HiTZ Zentroa



