JCommonsenseQA
Hosted on the ModelScope community. Updated 2025-11-27; indexed 2025-02-15.
Download link:
https://modelscope.cn/datasets/sbintuitions/JCommonsenseQA
Description:
A clone published to ensure the reproducibility of evaluation scores and to release the SB Intuitions revised version.
Source: [yahoojapan/JGLUE on GitHub](https://github.com/yahoojapan/JGLUE/tree/main)
- [datasets/jcommonsenseqa-v1.1](https://github.com/yahoojapan/JGLUE/tree/v1.1.0/datasets/jcommonsenseqa-v1.1)
# JCommonsenseQA
> JCommonsenseQA is a Japanese version of CommonsenseQA (Talmor+, 2019), which is a multiple-choice question answering dataset that requires commonsense reasoning ability.
> It is built using crowdsourcing with seeds extracted from the knowledge base ConceptNet.
## Licensing Information
[Creative Commons Attribution Share Alike 4.0 International](https://github.com/yahoojapan/JGLUE/blob/main/LICENSE)
## Citation Information
```
@article{kurihara_2023,
title={JGLUE: 日本語言語理解ベンチマーク},
author={栗原 健太郎 and 河原 大輔 and 柴田 知秀},
journal={自然言語処理},
volume={30},
number={1},
pages={63-87},
year={2023},
url = "https://www.jstage.jst.go.jp/article/jnlp/30/1/30_63/_article/-char/ja",
doi={10.5715/jnlp.30.63}
}
@inproceedings{kurihara-etal-2022-jglue,
title = "{JGLUE}: {J}apanese General Language Understanding Evaluation",
author = "Kurihara, Kentaro and
Kawahara, Daisuke and
Shibata, Tomohide",
booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
month = jun,
year = "2022",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://aclanthology.org/2022.lrec-1.317",
pages = "2957--2966",
abstract = "To develop high-performance natural language understanding (NLU) models, it is necessary to have a benchmark to evaluate and analyze NLU ability from various perspectives. While the English NLU benchmark, GLUE, has been the forerunner, benchmarks are now being released for languages other than English, such as CLUE for Chinese and FLUE for French; but there is no such benchmark for Japanese. We build a Japanese NLU benchmark, JGLUE, from scratch without translation to measure the general NLU ability in Japanese. We hope that JGLUE will facilitate NLU research in Japanese.",
}
@InProceedings{Kurihara_nlp2022,
author = "栗原健太郎 and 河原大輔 and 柴田知秀",
title = "JGLUE: 日本語言語理解ベンチマーク",
booktitle = "言語処理学会第28回年次大会",
year = "2022",
url = "https://www.anlp.jp/proceedings/annual_meeting/2022/pdf_dir/E8-4.pdf",
note = "in Japanese"
}
```
# Subsets
## default
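- `q_id` (`str`): ID that uniquely identifies the question
- `question` (`str`): question text (not NFKC-normalized)
- `choice{0..4}` (`str`): answer choices (five in total, `choice0` through `choice4`) (not NFKC-normalized)
- `label` (`int`): index (0-4) of the correct choice among `choice{0..4}`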
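
The field list above can be illustrated with a minimal sketch that types one record, applies NFKC normalization (the text is stored un-normalized), and scores predictions by accuracy. Everything beyond the field names is an assumption made for illustration: the helper names (`nfkc`, `get_choices`, `format_prompt`, `accuracy`), the prompt layout, and the sample record, which is invented and not taken from the dataset.

```python
import unicodedata
from typing import TypedDict

# Field names follow the `default` subset description above.
CHOICE_KEYS = ("choice0", "choice1", "choice2", "choice3", "choice4")


class JCQARecord(TypedDict):
    q_id: str
    question: str
    choice0: str
    choice1: str
    choice2: str
    choice3: str
    choice4: str
    label: int  # index (0-4) of the correct choice


def nfkc(text: str) -> str:
    """Apply Unicode NFKC normalization to un-normalized dataset text."""
    return unicodedata.normalize("NFKC", text)


def get_choices(record: JCQARecord, normalize: bool = True) -> list[str]:
    """Return choice0..choice4 in index order, optionally NFKC-normalized."""
    texts = [record[key] for key in CHOICE_KEYS]
    return [nfkc(t) for t in texts] if normalize else texts


def format_prompt(record: JCQARecord, normalize: bool = True) -> str:
    """Render a record as a numbered multiple-choice prompt (hypothetical layout)."""
    question = nfkc(record["question"]) if normalize else record["question"]
    lines = [f"Question: {question}"]
    lines += [f"{i}. {c}" for i, c in enumerate(get_choices(record, normalize))]
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(records: list[JCQARecord], predictions: list[int]) -> float:
    """Fraction of records whose predicted index equals `label`."""
    if not records or len(records) != len(predictions):
        raise ValueError("records and predictions must be non-empty and equal in length")
    correct = sum(1 for r, p in zip(records, predictions) if r["label"] == p)
    return correct / len(records)


# Invented sample record (NOT from the dataset); full-width digits show the effect of NFKC.
sample: JCQARecord = {
    "q_id": "example-0",
    "question": "１ダースは何個？",
    "choice0": "１０",
    "choice1": "１２",
    "choice2": "２０",
    "choice3": "２４",
    "choice4": "６",
    "label": 1,
}

print(format_prompt(sample))
print(accuracy([sample], [1]))
```

Whether to normalize before scoring is a design choice for the evaluation harness; the sketch keeps it as a flag so that both the raw and the normalized text remain available.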
Provider: maas
Created: 2025-02-13



