Name: africa-intelligence/aya101-benchmarking
Creator: africa-intelligence
Published: 2024-10-01 16:21:44
License: No description available

Download link:

https://hf-mirror.com/datasets/africa-intelligence/aya101-benchmarking


Resource description:

---
pretty_name: Evaluation run of CohereForAI/aya-101
repo_url: https://huggingface.co/CohereForAI/aya-101
leaderboard_url: ''
point_of_contact: ''
configs:
- config_name: CohereForAI__aya-101__afrimgsm_direct_xho
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrimgsm_direct_xho_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrimgsm_direct_xho_2024-10-01T16-21-34.420635.jsonl'
- config_name: CohereForAI__aya-101__afrimgsm_direct_zul
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrimgsm_direct_zul_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrimgsm_direct_zul_2024-10-01T16-21-34.420635.jsonl'
- config_name: CohereForAI__aya-101__afrimmlu_direct_xho
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrimmlu_direct_xho_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrimmlu_direct_xho_2024-10-01T16-21-34.420635.jsonl'
- config_name: CohereForAI__aya-101__afrimmlu_direct_zul
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrimmlu_direct_zul_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrimmlu_direct_zul_2024-10-01T16-21-34.420635.jsonl'
- config_name: CohereForAI__aya-101__afrixnli_en_direct_xho
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrixnli_en_direct_xho_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrixnli_en_direct_xho_2024-10-01T16-21-34.420635.jsonl'
- config_name: CohereForAI__aya-101__afrixnli_en_direct_zul
  data_files:
  - split: 2024_10_01T16_21_34.420635
    path:
    - '**/samples_afrixnli_en_direct_zul_2024-10-01T16-21-34.420635.jsonl'
  - split: latest
    path:
    - '**/samples_afrixnli_en_direct_zul_2024-10-01T16-21-34.420635.jsonl'
---

# Dataset Card for Evaluation run of CohereForAI/aya-101

Dataset automatically created during the evaluation run of model [CohereForAI/aya-101](https://huggingface.co/CohereForAI/aya-101).

The dataset is composed of 6 configurations, each one corresponding to one of the evaluated tasks.

The dataset has been created from 2 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "latest" split always points to the latest results.

An additional configuration "results" stores all the aggregated results of the run.

To load the details from a run, you can for instance do the following:

```python
from datasets import load_dataset

data = load_dataset(
    "africa-intelligence/aya101-benchmarking",
    name="CohereForAI__aya-101__afrimgsm_direct_xho",
    split="latest"
)
```

## Latest results

These are the [latest results from run 2024-10-01T16-21-34.420635](https://huggingface.co/datasets/africa-intelligence/aya101-benchmarking/blob/main/CohereForAI/aya-101/results_2024-10-01T16-21-34.420635.json) (note that there might be results for other tasks in the repo if successive evals didn't cover the same tasks. You can find each in the results and in the "latest" split for each eval):

```python
{
    "all": {
        "afrimgsm_direct_xho": {
            "alias": "afrimgsm_direct_xho",
            "exact_match,remove_whitespace": 0.004,
            "exact_match_stderr,remove_whitespace": 0.004000000000000003,
            "exact_match,flexible-extract": 0.044,
            "exact_match_stderr,flexible-extract": 0.012997373846574952
        },
        "afrimgsm_direct_zul": {
            "alias": "afrimgsm_direct_zul",
            "exact_match,remove_whitespace": 0.0,
            "exact_match_stderr,remove_whitespace": 0.0,
            "exact_match,flexible-extract": 0.02,
            "exact_match_stderr,flexible-extract": 0.008872139507342683
        },
        "afrimmlu_direct_xho": {
            "alias": "afrimmlu_direct_xho",
            "acc,none": 0.316,
            "acc_stderr,none": 0.020812359515855857,
            "f1,none": 0.3121412403731796,
            "f1_stderr,none": "N/A"
        },
        "afrimmlu_direct_zul": {
            "alias": "afrimmlu_direct_zul",
            "acc,none": 0.298,
            "acc_stderr,none": 0.02047511809298895,
            "f1,none": 0.30077002468766567,
            "f1_stderr,none": "N/A"
        },
        "afrixnli_en_direct_xho": {
            "alias": "afrixnli_en_direct_xho",
            "acc,none": 0.5366666666666666,
            "acc_stderr,none": 0.020374439597383796,
            "f1,none": 0.4396227279523235,
            "f1_stderr,none": "N/A"
        },
        "afrixnli_en_direct_zul": {
            "alias": "afrixnli_en_direct_zul",
            "acc,none": 0.5433333333333333,
            "acc_stderr,none": 0.020352577627018392,
            "f1,none": 0.4400411624098575,
            "f1_stderr,none": "N/A"
        }
    },
    "afrimgsm_direct_xho": {
        "alias": "afrimgsm_direct_xho",
        "exact_match,remove_whitespace": 0.004,
        "exact_match_stderr,remove_whitespace": 0.004000000000000003,
        "exact_match,flexible-extract": 0.044,
        "exact_match_stderr,flexible-extract": 0.012997373846574952
    },
    "afrimgsm_direct_zul": {
        "alias": "afrimgsm_direct_zul",
        "exact_match,remove_whitespace": 0.0,
        "exact_match_stderr,remove_whitespace": 0.0,
        "exact_match,flexible-extract": 0.02,
        "exact_match_stderr,flexible-extract": 0.008872139507342683
    },
    "afrimmlu_direct_xho": {
        "alias": "afrimmlu_direct_xho",
        "acc,none": 0.316,
        "acc_stderr,none": 0.020812359515855857,
        "f1,none": 0.3121412403731796,
        "f1_stderr,none": "N/A"
    },
    "afrimmlu_direct_zul": {
        "alias": "afrimmlu_direct_zul",
        "acc,none": 0.298,
        "acc_stderr,none": 0.02047511809298895,
        "f1,none": 0.30077002468766567,
        "f1_stderr,none": "N/A"
    },
    "afrixnli_en_direct_xho": {
        "alias": "afrixnli_en_direct_xho",
        "acc,none": 0.5366666666666666,
        "acc_stderr,none": 0.020374439597383796,
        "f1,none": 0.4396227279523235,
        "f1_stderr,none": "N/A"
    },
    "afrixnli_en_direct_zul": {
        "alias": "afrixnli_en_direct_zul",
        "acc,none": 0.5433333333333333,
        "acc_stderr,none": 0.020352577627018392,
        "f1,none": 0.4400411624098575,
        "f1_stderr,none": "N/A"
    }
}
```

## Dataset Details

### Dataset Description

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information is needed for further recommendations.

## Citation [optional]

**BibTeX:** [More Information Needed]

**APA:** [More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]
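
The metric keys in the latest results, such as `acc,none` and `exact_match,flexible-extract`, appear to combine a metric name with a filter name, separated by a comma. The following is a minimal sketch, assuming that convention holds for every key, that splits the keys and produces one row per task, metric and filter. `RESULTS_EXCERPT` copies two entries from the results in this card; `split_metric_key` and `flatten_results` are hypothetical helper names, not part of any library.

```python
from typing import Dict, List, Tuple, Union

Value = Union[float, str]

# Two entries copied from the "Latest results" section of this card.
RESULTS_EXCERPT: Dict[str, Dict[str, Value]] = {
    "afrimgsm_direct_xho": {
        "alias": "afrimgsm_direct_xho",
        "exact_match,remove_whitespace": 0.004,
        "exact_match_stderr,remove_whitespace": 0.004000000000000003,
        "exact_match,flexible-extract": 0.044,
        "exact_match_stderr,flexible-extract": 0.012997373846574952,
    },
    "afrimmlu_direct_xho": {
        "alias": "afrimmlu_direct_xho",
        "acc,none": 0.316,
        "acc_stderr,none": 0.020812359515855857,
        "f1,none": 0.3121412403731796,
        "f1_stderr,none": "N/A",
    },
}


def split_metric_key(key: str) -> Tuple[str, str]:
    """Split 'metric,filter' into its two parts (assumed key convention)."""
    metric, _, filter_name = key.partition(",")
    return metric, filter_name


def flatten_results(
    results: Dict[str, Dict[str, Value]]
) -> List[Tuple[str, str, str, Value]]:
    """Return (task, metric, filter, value) rows, skipping the 'alias' field."""
    rows = []
    for task, metrics in results.items():
        for key, value in metrics.items():
            if key == "alias":
                continue
            metric, filter_name = split_metric_key(key)
            rows.append((task, metric, filter_name, value))
    return rows


ROWS = flatten_results(RESULTS_EXCERPT)

for task, metric, filter_name, value in ROWS:
    print(f"{task:22s} {metric:20s} {filter_name:18s} {value}")
```

Values are kept as they appear in the results, so a missing standard error stays the string `"N/A"` rather than being converted to a number.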

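
The `acc_stderr` and `exact_match_stderr` values reported in the latest results are numerically consistent with the sample standard error of a proportion, sqrt(p(1-p)/(n-1)). The sketch below checks this under the assumption that the tasks were scored on 250 (afrimgsm), 500 (afrimmlu) and 600 (afrixnli) examples. These sample sizes are not stated in the card; they are inferred here only because they reproduce the reported values. `proportion_stderr` is a hypothetical helper name.

```python
import math

# (task, metric key, score, reported stderr, ASSUMED sample size).
# Scores and standard errors are copied from the card; the sample sizes
# are assumptions and do not appear anywhere in the card.
REPORTED = [
    ("afrimgsm_direct_xho", "exact_match,remove_whitespace", 0.004, 0.004000000000000003, 250),
    ("afrimgsm_direct_xho", "exact_match,flexible-extract", 0.044, 0.012997373846574952, 250),
    ("afrimgsm_direct_zul", "exact_match,flexible-extract", 0.02, 0.008872139507342683, 250),
    ("afrimmlu_direct_xho", "acc,none", 0.316, 0.020812359515855857, 500),
    ("afrimmlu_direct_zul", "acc,none", 0.298, 0.02047511809298895, 500),
    ("afrixnli_en_direct_xho", "acc,none", 0.5366666666666666, 0.020374439597383796, 600),
    ("afrixnli_en_direct_zul", "acc,none", 0.5433333333333333, 0.020352577627018392, 600),
]


def proportion_stderr(p: float, n: int) -> float:
    """Sample standard error of a proportion p measured on n examples."""
    return math.sqrt(p * (1.0 - p) / (n - 1))


# Largest absolute gap between the recomputed and the reported standard error.
MAX_DIFF = max(
    abs(proportion_stderr(score, n) - reported)
    for _, _, score, reported, n in REPORTED
)

for task, key, score, reported, n in REPORTED:
    print(f"{task:24s} {key:32s} reported={reported:.6f} recomputed={proportion_stderr(score, n):.6f}")
```

If the assumed sample sizes are right, a 95% interval of roughly plus or minus two standard errors around `acc,none` = 0.316 spans about 0.27 to 0.36, which is worth keeping in mind when comparing the Xhosa and Zulu scores.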

Application scenarios: