International Organization Evaluation Report Dataset (IOEval) and replication data for ‘The Performance of International Organizations: A New Measure and Dataset Based on Computational Text Analysis of Evaluation Reports, Review of International Organizations, DOI: 10.1007/s11558-023-09489-1.’
Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-27.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/0SI2VX
Description:
Eckhard, Steffen; Jankauskas, Vytautas; Leuschner, Elena; Burton, Ian; Kerl, Tilman; Sevastjanova, Rita, 2023, "International Organization Evaluation Report Dataset (IOEval) and replication data for ‘The Performance of International Organizations: A New Measure and Dataset Based on Computational Text Analysis of Evaluation Reports’", https://doi.org/10.7910/DVN/0SI2VX, Harvard Dataverse, V1, UNF:6:fBGGclS7HUPoO8PEGwGFZg== [fileUNF]

This dataset contains:
• the sentence-level text of 1,082 evaluation reports published by nine international organizations of the United Nations (UN) system between 2012 and 2021;
• a fine-tuned BERT language model that classifies individual sentences in evaluation reports as containing a positive, negative or neutral assessment of the evaluated activity;
• replication files for our publication, DOI: 10.1007/s11558-023-09489-1.

When using the data, please cite: “Eckhard, Steffen; Jankauskas, Vytautas; Leuschner, Elena; Burton, Ian; Kerl, Tilman; Sevastjanova, Rita (2023). The Performance of International Organizations: A New Measure and Dataset Based on Computational Text Analysis of Evaluation Reports. Review of International Organizations, DOI: 10.1007/s11558-023-09489-1.”

Summary of the IOEval dataset:
The IOEval dataset contains the sentence-level text of 1,082 evaluation reports published by nine international organizations of the United Nations (UN) system between 2012 and 2021. Raw text was cleaned using standard natural language processing procedures (e.g., removal of special characters and numbers) and split into sentences. The text is taken from evaluation reports by the International Labour Organization (ILO), the UN Development Programme (UNDP), the UN International Children's Emergency Fund (UNICEF), the Food and Agriculture Organization (FAO), the UN Educational, Scientific and Cultural Organization (UNESCO), the World Health Organization (WHO), the International Organization for Migration (IOM), the UN High Commissioner for Refugees (UNHCR) and the UN Entity for Gender Equality and the Empowerment of Women (UN WOMEN).

For each sentence, the dataset specifies the text section it belongs to (executive summary, main text or appendix). The IOEval dataset also includes report-level metadata variables: report title, publication date, evaluation type (project, program, institutional or thematic), evaluation level (country, with the country name specified; regional; or global), and commissioning unit (centralized or decentralized).

Summary of the language model:
The fine-tuned BERT language model (Devlin et al., 2019) classifies individual sentences in evaluation reports as containing a positive, negative or neutral assessment of the evaluated activity. It was fine-tuned and evaluated on around 10,000 hand-coded sentences from evaluation reports, reaching a recall of 89 percent.
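The cleaning step mentioned in the dataset summary (removal of special characters and numbers, followed by sentence splitting) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' actual pipeline: the function names `clean_text` and `split_sentences`, the set of characters kept, and the naive punctuation-based splitter are all hypothetical choices.

```python
import re


def clean_text(raw: str) -> str:
    """Remove numbers and special characters (assumed rules, not the authors')."""
    # Drop numbers, including grouped or decimal forms such as 1,250 or 3.5.
    text = re.sub(r"\d+(?:[.,]\d+)*", " ", raw)
    # Keep ASCII letters, whitespace and basic punctuation; drop everything else.
    # Note: this assumption also strips accented letters.
    text = re.sub(r"[^A-Za-z\s.!?,;:'-]", " ", text)
    # Collapse runs of whitespace left behind by the removals.
    text = re.sub(r"\s+", " ", text)
    # Remove a stray space that now precedes punctuation.
    text = re.sub(r"\s+([.!?,;:])", r"\1", text)
    return text.strip()


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text) if part]


raw = (
    "Output 2 was achieved (see Table 4). "
    "Delivery was delayed by 6 months! "
    "Costs remained within budget."
)
sentences = split_sentences(clean_text(raw))
for sentence in sentences:
    print(sentence)
```

A production pipeline would more likely rely on a dedicated sentence tokenizer, since a rule this simple mis-splits abbreviations such as "e.g." or "No.".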
Created:
2023-06-28
Collected summary
Dataset introduction

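Background and challenges
Background overview
This dataset contains sentence-level text from 1,082 evaluation reports published between 2012 and 2021 by nine international organizations of the United Nations system, along with a fine-tuned BERT model for sentiment classification. It also includes rich metadata, such as report title, evaluation type and evaluation level, and is suited to text-analysis research on the performance of international organizations.
The summary above was collected and generated by 遇见数据集.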
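To illustrate how sentence-level labels and section metadata of this kind might feed a report-level analysis, here is a minimal sketch. It rests on several assumptions: the row layout `(report_id, section, label)`, the function name `positive_share`, and the choice to ignore neutral sentences and the appendix are all hypothetical, and the resulting share is not necessarily the performance measure defined in the publication.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (report_id, section, label). The real dataset's column
# names and label encoding may differ.
rows = [
    ("r1", "executive summary", "positive"),
    ("r1", "main text", "negative"),
    ("r1", "main text", "positive"),
    ("r1", "main text", "neutral"),
    ("r1", "appendix", "negative"),
    ("r2", "main text", "negative"),
    ("r2", "main text", "neutral"),
]


def positive_share(rows, sections=("executive summary", "main text")):
    """Share of positive sentences among positive and negative ones, per report."""
    counts = defaultdict(Counter)
    for report_id, section, label in rows:
        # Only count sentences from the selected sections.
        if section in sections:
            counts[report_id][label] += 1

    shares = {}
    for report_id, labels in counts.items():
        evaluative = labels["positive"] + labels["negative"]
        # None marks reports with no evaluative sentences at all.
        shares[report_id] = labels["positive"] / evaluative if evaluative else None
    return shares


shares = positive_share(rows)
for report_id, share in sorted(shares.items()):
    print(report_id, share)
```

The same grouping could be keyed on evaluation type or evaluation level instead of the report identifier to compare categories of reports.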



