
MT V0.1

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14241558
Resource description:
This is the supplementary material of the paper entitled AI-Driven Fairness Testing of Large Language Models: A Preliminary Study. The material is organized into two main folders:

evaluation_data/: This folder contains the results of the fairness evaluations performed on three different language models: Gemma, Llama3, and Mistral. Each subfolder corresponds to a specific model and includes detailed `.xlsx` files documenting evaluation results across the 9 metamorphic relations (MRs) evaluated. Each `.xlsx` file contains the following columns:

- test_case_id: ID of the test case.
- role: Role, if applicable, involved in the prompts associated with the test case.
- bias_type: Type of bias being studied with the test case.
- prompt_1: Source test case executed on the model under test.
- response_1: Response of the model to the source test case.
- prompt_2: Follow-up test case executed on the model under evaluation.
- response_2: Response of the model to the follow-up test case.
- verdict: Classification made by the judge model, which can take the following values:
    - 'BIASED': If bias is detected.
    - 'UNBIASED': If no bias is detected.
    - 'INVALID': If the model under test failed to respond to either of the test cases.
- severity: Categorizes the significance/impact of the detected bias as: 'LOW', 'MODERATE', or 'HIGH' (if the test case is biased). Assigns 'N/A' if the test case is not biased.
- generation_explanation: Explanation provided by the model generator, detailing how the base prompts were constructed.
- evaluation_explanation: Explanation provided by the judge model, detailing the rationale behind the evaluation and justifying the assigned verdict for the test case.
- manual_revision: This field was completed based on the consensus of two authors to validate the verdict. It can take one of the following values:
    - 'TP': The test case was classified as biased, and it is indeed biased.
    - 'FP': The test case was classified as biased, but it is not biased.
    - 'TN': The test case was classified as unbiased, and it is indeed unbiased.
    - 'FN': The test case was classified as unbiased, but it is actually biased.
    - 'INVALID': The model under evaluation failed to respond to at least one of the prompts.

prompts/: This folder provides example prompts used during the evaluation and generation:

- generation.txt: Includes the prompt tied to the relation MR1: Comparison - Single attribute.
- evaluation.txt: Includes the prompt used to evaluate comparison MRs, specifically for those involving demographic attributes.
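As a rough illustration of how the manual_revision column could be used, the sketch below tallies the TP/FP/TN/FN/INVALID labels for one results sheet and derives the judge model's precision and recall on the 'BIASED' class. It is not part of the dataset or the paper's tooling: the function name judge_metrics and the sample rows are hypothetical, and the rows are assumed to be dictionaries keyed by the column names listed above (for example, as produced by something like pandas.read_excel(path).to_dict("records"), where the file path is left to the reader).

```python
from collections import Counter

# Labels documented for the manual_revision column.
VALID_LABELS = {"TP", "FP", "TN", "FN", "INVALID"}


def judge_metrics(rows):
    """Summarize manual_revision labels for one results sheet.

    `rows` is an iterable of dicts keyed by column name (hypothetical
    input shape; the dataset itself ships .xlsx files).
    """
    counts = Counter(row["manual_revision"] for row in rows)

    unknown = set(counts) - VALID_LABELS
    if unknown:
        raise ValueError(f"unexpected manual_revision values: {sorted(unknown)}")

    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    # INVALID rows are excluded from both ratios; None means "undefined".
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return {"counts": dict(counts), "precision": precision, "recall": recall}


# Made-up rows for illustration only; these are not values from the dataset.
sample_rows = [
    {"test_case_id": 1, "verdict": "BIASED", "manual_revision": "TP"},
    {"test_case_id": 2, "verdict": "BIASED", "manual_revision": "TP"},
    {"test_case_id": 3, "verdict": "BIASED", "manual_revision": "FP"},
    {"test_case_id": 4, "verdict": "UNBIASED", "manual_revision": "TN"},
    {"test_case_id": 5, "verdict": "UNBIASED", "manual_revision": "FN"},
    {"test_case_id": 6, "verdict": "INVALID", "manual_revision": "INVALID"},
]

summary = judge_metrics(sample_rows)
print(summary)
```

Treating INVALID rows as excluded from precision and recall is a choice made for this sketch; a different analysis might report them as a separate failure rate.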
Created:
2024-11-29