MT V0.1

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/14241558

Resource description:

This is the supplementary material for the paper "AI-Driven Fairness Testing of Large Language Models: A Preliminary Study". The material is organized into two main folders:

evaluation_data/
This folder contains the results of the fairness evaluations performed on three language models: Gemma, Llama3, and Mistral. Each subfolder corresponds to one model and includes detailed .xlsx files documenting evaluation results across the 9 metamorphic relations (MRs) evaluated. Each .xlsx file contains the following columns:

- test_case_id: ID of the test case.
- role: Role, if applicable, involved in the prompts associated with the test case.
- bias_type: Type of bias being studied with the test case.
- prompt_1: Source test case executed on the model under test.
- response_1: Response of the model to the source test case.
- prompt_2: Follow-up test case executed on the model under test.
- response_2: Response of the model to the follow-up test case.
- verdict: Classification made by the judge model, which can take the following values:
    'BIASED': Bias is detected.
    'UNBIASED': No bias is detected.
    'INVALID': The model under test failed to respond to either of the test cases.
- severity: Significance/impact of the detected bias: 'LOW', 'MODERATE', or 'HIGH' if the test case is biased, 'N/A' if it is not.
- generation_explanation: Explanation provided by the generator model, detailing how the base prompts were constructed.
- evaluation_explanation: Explanation provided by the judge model, detailing the rationale behind the evaluation and justifying the verdict assigned to the test case.
- manual_revision: Completed based on the consensus of two authors to validate the verdict. It can take one of the following values:
    'TP': The test case was classified as biased, and it is indeed biased.
    'FP': The test case was classified as biased, but it is not biased.
    'TN': The test case was classified as unbiased, and it is indeed unbiased.
    'FN': The test case was classified as unbiased, but it is actually biased.
    'INVALID': The model under test failed to respond to at least one of the prompts.

prompts/
This folder provides example prompts used during generation and evaluation:

- generation.txt: The generation prompt tied to the relation MR1: Comparison - Single attribute.
- evaluation.txt: The prompt used to evaluate comparison MRs, specifically those involving demographic attributes.
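The manual_revision labels described above lend themselves to a simple summary of how often the judge model's verdict agreed with the authors. The following is a minimal sketch, not part of the dataset: the function name summarize_manual_revision and the sample labels are hypothetical, and it assumes the manual_revision column of one .xlsx file has already been read into a Python list (for example with a spreadsheet library of your choice).

```python
from collections import Counter


def summarize_manual_revision(labels):
    """Count manual_revision labels and derive precision/recall of the judge model.

    `labels` is assumed to be an iterable of strings drawn from
    'TP', 'FP', 'TN', 'FN', 'INVALID' (the values listed in the description).
    """
    counts = Counter(labels)
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    # Precision: share of 'BIASED' verdicts confirmed by the authors.
    precision = tp / (tp + fp) if (tp + fp) else None
    # Recall: share of truly biased cases that the judge flagged.
    recall = tp / (tp + fn) if (tp + fn) else None
    return {"counts": dict(counts), "precision": precision, "recall": recall}


# Hypothetical labels for illustration only; not taken from the dataset.
sample = ["TP", "TP", "FP", "TN", "FN", "INVALID", "TP"]
summary = summarize_manual_revision(sample)
print(summary)
```

'INVALID' rows are counted but excluded from both ratios, since the model under test produced no usable response for them; whether that matches the paper's own metric definitions is an assumption.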

Created:

2024-11-29
