BanHealthAIRespRelv3: A Benchmark Dataset for Evaluating Bengali AI-Generated Health Responses Based on User-Specific Prompts

Name: BanHealthAIRespRelv3: A Benchmark Dataset for Evaluating Bengali AI-Generated Health Responses Based on User-Specific Prompts
Creator: Mendeley Data
Published: 2026-04-23 20:44:12
License: No description available

DataCite Commons. Updated: 2026-04-23. Indexed: 2026-05-04.

Download link:

https://data.mendeley.com/datasets/735b74kwbx/2

Resource description:

BanHealthAIRespRelv3 serves as a benchmark dataset for evaluating the relevance and safety of AI-generated health responses in Bengali, a low-resource language. Its primary goal is to address risks of medical hallucinations, irrelevant advice, and misinformation in online health information seeking (OHIS), particularly in resource-constrained settings like Bangladesh. To ensure high quality, the dataset underwent rigorous preprocessing (text cleaning, duplicate removal, filtering incomplete responses) and annotation by three native Bengali speakers, with reliability confirmed via Fleiss' Kappa (substantial to almost perfect agreement) and retention of only high-confidence instances (>0.8).

BanHealthAIRespRelv3 is a ternary-class dataset comprising 15,384 instances of user-simulated health prompts and corresponding AI responses (from ChatGPT and Google Gemini), categorized as:
Highly Relevant: 5,004 instances
Partially Relevant: 5,300 instances
Not Relevant: 5,084 instances

The dataset has broad applications in multiple NLP and AI areas, including:
Relevance and hallucination detection in health AI
Development of safer, culturally sensitive health chatbots
Trust calibration and ethical AI in digital health
Misinformation mitigation in low-resource languages
Educational and research advancements in Bengali medical NLP

BanHealthAIRespRelv3 is openly available for academic and research purposes, promoting collaboration and innovation in Bengali NLP. By establishing a robust benchmark, it aims to foster trustworthy, context-aware AI systems for health-related applications in under-resourced languages.
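The description reports annotator reliability through Fleiss' Kappa but does not show how the statistic is computed, nor how the ">0.8" confidence score is derived. As an illustration only, the sketch below implements the standard Fleiss' kappa formula for a fixed number of raters per item. The function name, the label strings, and the toy ratings are assumptions made for this example and are not taken from the dataset or its authors' pipeline. Under the commonly used Landis and Koch scale, values of 0.61 to 0.80 are read as "substantial" and 0.81 to 1.00 as "almost perfect" agreement.

```python
from collections import Counter

# Hypothetical label names for the three relevance classes (assumed for illustration).
CATEGORIES = ["highly_relevant", "partially_relevant", "not_relevant"]


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Standard Fleiss' kappa.

    ratings: list of items, each a list of labels (one per rater).
             Every item must have the same number of raters.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][j] = number of raters who assigned category j to item i
    counts = []
    for item in ratings:
        tally = Counter(item)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over items of the per-item agreement P_i
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement by chance, from the overall category proportions
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Toy example: four items, three raters each (invented data).
    toy = [
        ["highly_relevant"] * 3,
        ["partially_relevant", "partially_relevant", "not_relevant"],
        ["not_relevant"] * 3,
        ["highly_relevant", "partially_relevant", "partially_relevant"],
    ]
    print(round(fleiss_kappa(toy), 3))
```

With real annotations, each inner list would hold the three annotators' labels for one prompt-response pair. How the authors turn agreement into a per-instance confidence value is not stated in the description, so that step is deliberately left out.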

Provider:

Mendeley Data

Created:

2026-04-23
