MedBanglaTrust3: A Bengali Dataset for Explainable and Trustworthy AI Health Suggestions
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://data.mendeley.com/datasets/r3vkg55nbc
Description:
MedBanglaTrust3 is a curated, expert-validated dataset developed to facilitate machine learning, deep learning, and natural language processing (NLP) tasks focused on evaluating the trustworthiness of AI-generated health suggestions in the Bengali (Bangla) language. The dataset is tailored for low-resource language modeling and is particularly relevant to cyberchondria, where users rely excessively on online health advice without clinical verification. It contains symptom-specific prompts along with responses from OpenAI's ChatGPT and Google's Gemini search-generated summaries, manually labeled into three trustworthiness levels: Highly Relevant, Partially Relevant, and Not Relevant. The dataset supports the development of explainable AI (XAI) systems, text classification models, and context-aware AI assistants for healthcare use in underrepresented languages.
Objective:
To enable research and development of trust classification models, automated health dialogue systems, and responsible AI assistants by providing labeled, real-world AI-generated responses in Bangla that reflect varying degrees of medical relevance and accuracy.
Dataset Composition:
- Language: Bengali (Bangla)
- Final Validated Instances: 6,660
Class Distribution (Balanced):
1. Highly Relevant – 2,220 responses
2. Partially Relevant – 2,220 responses
3. Not Relevant – 2,220 responses
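Because the three classes are exactly balanced (3 × 2,220 = 6,660), a stratified split keeps that balance in both the training and test portions. The following is a minimal sketch only. The listing does not describe the file layout, so the (text, label) row shape, the integer label ids, the `stratified_split` helper, and the placeholder rows are all assumptions made for illustration. Only the three label names and the counts come from the description above.

```python
import random
from collections import Counter

# Label names are taken from the dataset description; the integer ids are an assumption.
LABELS = ["Highly Relevant", "Partially Relevant", "Not Relevant"]
LABEL_TO_ID = {name: i for i, name in enumerate(LABELS)}


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (text, label) rows so every class keeps the same train/test ratio.

    Hypothetical helper: raises KeyError if a row carries an unknown label.
    """
    rng = random.Random(seed)
    by_label = {name: [] for name in LABELS}
    for row in rows:
        by_label[row[1]].append(row)

    train, test = [], []
    for name in LABELS:
        group = by_label[name]
        rng.shuffle(group)
        cut = int(round(len(group) * test_fraction))
        test.extend(group[:cut])
        train.extend(group[cut:])
    return train, test


# Placeholder rows standing in for the real file (assumed shape: text, label).
rows = [(f"response {i}", LABELS[i % 3]) for i in range(6660)]

counts = Counter(label for _, label in rows)
train, test = stratified_split(rows)
test_counts = Counter(label for _, label in test)
```

With the real data, `rows` would be replaced by whatever the downloaded file yields once its actual column names are known. Checking `counts` first is a cheap way to confirm that the loaded file matches the stated 2,220 responses per class.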
Created:
2025-07-25



