
MedBanglaTrust3: A Bengali Dataset for Explainable and Trustworthy AI Health Suggestions

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://data.mendeley.com/datasets/r3vkg55nbc
Resource description:
MedBanglaTrust3 is a curated, expert-validated dataset developed to facilitate machine learning, deep learning, and natural language processing (NLP) tasks focused on evaluating the trustworthiness of AI-generated health suggestions in the Bengali (Bangla) language. This dataset is specifically tailored for low-resource language modeling and is particularly relevant in the context of cyberchondria, where users excessively rely on online health advice without clinical verification. It contains symptom-specific prompts along with responses from OpenAI's ChatGPT and Google's GEMINI search-generated summaries, manually labeled into three trustworthiness levels: Highly Relevant, Partially Relevant, and Not Relevant. The dataset supports the development of explainable AI (XAI) systems, text classification models, and context-aware AI assistants for healthcare use in underrepresented languages.

Objective: To enable research and development of trust classification models, automated health dialogue systems, and responsible AI assistants by providing labeled, real-world AI-generated responses in Bangla that reflect varying degrees of medical relevance and accuracy.

Dataset Composition:
- Language: Bengali (Bangla)
- Final Validated Instances: 6,660

Class Distribution (Balanced):
1. Highly Relevant – 2,220 responses
2. Partially Relevant – 2,220 responses
3. Not Relevant – 2,220 responses
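
As a quick sanity check on the composition figures stated in the description, the following minimal Python sketch confirms that the three class counts sum to the reported total and derives the accuracy a trivial majority-class classifier would reach. Only the label names and counts come from the description; the names `CLASS_COUNTS` and `summarize` are illustrative assumptions, and nothing here reflects the actual file layout of the dataset.

```python
# Class counts as stated in the dataset description (not read from the data files).
CLASS_COUNTS = {
    "Highly Relevant": 2220,
    "Partially Relevant": 2220,
    "Not Relevant": 2220,
}


def summarize(counts):
    """Return the total instance count, per-class shares, and a balance flag."""
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    # The classes are balanced when every class has the same count.
    balanced = len(set(counts.values())) == 1
    return total, shares, balanced


total, shares, balanced = summarize(CLASS_COUNTS)

# Always predicting the largest class scores exactly that class's share,
# which is a useful floor when judging a trust classification model.
majority_baseline = max(shares.values())

print(f"total instances: {total}")
print(f"balanced: {balanced}")
print(f"majority-class baseline accuracy: {majority_baseline:.4f}")
```

Because the classes are balanced, a trust classifier trained on this data would need to exceed roughly one-third accuracy to do better than always predicting a single class.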
Created:
2025-07-25