recogna-nlp/bulas_qa

Name: recogna-nlp/bulas_qa
Creator: recogna-nlp
Published: 2026-03-16 19:17:26
License: apache-2.0

Source: Hugging Face. Updated 2026-03-16. Indexed 2026-04-05.

Download link:

https://hf-mirror.com/datasets/recogna-nlp/bulas_qa

Description:

---
license: apache-2.0
task_categories:
- question-answering
language:
- pt
tags:
- medical
- llm
- portuguese
- benchmark
pretty_name: qa_bulas
size_categories:
- n<1K
---

# Medication-Specific QA Benchmark

## Dataset Details

To construct a controlled evaluation benchmark, we selected 25 widely prescribed medications in Brazil across four therapeutic categories: antibiotics, analgesics and anti-inflammatory agents, antihypertensives, and antidiabetics. These categories were chosen to ensure clinical diversity across infectious, inflammatory, cardiovascular, and metabolic conditions.

For each selected medication, we verified the presence of its corresponding leaflet in the cleaned corpus. We then identified 10 standardized sections consistently present across documents, including indications, dosage, contraindications, warnings, adverse reactions, and drug interactions.

Question generation was performed using a large language model conditioned on the medication name and structured leaflet content. To ensure balanced coverage and section-level control, we generated:

- 4 questions per medication per section;
- 40 questions per section across medications;
- 400 total open-ended questions.

This benchmark, referred to as the Medication-Specific QA Benchmark, is explicitly designed to measure factual recall, section-level grounding, and evidence attribution when answers are directly localized within regulatory documents.

## Citation

This work was accepted at **The First Workshop on Language Technologies for Health (Lang4Health)**, a workshop dedicated to the development and application of Natural Language Processing (NLP) technologies in the healthcare field.
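
The balanced-coverage scheme described under Dataset Details can be checked with a small tally over the rows. The following is a minimal sketch, not the authors' tooling: the field names `medication` and `section` are hypothetical (the card does not list the dataset's columns), and the toy rows are placeholders rather than real benchmark content.

```python
from collections import Counter


def coverage_report(records, section_key="section", medication_key="medication"):
    """Count questions per section and per (medication, section) pair.

    The key names are assumptions; adjust them to the real column names.
    """
    per_section = Counter(r[section_key] for r in records)
    per_pair = Counter((r[medication_key], r[section_key]) for r in records)
    return per_section, per_pair


def is_balanced(counter):
    """Return True when every bucket holds the same number of questions."""
    return len(set(counter.values())) <= 1


# Toy rows standing in for the real data: 2 medications x 2 sections x 4 questions.
toy = [
    {"medication": med, "section": sec, "question": f"q{i}"}
    for med in ("med_a", "med_b")
    for sec in ("indications", "dosage")
    for i in range(4)
]

per_section, per_pair = coverage_report(toy)
print(dict(per_section))
print(is_balanced(per_section), is_balanced(per_pair))
```

On the real benchmark, the rows would likely come from the Hugging Face `datasets` library, and the actual column names should be confirmed in the dataset viewer before running the tally.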

Provided by:

recogna-nlp