
recogna-nlp/bulas_qa

Hugging Face | Updated 2026-03-16 | Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/recogna-nlp/bulas_qa
Resource description:
---
license: apache-2.0
task_categories:
- question-answering
language:
- pt
tags:
- medical
- llm
- portuguese
- benchmark
pretty_name: qa_bulas
size_categories:
- n<1K
---

# Medication-Specific QA Benchmark

## Dataset Details

To construct a controlled evaluation benchmark, we selected 25 widely prescribed medications in Brazil across four therapeutic categories: antibiotics, analgesics and anti-inflammatory agents, antihypertensives, and antidiabetics. These categories were chosen to ensure clinical diversity across infectious, inflammatory, cardiovascular, and metabolic conditions.

For each selected medication, we verified the presence of its corresponding leaflet in the cleaned corpus. We then identified 10 standardized sections consistently present across documents, including indications, dosage, contraindications, warnings, adverse reactions, and drug interactions.

Question generation was performed using a large language model conditioned on the medication name and structured leaflet content. To ensure balanced coverage and section-level control, we generated:

- 4 questions per medication per section;
- 40 questions per section across medications;
- 400 total open-ended questions.

This benchmark, referred to as the Medication-Specific QA Benchmark, is explicitly designed to measure factual recall, section-level grounding, and evidence attribution when answers are directly localized within regulatory documents.

## Citation

This work was accepted at **The First Workshop on Language Technologies for Health (Lang4Health)**, a workshop dedicated to the development and application of Natural Language Processing (NLP) technologies in the healthcare field.
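The question counts listed under Dataset Details can be sanity-checked with a short script. The following is a minimal sketch, not the authors' tooling: the record keys `medication` and `section` are hypothetical (the card does not document column names), and the demo records are placeholders rather than real data. Note that 40 questions per section at 4 questions per medication implies 10 medications contributing to each section; the card does not state how this relates to the 25 selected medications.

```python
from collections import Counter

# Counts stated in the dataset card.
N_SECTIONS = 10
QUESTIONS_PER_MED_PER_SECTION = 4
QUESTIONS_PER_SECTION = 40
TOTAL_QUESTIONS = 400


def check_balance(records):
    """Return True if records match the per-section and total counts above.

    Each record is a dict with the hypothetical keys 'medication' and
    'section'; the real column names in the dataset may differ.
    """
    per_section = Counter(r["section"] for r in records)
    per_pair = Counter((r["medication"], r["section"]) for r in records)
    return (
        len(records) == TOTAL_QUESTIONS
        and len(per_section) == N_SECTIONS
        and all(n == QUESTIONS_PER_SECTION for n in per_section.values())
        and all(n == QUESTIONS_PER_MED_PER_SECTION for n in per_pair.values())
    )


# Medications per section implied by the stated counts.
meds_per_section = QUESTIONS_PER_SECTION // QUESTIONS_PER_MED_PER_SECTION

# Placeholder records (not real data) shaped to match the stated counts.
demo = [
    {"medication": f"med_{m}", "section": f"section_{s}"}
    for s in range(N_SECTIONS)
    for m in range(meds_per_section)
    for _ in range(QUESTIONS_PER_MED_PER_SECTION)
]
```

To apply the check to the actual benchmark, one would replace `demo` with the loaded rows and adjust the two keys to the dataset's real column names.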
Provider:
recogna-nlp