
NLP-FBK/multilingual-medical-reasoning-traces

Source: Hugging Face. Updated 2025-11-27; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/NLP-FBK/multilingual-medical-reasoning-traces
Resource description:
---
dataset_info:
- config_name: en
  features:
  - name: id
    dtype: string
  - name: options
    struct:
    - name: '1'
      dtype: string
    - name: '2'
      dtype: string
    - name: '3'
      dtype: string
    - name: '4'
      dtype: string
  - name: correct_option
    dtype: int64
  - name: full_question
    dtype: string
  - name: similar_chunks_dense
    list:
    - name: chunk_id
      dtype: int64
    - name: similarity_score
      dtype: float64
    - name: text
      dtype: string
  - name: formatted_similar_chunks_dense
    dtype: string
  - name: reasoning
    dtype: string
  - name: list_of_options
    sequence: string
  - name: reasoning_parsed_answer
    dtype: int64
  splits:
  - name: medmcqa
    num_bytes: 6010411438
    num_examples: 169098
  - name: medqa
    num_bytes: 341749007
    num_examples: 9520
  download_size: 2961550979
  dataset_size: 6352160445
- config_name: es
  features:
  - name: id
    dtype: string
  - name: options
    struct:
    - name: '1'
      dtype: string
    - name: '2'
      dtype: string
    - name: '3'
      dtype: string
    - name: '4'
      dtype: string
  - name: correct_option
    dtype: int64
  - name: full_question
    dtype: string
  - name: similar_chunks_dense
    list:
    - name: chunk_id
      dtype: int64
    - name: similarity_score
      dtype: float64
    - name: text
      dtype: string
  - name: formatted_similar_chunks_dense
    dtype: string
  - name: reasoning
    dtype: string
  - name: list_of_options
    sequence: string
  - name: reasoning_parsed_answer
    dtype: int64
  splits:
  - name: medmcqa
    num_bytes: 4841465675
    num_examples: 168771
  - name: medqa
    num_bytes: 298177516
    num_examples: 9584
  download_size: 2671932858
  dataset_size: 5139643191
- config_name: it
  features:
  - name: id
    dtype: string
  - name: options
    struct:
    - name: '1'
      dtype: string
    - name: '2'
      dtype: string
    - name: '3'
      dtype: string
    - name: '4'
      dtype: string
  - name: correct_option
    dtype: int64
  - name: full_question
    dtype: string
  - name: similar_chunks_dense
    list:
    - name: chunk_id
      dtype: int64
    - name: similarity_score
      dtype: float64
    - name: text
      dtype: string
  - name: formatted_similar_chunks_dense
    dtype: string
  - name: reasoning
    dtype: string
  - name: list_of_options
    sequence: string
  - name: reasoning_parsed_answer
    dtype: int64
  splits:
  - name: medmcqa
    num_bytes: 4736599523
    num_examples: 166257
  - name: medqa
    num_bytes: 289611823
    num_examples: 9468
  download_size: 2687166328
  dataset_size: 5026211346
configs:
- config_name: en
  data_files:
  - split: medmcqa
    path: en/medmcqa-*
  - split: medqa
    path: en/medqa-*
- config_name: es
  data_files:
  - split: medmcqa
    path: es/medmcqa-*
  - split: medqa
    path: es/medqa-*
- config_name: it
  data_files:
  - split: medmcqa
    path: it/medmcqa-*
  - split: medqa
    path: it/medqa-*
---

This dataset contains the reasoning traces generated to answer multiple-choice medical questions in Italian, English, and Spanish.

The dataset is structured in three parts, one per language. Each part is composed of two splits: one containing the examples generated from `medqa`, the other those generated from `medmcqa`.

The columns are:

- `id`, a unique identifier
- `full_question`, the medical question
- `options`, a dictionary of the answer options and their identifiers
- `list_of_options`, a list of the `options` without the identifiers
- `correct_option`, the identifier of the option that correctly answers the `full_question`
- `similar_chunks_dense`, a list of chunks retrieved from the Wikipedia knowledge base that are relevant to answering the `full_question`
- `formatted_similar_chunks_dense`, a refined version of `similar_chunks_dense` that is actually used to help models answer the question
- `reasoning`, the answer to the `full_question`, generated by prompting Qwen3-32B with `formatted_similar_chunks_dense` as context
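The column layout above can be explored with a short Python sketch. This is a minimal illustration, not an official loader: `load_split`, `options_as_list`, and `correct_option_text` are hypothetical helper names, the `example` record is invented to mirror the schema and is not taken from the dataset, and the code assumes that the integer `correct_option` corresponds to the string keys `'1'` to `'4'` of `options`. Loading relies on `load_dataset` from the Hugging Face `datasets` library, which needs network access.

```python
from typing import Any, Dict, List

REPO_ID = "NLP-FBK/multilingual-medical-reasoning-traces"


def load_split(language: str, split: str):
    """Stream one split ("medqa" or "medmcqa") of one config ("en", "es", "it")."""
    # Imported lazily so the helpers below work without the `datasets` package.
    from datasets import load_dataset

    # streaming=True avoids downloading several gigabytes up front.
    return load_dataset(REPO_ID, language, split=split, streaming=True)


def options_as_list(record: Dict[str, Any]) -> List[str]:
    """Return the option texts ordered by their numeric identifier."""
    options = record["options"]
    return [options[key] for key in sorted(options, key=int)]


def correct_option_text(record: Dict[str, Any]) -> str:
    """Look up the text of the correct option.

    Assumption: the integer `correct_option` matches the string keys of `options`.
    """
    return record["options"][str(record["correct_option"])]


# Hypothetical record that mirrors the schema; not real dataset content.
example = {
    "id": "example-0",
    "full_question": "Which vitamin deficiency causes scurvy?",
    "options": {
        "1": "Vitamin A",
        "2": "Vitamin B12",
        "3": "Vitamin C",
        "4": "Vitamin D",
    },
    "list_of_options": ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"],
    "correct_option": 3,
}

print(options_as_list(example))
print(correct_option_text(example))
```

With network access, `next(iter(load_split("en", "medqa")))` should yield one record that can be passed to the same helpers, provided the assumptions above hold for the real data.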
Provider:
NLP-FBK