NLP-FBK/adapt-sllm-italian-medical-tasks-CP-data
Source: Hugging Face. Updated: 2026-04-15. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/NLP-FBK/adapt-sllm-italian-medical-tasks-CP-data
Resource description:
---
dataset_info:
- config_name: clinical
  features:
  - name: chunk
    dtype: string
  - name: source_file
    dtype: string
  - name: chunk_type
    dtype: string
  - name: language
    dtype: string
  - name: id
    dtype: int64
  - name: chunk_is_cleaned_status
    dtype: string
  - name: chunk_is_cleaned_reason
    dtype: string
  - name: n_words_chunk
    dtype: int64
  splits:
  - name: emergency_dept
    num_bytes: 1286930515
    num_examples: 1972254
  download_size: 350967503
  dataset_size: 1286930515
- config_name: scientific
  features:
  - name: chunk
    dtype: string
  - name: source_file
    dtype: string
  - name: chunk_type
    dtype: string
  - name: language
    dtype: string
  - name: id
    dtype: int64
  - name: chunk_is_cleaned_status
    dtype: string
  - name: chunk_is_cleaned_reason
    dtype: string
  - name: n_words_chunk
    dtype: int64
  splits:
  - name: commoncrawl_med
    num_bytes: 471485471
    num_examples: 132974
  - name: drug_instructions
    num_bytes: 274724988
    num_examples: 23452
  - name: wikipedia
    num_bytes: 93115589
    num_examples: 26273
  - name: e3c
    num_bytes: 86989387
    num_examples: 10473
  - name: web_hose_az
    num_bytes: 48467949
    num_examples: 24834
  - name: thesis
    num_bytes: 49823818
    num_examples: 104477
  - name: pubmed
    num_bytes: 17550382
    num_examples: 7050
  - name: supplement_description
    num_bytes: 10117903
    num_examples: 5294
  - name: medical_websites
    num_bytes: 28501144
    num_examples: 10169
  - name: others
    num_bytes: 7508179
    num_examples: 1198
  - name: unipd_thesis
    num_bytes: 235713850
    num_examples: 535689
  download_size: 606485469
  dataset_size: 1323998660
configs:
- config_name: clinical
  data_files:
  - split: emergency_dept
    path: clinical/emergency_dept-*
- config_name: scientific
  data_files:
  - split: commoncrawl_med
    path: scientific/commoncrawl_med-*
  - split: drug_instructions
    path: scientific/drug_instructions-*
  - split: wikipedia
    path: scientific/wikipedia-*
  - split: e3c
    path: scientific/e3c-*
  - split: web_hose_az
    path: scientific/web_hose_az-*
  - split: thesis
    path: scientific/thesis-*
  - split: pubmed
    path: scientific/pubmed-*
  - split: supplement_description
    path: scientific/supplement_description-*
  - split: medical_websites
    path: scientific/medical_websites-*
  - split: others
    path: scientific/others-*
  - split: unipd_thesis
    path: scientific/unipd_thesis-*
---
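The configuration and split names in the metadata above map directly onto arguments of the Hugging Face `datasets` loader. The following is a minimal sketch, not an official usage recipe: the helper names `data_file_pattern` and `load_split` are hypothetical, and the sketch assumes the `datasets` package is installed and the repository is reachable (access conditions are not stated in the card).

```python
# Sketch only: helper names are illustrative, not part of the dataset card.
REPO_ID = "NLP-FBK/adapt-sllm-italian-medical-tasks-CP-data"

# Config -> split names, copied from the `configs` section of the metadata.
SPLITS = {
    "clinical": ["emergency_dept"],
    "scientific": [
        "commoncrawl_med", "drug_instructions", "wikipedia", "e3c",
        "web_hose_az", "thesis", "pubmed", "supplement_description",
        "medical_websites", "others", "unipd_thesis",
    ],
}


def data_file_pattern(config: str, split: str) -> str:
    """Return the file glob the card declares for a config/split pair."""
    if split not in SPLITS.get(config, []):
        raise ValueError(f"unknown config/split: {config}/{split}")
    return f"{config}/{split}-*"


def load_split(config: str, split: str, streaming: bool = True):
    """Load one split; streaming avoids downloading the whole config up front."""
    # Imported lazily so the rest of this sketch runs without `datasets`.
    from datasets import load_dataset

    data_file_pattern(config, split)  # validates the names before any network call
    return load_dataset(REPO_ID, config, split=split, streaming=streaming)


if __name__ == "__main__":
    print(data_file_pattern("scientific", "pubmed"))
```

With the package installed, `load_split("scientific", "pubmed")` would return an iterable whose records carry the fields listed under `features` (`chunk`, `source_file`, `n_words_chunk`, and so on). Streaming is used here because the `clinical` config alone is declared at over 1.2 GB uncompressed.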
Use of this dataset is subject to a CC BY licence.
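The per-split figures in the metadata can be cross-checked against the declared `dataset_size` of the `scientific` config. The sketch below is only an illustration: the numbers are copied from the card, while the function names and the bytes-per-example statistic are additions made here for convenience.

```python
# (num_bytes, num_examples) per split, copied from the `scientific` config.
SCIENTIFIC_SPLITS = {
    "commoncrawl_med": (471485471, 132974),
    "drug_instructions": (274724988, 23452),
    "wikipedia": (93115589, 26273),
    "e3c": (86989387, 10473),
    "web_hose_az": (48467949, 24834),
    "thesis": (49823818, 104477),
    "pubmed": (17550382, 7050),
    "supplement_description": (10117903, 5294),
    "medical_websites": (28501144, 10169),
    "others": (7508179, 1198),
    "unipd_thesis": (235713850, 535689),
}
DECLARED_DATASET_SIZE = 1323998660  # `dataset_size` from the card


def summarize(splits):
    """Return total bytes and total examples across all splits."""
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(num_examples for _, num_examples in splits.values())
    return total_bytes, total_examples


def mean_bytes_per_example(splits):
    """Average uncompressed size of one chunk record, per split."""
    return {name: num_bytes / n for name, (num_bytes, n) in splits.items()}


total_bytes, total_examples = summarize(SCIENTIFIC_SPLITS)
means = mean_bytes_per_example(SCIENTIFIC_SPLITS)

print("sum of split bytes matches dataset_size:", total_bytes == DECLARED_DATASET_SIZE)
print("total examples:", total_examples)
for name, value in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.0f} bytes/example")
```

The mean record size gives a rough sense of how chunk length varies between sources, which may matter when mixing splits for continued pretraining.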
Provider:
NLP-FBK



