justpyschitry/autotrain-data-Wikipeida_Article_Classifier_by_Chap
AutoTrain Dataset for project: Wikipeida_Article_Classifier_by_Chap
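Dataset Description
This dataset has been automatically processed by AutoTrain for the project Wikipeida_Article_Classifier_by_Chap.
Languages
The BCP-47 code for the dataset's language is en.
Dataset Structure
Data Instances
A sample from this dataset looks as follows: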

    [
      {
        "text": "diffuse actinic keratinocyte dysplasia",
        "target": 15
      },
      {
        "text": "cholesterol atheroembolism",
        "target": 8
      }
    ]

Dataset Fields
The dataset has the following fields (also called "features"):

    {
      "text": "Value(dtype=string, id=None)",
      "target": "ClassLabel(num_classes=20, names=[Certain infectious or parasitic diseases, Developmental anaomalies, Diseases of the blood or blood forming organs, Diseases of the genitourinary system, Mental behavioural or neurodevelopmental disorders, Neoplasms, certain conditions originating in the perinatal period, conditions related to sexual health, diseases of the circulatroy system, diseases of the digestive system, diseases of the ear or mastoid process, diseases of the immune system, diseases of the musculoskeletal system or connective tissue, diseases of the nervous system, diseases of the respiratory system, diseases of the skin, diseases of the visual system, endocrine nutritional or metabolic diseases, pregnanacy childbirth or the puerperium, sleep-wake disorders], id=None)"
    }

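The integer `target` values are indices into the `ClassLabel` names list. The following is a minimal sketch of decoding them in plain Python. It assumes the 20 names are the comma-separated entries exactly as listed above (spelling preserved, since these are the dataset's own label strings); the helper names `id_to_label` and `label_to_id` are hypothetical and not part of any library.

```python
# Label names copied verbatim from the ClassLabel definition above.
# Assumption: each comma-separated entry is one class, in index order.
LABEL_NAMES = [
    "Certain infectious or parasitic diseases",
    "Developmental anaomalies",
    "Diseases of the blood or blood forming organs",
    "Diseases of the genitourinary system",
    "Mental behavioural or neurodevelopmental disorders",
    "Neoplasms",
    "certain conditions originating in the perinatal period",
    "conditions related to sexual health",
    "diseases of the circulatroy system",
    "diseases of the digestive system",
    "diseases of the ear or mastoid process",
    "diseases of the immune system",
    "diseases of the musculoskeletal system or connective tissue",
    "diseases of the nervous system",
    "diseases of the respiratory system",
    "diseases of the skin",
    "diseases of the visual system",
    "endocrine nutritional or metabolic diseases",
    "pregnanacy childbirth or the puerperium",
    "sleep-wake disorders",
]


def id_to_label(target: int) -> str:
    """Map an integer class id to its label name (hypothetical helper)."""
    if not 0 <= target < len(LABEL_NAMES):
        raise ValueError(f"target {target} is outside 0..{len(LABEL_NAMES) - 1}")
    return LABEL_NAMES[target]


def label_to_id(name: str) -> int:
    """Map a label name back to its integer class id (hypothetical helper)."""
    return LABEL_NAMES.index(name)


# The two sample records shown in the Data Instances section.
samples = [
    {"text": "diffuse actinic keratinocyte dysplasia", "target": 15},
    {"text": "cholesterol atheroembolism", "target": 8},
]

decoded = [(s["text"], id_to_label(s["target"])) for s in samples]
for text, label in decoded:
    print(f"{text} -> {label}")
```

Under that assumption, the two sample records decode to "diseases of the skin" and "diseases of the circulatroy system" respectively, which is consistent with their texts.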
Dataset Splits
This dataset is split into a train and a validation split. The split sizes are as follows:
| Split name | Num samples |
|---|---|
| train | 9828 |
| valid | 2468 |
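
As a quick check on the split sizes above, the sketch below computes the total sample count and each split's share. The dictionary `SPLIT_SIZES` is a hypothetical name that simply restates the table; nothing here loads the dataset itself.

```python
# Split sizes restated from the table above.
SPLIT_SIZES = {"train": 9828, "valid": 2468}

# Total number of samples across both splits.
total = sum(SPLIT_SIZES.values())

# Fraction of the whole dataset that falls in each split.
fractions = {name: count / total for name, count in SPLIT_SIZES.items()}

print(f"total samples: {total}")
for name, frac in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} samples ({frac:.1%})")
```

The counts sum to 12296 samples, so the data is divided in roughly an 80/20 ratio between training and validation.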



