
Cirrhosis_Patient_Survival_Prediction

OpenML, updated 2025-04-13, indexed 2025-12-20
Download link:
https://www.openml.org/search?type=data&sort=runs&status=active&id=46843
Resource description:
For what purpose was the dataset created?
Cirrhosis results from prolonged liver damage, leading to extensive scarring, often due to conditions like hepatitis or chronic alcohol consumption. The data provided is sourced from a Mayo Clinic study on primary biliary cirrhosis (PBC) of the liver carried out from 1974 to 1984.

Who funded the creation of the dataset?
Mayo Clinic

What do the instances in this dataset represent?
People

Does the dataset contain data that might be considered sensitive in any way?
Gender, Age

Was there any data preprocessing performed?
1. Drop all rows where the Drug column has a missing value (NA)
2. Impute missing values with the mean
3. One-hot encode all categorical attributes

Additional Information
During 1974 to 1984, 424 PBC patients referred to the Mayo Clinic qualified for the randomized placebo-controlled trial testing the drug D-penicillamine. Of these, the initial 312 patients took part in the trial and have mostly comprehensive data. The remaining 112 patients didn't join the clinical trial but agreed to record basic metrics and undergo survival tracking. Six of these patients were soon untraceable after their diagnosis, leaving data for 106 of these individuals in addition to the 312 who were part of the randomized trial.

1. ID: unique identifier
2. N_Days: number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986
3. Status: status of the patient: C (censored), CL (censored due to liver transplantation), or D (death)
4. Drug: type of drug: D-penicillamine or placebo
5. Age: age in [days]
6. Sex: M (male) or F (female)
7. Ascites: presence of ascites: N (No) or Y (Yes)
8. Hepatomegaly: presence of hepatomegaly: N (No) or Y (Yes)
9. Spiders: presence of spiders: N (No) or Y (Yes)
10. Edema: presence of edema: N (no edema and no diuretic therapy for edema), S (edema present without diuretics, or edema resolved by diuretics), or Y (edema despite diuretic therapy)
11. Bilirubin: serum bilirubin in [mg/dl]
12. Cholesterol: serum cholesterol in [mg/dl]
13. Albumin: albumin in [gm/dl]
14. Copper: urine copper in [ug/day]
15. Alk_Phos: alkaline phosphatase in [U/liter]
16. SGOT: SGOT in [U/ml]
17. Triglycerides: triglycerides in [mg/dl]
18. Platelets: platelets per cubic [ml/1000]
19. Prothrombin: prothrombin time in seconds [s]
20. Stage: histologic stage of disease (1, 2, 3, or 4)

Class Labels
Status: status of the patient: 0 = D (death), 1 = C (censored), 2 = CL (censored due to liver transplantation)
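The three preprocessing steps and the class-label encoding described above could be reproduced roughly as follows. This is a minimal sketch using pandas, not the dataset authors' actual pipeline: the sample rows are made up for illustration, the `preprocess` function and `STATUS_CODES` mapping are hypothetical names, and computing the mean after dropping rows is an assumption about the order of operations.

```python
import numpy as np
import pandas as pd

# Made-up sample rows (illustrative values only) using column names
# from the attribute list above.
raw = pd.DataFrame({
    "N_Days": [400, 4500, 1012, 1925],
    "Status": ["D", "C", "CL", "D"],
    "Drug": ["D-penicillamine", "Placebo", None, "Placebo"],
    "Age": [21464, 20617, 25594, 19994],  # age in days
    "Sex": ["F", "F", "M", "M"],
    "Bilirubin": [14.5, np.nan, 1.4, 1.8],
})

# Class-label encoding as stated under "Class Labels".
STATUS_CODES = {"D": 0, "C": 1, "CL": 2}

# Categorical attributes named in the attribute list.
CATEGORICAL = ("Drug", "Sex", "Ascites", "Hepatomegaly", "Spiders", "Edema")


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the three documented steps."""
    # Step 1: drop rows where Drug is missing.
    out = df.dropna(subset=["Drug"]).copy()

    # Step 2: fill missing numeric values with the column mean
    # (assumption: the mean is computed after step 1).
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())

    # Encode the target as 0 = D, 1 = C, 2 = CL.
    out["Status"] = out["Status"].map(STATUS_CODES)

    # Step 3: one-hot encode the categorical attributes that are present.
    present = [c for c in CATEGORICAL if c in out.columns]
    return pd.get_dummies(out, columns=present)


clean = preprocess(raw)
print(clean)
```

Because Age is recorded in days, dividing it by 365.25 gives an approximate age in years if that is more convenient for modeling.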
Created:
2025-04-13