Simulated Data for Patient Time Series Record Linkage
Source: NIAID Data Ecosystem. Indexed: 2026-03-13
Download link:
https://figshare.com/articles/dataset/Simulated_Data_for_Patient_Time_Series_Record_Linkage/19224786
Description:
This simulated dataset consists of two files (after decompression): sim_ergo_1600.csv and sim_pat_1600.csv, referred to below as ergo.csv and pat.csv.
1. ergo.csv contains heart rate time series data from the ergometric tests of 1600 patients. For each patient, 20 different ergometric tests were simulated. Each row in this file contains three fields: Ergo_ID, heart rate (BPM), and timestamp.
2. pat.csv contains only four sample readings from each of a patient's 20 ergometric tests. Each row contains three fields: patient_ID, heart rate, and timestamp.
The goal is to link patients (identified by their patient_ID in pat.csv) to their corresponding ergometric tests (identified by their Ergo_ID in ergo.csv). Linkage relies solely on matching the timestamp-value pairs from both files.
The time series record linkage task described above is efficiently accomplished by the proposed tslink2 algorithm. tslink2 is implemented in C++ and is publicly available at https://github.com/ahmsoliman/tslink2
The data are simulated such that correctly linked (matched) identifiers satisfy the following formula:
|Ergo_ID - patient_ID| mod 104 == 0
The above formula is useful for evaluating the performance of a linkage algorithm.
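As an illustration of how the formula might be used for evaluation, the sketch below checks predicted links against it and reports the fraction that are correct. The names `is_correct_link` and `link_precision` are hypothetical, the modulus is taken as written in the formula above, and the input is assumed to be a list of (Ergo_ID, patient_ID) integer pairs produced by some linkage method.

```python
def is_correct_link(ergo_id, patient_id, modulus=104):
    """True if the pair satisfies |Ergo_ID - patient_ID| mod modulus == 0.

    The default modulus is copied from the formula as stated in the
    description; pass a different value if the formula differs.
    """
    return abs(ergo_id - patient_id) % modulus == 0


def link_precision(predicted_links, modulus=104):
    """Fraction of predicted (ergo_id, patient_id) pairs that are correct.

    Returns 0.0 for an empty list to avoid dividing by zero.
    """
    links = list(predicted_links)
    if not links:
        return 0.0
    correct = sum(is_correct_link(e, p, modulus) for e, p in links)
    return correct / len(links)


# Toy predictions (invented for illustration): the first pair differs
# by 208, a multiple of 104; the second differs by 209, which is not.
toy_predictions = [(209, 1), (210, 1)]
toy_precision = link_precision(toy_predictions)
```

This measures only precision. Computing recall would additionally require the total number of true links, which the description implies is 20 tests per patient.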
Created:
2022-02-23



