Network traffic from each client to the server.

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Network_traffic_from_each_client_to_the_server_/26366604
Description:
Increasing the size of genetic and phenotypic datasets is critical for understanding the genetic determinants of disease. Establishing practical means for collaboration and data sharing among institutions remains a fundamental methodological barrier to performing high-powered studies. As samples become more heterogeneous, complex statistical approaches, such as generalized linear mixed effects models, must be used to correct for confounders that may bias results. At the same time, because of privacy concerns around Protected Health Information (PHI), the sharing of genetic information is tightly restricted by regulations such as the Health Insurance Portability and Accountability Act (HIPAA). This limits data sharing among institutions and hampers efforts to carry out high-powered collaborative studies. Federated approaches are a promising way to alleviate the privacy and performance issues, since sensitive data never leaves the local sites. Motivated by these considerations, we developed FedGMMAT, a federated genetic association testing tool that uses a federated statistical testing approach to run efficient association tests that correct for confounding fixed effects and additive polygenic random effects across collaborating sites. Genetic data is never shared among collaborating sites, and the intermediate statistics are protected by encryption. Using simulated and real datasets, we demonstrate that FedGMMAT achieves virtually the same results as pooled analysis under a privacy-preserving framework with practical resource requirements.

Created:
2024-07-24