Data from: Attacks on genetic privacy via uploads to genealogical databases
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.25338/B8X619
Description:
Direct-to-consumer (DTC) genetics services are increasingly popular for
genetic genealogy, with tens of millions of customers as of 2019. Several
DTC genealogy services allow users to upload their own genetic datasets in
order to search for genetic relatives. The statement that a user's
uploaded genome shares one or more segments in common with that of a
target person in the database (that is, that the two genomes share one or
more regions identical by state, or IBS) reveals some information about the
genotypes of the target person, particularly if the chromosomal locations
of IBS matches are shared with the uploader. Here, we describe several
methods by which an adversary who wants to learn the genotypes of people
in the database can do so by uploading multiple datasets. Depending on the
methods used for IBS matching and the information about IBS segments
returned to the user, substantial information about users' genotypes
can be revealed with a few hundred uploaded datasets. For example, using a
method we call IBS tiling, we estimate that an adversary who uploads
approximately 900 publicly available genomes could recover at least one
allele at SNP sites across up to 82% of the genome of a median person of
European ancestries. In databases that detect IBS segments using unphased
genotypes, approximately 100 uploads of falsified datasets can reveal
enough genetic information to allow accurate genome-wide imputation of
every person in the database. We provide simple-to-implement suggestions
that will prevent the exploits we describe and discuss our results in
light of recent trends in genetic privacy, including the recent use of
uploads to DTC genetic genealogy services by law enforcement.
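
The coverage figure quoted for IBS tiling can be read as the fraction of a target's genome that falls inside at least one IBS segment reported across all uploads. The sketch below is a toy illustration of that bookkeeping only, not the authors' implementation: the function name `tiled_fraction`, the half-open `(start, end)` segment format, and the example coordinates are all assumptions made for illustration.

```python
def tiled_fraction(segments, genome_length):
    """Return the fraction of [0, genome_length) covered by the union
    of half-open (start, end) segments.

    Hypothetical helper: the segments stand in for the IBS match
    locations a service might report for a series of uploads.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(segments):
        if current_end is None or start > current_end:
            # Gap before this segment: close out the previous merged run.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent segment: extend the current run.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Made-up segments reported for three uploads against one target,
# on a toy "genome" of length 100.
reported = [(0, 30), (20, 50), (70, 90)]
fraction = tiled_fraction(reported, 100)
```

Overlapping segments are merged before summing so that a region matched by several uploads is counted once.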
Provider:
Dryad
Created:
2020-07-23



