
7bgzf case study dataset for x64 machine

Source: DataCite Commons. Updated: 2020-08-27. Indexed: 2024-07-27.
Download link:
https://figshare.com/articles/7bgzf_case_study_dataset_for_x64_machine/8063117/1
Description:
7bgzf case study datasets for the x64 machine.

The included files are datasets from the UCSC Genome Browser, the 1000 Genomes Project, and Ensembl, prepared as test corpora for deflation. The .bcf files were converted from .vcf.gz files using `bcftools view -Ob` with bcftools 1.9. Direct links to the original datasets:

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/wgEncodeCaltechRnaSeqGm12878R1x75dAlignsRep1V2.bam
http://hgdownload.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeUwRnaSeq/wgEncodeUwRnaSeqThymusCellPolyaMAdult8wksC57bl6AlnRep1.bam
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz
http://ftp.ensembl.org/pub/release-93/variation/vcf/mus_musculus/mus_musculus.vcf.gz

References:
[1] Haeussler M, Zweig AS, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, et al. The UCSC Genome Browser database: 2019 update. Nucleic Acids Research. 2019.
[2] 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015.
[3] Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, et al. Ensembl 2018. Nucleic Acids Research. 2018.
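
The description states only that the .bcf files were produced with `bcftools view -Ob`. As an illustration, the conversion could be scripted roughly as below. This is a minimal sketch, not the dataset authors' actual procedure: the helper names (`bcf_path_for`, `build_command`, `convert`), the output naming scheme, and the use of the `-o` option to write the output file are assumptions made here.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def bcf_path_for(vcf_gz: str) -> Path:
    """Map 'x.vcf.gz' to 'x.bcf' in the same directory (naming is an assumption)."""
    p = Path(vcf_gz)
    if not p.name.endswith(".vcf.gz"):
        raise ValueError(f"expected a .vcf.gz file, got {p.name!r}")
    return p.with_name(p.name[: -len(".vcf.gz")] + ".bcf")


def build_command(vcf_gz: str) -> List[str]:
    """Build the bcftools call; -Ob selects compressed BCF output."""
    return ["bcftools", "view", "-Ob", "-o", str(bcf_path_for(vcf_gz)), vcf_gz]


def convert(vcf_gz: str) -> Path:
    """Run the conversion; requires bcftools on PATH (1.9 was used for this dataset)."""
    if shutil.which("bcftools") is None:
        raise RuntimeError("bcftools not found on PATH")
    subprocess.run(build_command(vcf_gz), check=True)
    return bcf_path_for(vcf_gz)
```

Calling `convert("mus_musculus.vcf.gz")` on a local copy of the Ensembl file would write `mus_musculus.bcf` next to it, provided bcftools is installed.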

Provider:
figshare
Created:
2019-05-05