Mash Sketch of RefSeq Bacterial Representative Genomes v217
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7887020
Description:
This sketch was created to provide an up-to-date Mash reference. The script that builds it uses NCBI datasets and Mash (https://github.com/UPHL-BioNGS/Grandeur/blob/main/bin/new_mash_ref.sh).
It was created on April 27, 2023, from RefSeq v217.
```bash
#!/bin/bash
# Build a Mash sketch of the RefSeq bacterial representative genomes.
# Requires the NCBI datasets and dataformat CLIs, unzip, and mash.
out=mash_db
mkdir -p "$out"
cd "$out" || exit 1

# List bacterial reference assemblies, keep the representative ones,
# save the full table, and write the accessions to genome_ids.txt.
datasets summary genome taxon bacteria --reference --as-json-lines | \
dataformat tsv genome --fields accession,assminfo-refseq-category,organism-name --elide-header | \
grep representative | \
tee representative_genomes.txt | \
cut -f 1 > genome_ids.txt

echo "$(date): Downloading genomes for ids"
datasets download genome accession --inputfile genome_ids.txt --filename rep-genomes.zip

echo "$(date): Decompressing zip file"
unzip rep-genomes.zip

echo "$(date): Creating file for mash"
# Replace spaces with underscores and drop commas in the FASTA headers.
cat ncbi_dataset/data/*/*.fna | sed 's/ /_/g' | sed 's/,//g' > rep-genomes.fasta

echo "$(date): Sketching rep-genomes.fasta"
# -i sketches each sequence individually; -p 20 uses 20 threads.
mash sketch -i -p 20 rep-genomes.fasta -o rep-genomes

############################################################
echo "$(date): File preparation is complete"
ls -alh rep-genomes.fasta
ls -alh rep-genomes.msh
```
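As a rough sketch of how the resulting `rep-genomes.msh` might be used, the snippet below compares a query assembly against the sketch and lists the closest references. It is not part of the original script: the `top_hits` helper and the `sample.fasta` file name are hypothetical, and the `mash` calls run only if `mash` and both files are present.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): read `mash dist` output on
# stdin and print the N rows with the smallest distance (default 5).
# mash dist columns: reference ID, query ID, distance, p-value, shared hashes.
top_hits() {
  local n="${1:-5}"
  sort -t $'\t' -g -k3,3 | head -n "$n"
}

# Assumed file names: rep-genomes.msh from the script above, and sample.fasta
# as a placeholder for your own assembly.
if command -v mash >/dev/null 2>&1 && [ -f rep-genomes.msh ] && [ -f sample.fasta ]; then
  # Show only the sketch header information rather than every entry.
  mash info -H rep-genomes.msh
  # Estimate distances from every reference sequence to the query.
  mash dist -p 4 rep-genomes.msh sample.fasta | top_hits 5
fi
```

Because the sketch was built with `-i`, each reference ID in the first column corresponds to one sequence from `rep-genomes.fasta`, with the sanitized header (underscores instead of spaces, commas removed).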
Created:
2023-05-03



