
Nemabiome ITS Database

Source: Figshare. Updated: 2024-12-12. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Nemabiome_ITS_Database/28013753
Description:
18S, ITS1 and ITS2, 28S Full Nematode Database: building a NanoCLUST database for parasitic nematodes using 18S rRNA, 28S rRNA, ITS1, 5.8S and ITS2.

Nematoda Taxonomy ID: 6231 (hence searches must use txid6231).

Key words: 18S ribosomal RNA; 18S rRNA; 18S; 28S ribosomal RNA; 28S rRNA; 28S; 5.8S ribosomal RNA; 5.8S rRNA; 5.8S; Ribosomal RNA; SSU rRNA; LSU rRNA; SSU ribosomal RNA; LSU ribosomal RNA; Internal transcribed spacer; Internal transcribed spacer 1; Internal transcribed spacer 2; ITS; ITS1; ITS2

Final NCBI GenBank search term:

    (((((((((((((((((((((((((18S ribosomal RNA[Title]) OR 18S rRNA[Title]) OR 18S[Title]) OR 28S ribosomal RNA[Title]) OR 28S rRNA[Title]) OR 28S[Title]) OR 5.8S ribosomal RNA[Title]) OR 5.8S rRNA[Title]) OR 5.8S[Title]) OR ribosomal RNA[Title]) OR SSU rRNA[Title]) OR LSU rRNA[Title]) OR SSU ribosomal RNA[Title]) OR LSU ribosomal RNA[Title]) OR Internal transcribed spacer[Title]) OR Internal transcribed spacer 1[Title]) OR Internal transcribed spacer 2[Title]) OR ITS[Title]) OR ITS1[Title]) OR ITS2[Title]) AND txid6231[Organism])) AND 200:10000[Sequence Length])) AND nuccore pubmed[Filter]) NOT unverified[Keyword]

The search results were downloaded as a FASTA file. Next, sequences for a list of clade III and V parasitic nematodes (i.e. the ascarids, ancylostomatids, etc.) were obtained and downloaded as a FASTA file. The sequence titles in this FASTA file were then changed to 'sham' titles with non-descript accession numbers, e.g. "Unidentified nematode 18S ribosomal RNA, partial sequence".

    # Simplify the headers of your database fasta file
    $ awk '{if($0~/^>/){print $1} else {print $0}}' Nemabiome_rRNA_fasta_v5_sequences.fasta > Nematoda_rRNA-ITS-5.8S_v5_30.04.24.fasta

    # Make a text file of all the accession numbers in the database fasta file
    $ awk '{if ($1~/^>/) print substr($1,2)}' Nematoda_rRNA-ITS-5.8S_v5_30.04.24.fasta > Nematoda_rRNA_v5_accession_ids.txt

    # Create a mapping table of each accession to its taxa id.
    # Takes about 10 minutes, as it has to read each of the 300 million lines of nucl_gb.accession2taxid
    $ awk -F"\t" 'BEGIN{while(getline Nematode_rRNA_v5_tax_map.txt

    # Make the blast database using the database fasta file, for example:
    $ makeblastdb -in Nematoda_rRNA-ITS-5.8S_v5_30.04.24.fasta -parse_seqids -blastdb_version 5 -taxid_map Nematode_rRNA_v5_tax_map.txt -title "Nemabiome_rRNA database_v5" -out Nemabiome_rRNA_v5_db -dbtype nucl

Final database files produced = 10, for example Nemabiome_rRNA_v5_db.ndb, Nemabiome_rRNA_v5_db.nhr and Nemabiome_rRNA_v5_db.nin.

These can be used by NanoCLUST, e.g. in the command:

    $ nextflow run main.nf -profile docker \
        --reads '/home/Public/Ps1/Lucas_Workspace/MinION_Nemabiome_ECR_Project_SUPdata/100_Sample_Comparison_96-well_Trial-4/pass/barcodes01-96/ITS-reads-amended-filtered/barcode30.tmp.inverse.pblat.fix.fastq-filt.fastq.gz' \
        --db "db/Nemabiome_rRNA_v5_db" --tax "db" \
        --min_read_length 700 --max_read_length 1800 \
        --min_cluster_size 100 --polishing_reads 100 --cluster_sel_epsilon 1 \
        --max_memory '84.GB' --max_cpus 12 \
        --outdir ./Nemabiome_trial
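The mapping-table command above appears incomplete as shown, so here is a minimal sketch of the three awk steps (simplify headers, list accessions, map accessions to taxids) run end to end on toy data. It is not necessarily the author's exact command. The file names prefixed `toy`, the accessions, the sequences and the taxid values are all invented for illustration. The sketch assumes the usual layout of NCBI's nucl_gb.accession2taxid: tab-separated, with a header row and the columns accession, accession.version, taxid, gi.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"

# Toy FASTA standing in for the GenBank download (invented records).
cat > "$workdir/toy_raw.fasta" <<'EOF'
>AA000001.1 Hypothetical nematode 18S ribosomal RNA gene, partial sequence
ACGTACGTAC
GGTTAACC
>AA000002.2 Hypothetical nematode internal transcribed spacer 2, complete sequence
TTGGCCAATT
EOF

# Toy stand-in for nucl_gb.accession2taxid (assumed columns:
# accession, accession.version, taxid, gi; placeholder values).
printf 'accession\taccession.version\ttaxid\tgi\n'  > "$workdir/toy.accession2taxid"
printf 'AA000001\tAA000001.1\t6253\t111\n'         >> "$workdir/toy.accession2taxid"
printf 'ZZ999999\tZZ999999.1\t9606\t333\n'         >> "$workdir/toy.accession2taxid"
printf 'AA000002\tAA000002.2\t51031\t222\n'        >> "$workdir/toy.accession2taxid"

# Step 1: keep only the first whitespace-delimited field of each header line.
awk '{ if ($0 ~ /^>/) print $1; else print $0 }' \
    "$workdir/toy_raw.fasta" > "$workdir/toy_db.fasta"

# Step 2: list the accession.version ids (header minus the leading ">").
awk '$1 ~ /^>/ { print substr($1, 2) }' \
    "$workdir/toy_db.fasta" > "$workdir/toy_accession_ids.txt"

# Step 3: two-file awk idiom. The first file loads the wanted ids into an
# array; the second file is streamed once, its header row is skipped, and
# "accession.version<TAB>taxid" is printed for every wanted id.
awk -F'\t' 'NR == FNR { wanted[$1]; next }
            FNR > 1 && ($2 in wanted) { print $2 "\t" $3 }' \
    "$workdir/toy_accession_ids.txt" "$workdir/toy.accession2taxid" \
    > "$workdir/toy_tax_map.txt"

cat "$workdir/toy_tax_map.txt"
```

The resulting two-column file (sequence id, then taxid) is the shape that makeblastdb's -taxid_map option expects. Only the small id list is held in memory, which keeps the approach workable for the full accession2taxid file, and rows come out in the order of that file rather than the order of the FASTA.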
Created: 2024-12-12