
18S and ITS read separation file and database

Source: Figshare. Updated: 2024-12-12. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/18S_and_ITS_read_separation_file_and_database/28013723
Description:
PBlat Read Separation

Sequencing data were demultiplexed using MinKNOW, and the fastq files for each barcode were concatenated prior to downstream analysis. Because the 18S rDNA and ITS1-to-ITS2 sequences were pooled and given the same barcode for each sample analysed, they first had to be separated using pblat (M. Wang & Kong, 2019) and seqtk (https://github.com/lh3/seqtk). To separate and bin the 18S rDNA and ITS sequences into different files, a database of nematode 18S rDNA sequences was built using only the region of the 18S rDNA targeted by our primers. Using a pblat minimum score value of 50, the reads in each barcode sequencing file were compared to our pblat 18S rDNA database. Matching reads were extracted to form an 18S rDNA sequence file, whilst the remaining reads were used to form an ITS1-to-ITS2 sequence file.

Code is:

    conda activate pblat

    for i in {01..75}; do
        DIR=/home/Public/Ps1/Lucas_Workspace/MinION_Nemabiome_ECR_Project_SUPdata/MinION_Nemabiome_Trial-2_Trich-Capi-Strongy/pass/looping_test_on_nemabiome_trial-2
        REF=18S_ref_seqs_for_Pblat_v2.fasta

        # Convert this barcode's reads from FASTQ to FASTA for pblat
        conda run -n seqtk seqtk seq -A "${DIR}/barcode${i}-allfiles.fastq.gz" > tmp.fa

        # Align the reads to the 18S reference database (headerless PSL output)
        conda run -n pblat pblat -noHead -minScore=50 -threads=48 "${REF}" tmp.fa barcode${i}.tmp.pblat.psl

        # Count the unique reads with at least one hit (PSL column 10 = query name)
        awk -F"\t" '{print $10}' barcode${i}.tmp.pblat.psl | sort | uniq | wc -l > barcode${i}.count.txt

        # List the unique reads with at least 50 matching bases (PSL column 1 = matches)
        awk '{if ($1>=50) print $10}' barcode${i}.tmp.pblat.psl | sort | uniq > barcode${i}.tmp.pblat.header.txt

        # Extract the listed reads as the 18S file
        conda run -n seqtk seqtk subseq "${DIR}/barcode${i}-allfiles.fastq.gz" barcode${i}.tmp.pblat.header.txt > barcode${i}.tmp.pblat.fastq

        # Write all remaining reads to the inverse file.
        # This command is incomplete in the source ("seqkit grep -v -f barcode${i}.tmp.inverse.pblat.fastq");
        # the pattern file and input file used here are assumptions inferred from the preceding steps.
        seqkit grep -v -f barcode${i}.tmp.pblat.header.txt "${DIR}/barcode${i}-allfiles.fastq.gz" > barcode${i}.tmp.inverse.pblat.fastq
    done

NOTE:
barcode[#].tmp.pblat.fastq = fastq file of 18S reads
barcode[#].tmp.inverse.pblat.fastq = fastq file of ITS reads and other non-18S reads
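The read-ID partitioning at the heart of the separation can be tried without pblat, seqtk, or seqkit installed. The following is a minimal sketch, not the authors' script: the helper names hits_from_psl and non_hits, the toy PSL rows, and the read names are all invented for illustration. It assumes headerless, tab-separated PSL in which column 1 is the match count and column 10 is the query name, which is what the awk filter in the description relies on.

```shell
#!/bin/sh
set -eu
LC_ALL=C; export LC_ALL   # same collation for sort and comm

# Hypothetical helper: unique query names (PSL column 10) of alignments
# whose match count (PSL column 1) is at least $2 (default 50).
hits_from_psl() {
    awk -F'\t' -v min="${2:-50}" '$1 >= min { print $10 }' "$1" | sort -u
}

# Hypothetical helper: IDs listed in file $1 that are absent from file $2.
# comm needs sorted input, so both lists are sorted into temp files first.
non_hits() {
    tmp_all="$(mktemp)"
    tmp_hit="$(mktemp)"
    sort -u "$1" > "$tmp_all"
    sort -u "$2" > "$tmp_hit"
    comm -23 "$tmp_all" "$tmp_hit"
    rm -f "$tmp_all" "$tmp_hit"
}

# Toy data (invented): PSL rows truncated to their first ten columns,
# and a list of every read ID in the barcode file.
workdir="$(mktemp -d)"
{
    printf '120\t0\t0\t0\t0\t0\t0\t0\t+\tread1\n'
    printf '80\t0\t0\t0\t0\t0\t0\t0\t+\tread1\n'
    printf '30\t0\t0\t0\t0\t0\t0\t0\t+\tread3\n'
    printf '55\t0\t0\t0\t0\t0\t0\t0\t-\tread4\n'
} > "$workdir/toy.psl"
printf 'read%s\n' 1 2 3 4 5 > "$workdir/all_ids.txt"

# Partition the read IDs into 18S-like hits and everything else.
hits_from_psl "$workdir/toy.psl" 50 > "$workdir/18S_ids.txt"
non_hits "$workdir/all_ids.txt" "$workdir/18S_ids.txt" > "$workdir/other_ids.txt"

echo "18S-like reads:  $(tr '\n' ' ' < "$workdir/18S_ids.txt")"
echo "ITS/other reads: $(tr '\n' ' ' < "$workdir/other_ids.txt")"
```

In this toy input, read1 and read4 end up in the 18S list, while read3 (only 30 matching bases) joins read2 and read5 in the remainder. A useful sanity check on real data is that the two ID lists are disjoint and together account for every read in the barcode file.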
Created:
2024-12-12