Global Bemisia dataset release version 31 Dec 2012
Source: DataCite Commons. Updated: 2023-10-17. Indexed: 2025-04-09.
Download link:
https://data.csiro.au/collections/#collection/CIcsiro:6002v2
Description:
ERRATA: PLEASE USE THE FILES IN THE MORE RECENT VERSION BY CLICKING THE LINK ABOVE. THERE ARE ERRORS IN THE FILES FOR THIS VERSION. THE SUPPORTING FILES BELOW CONTAIN INFORMATION ABOUT THE CHANGES; SEE Errata_Readme.txt.
The original dataset housed in the CSIRO data portal was accumulated over a seven-year period and uploaded on 31 Dec 2012. During this time, multiple researchers and students contributed sequences and worked on the dataset. Many sequences were entered into the dataset before going into GenBank, and the accession details were added later, once the contributor had uploaded the sequences into GenBank. As a result, some of the sequences no longer reflect what is in GenBank, which is a source of confusion. We have also become aware that various errors crept into the dataset over this period. We have therefore decided to rebuild the dataset using the same accessions, but using the data as it now appears in GenBank.
The original dataset had 558 sequences. This included one sequence that had not been uploaded into GenBank and 2 duplicates. These have been removed, leaving 555 sequences. Of these, 11 contained either stop codons or indels that caused frameshifts; these have also been removed, leaving 544 sequences. You can read the file note to see which sequences were removed. There are now two files:
1) FINAL Unaligned 544 sequences dataset 14 May 2017 unedited.fas. This is the original source data as it appears in GenBank. The only change is that the 11 error sequences among the original 555 sequences downloaded from GenBank have been removed.
2) FINAL Aligned 544 sequences 14 May 2017.fas. The downloaded sequences have been trimmed to a length of 657 bases. There are no other edits.
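The counts and the trimmed length stated above can be checked after download. The following is a minimal sketch, not part of the dataset release: the helper names `read_fasta` and `length_problems` are hypothetical, and the parser assumes plain FASTA text with one `>` header line per record.

```python
from typing import Dict, Iterable, List

EXPECTED_LENGTH = 657  # trimmed length stated in the description
EXPECTED_COUNT = 544   # sequence count stated for the rebuilt files


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence} (hypothetical helper)."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            # Sequence data may be wrapped over several lines.
            records[header] += line.upper()
    return records


def length_problems(records: Dict[str, str],
                    expected_length: int = EXPECTED_LENGTH) -> List[str]:
    """Return the headers of records whose length differs from expected_length."""
    return [h for h, seq in records.items() if len(seq) != expected_length]


# Tiny in-memory example with made-up data (not taken from the dataset).
demo = read_fasta([">a", "ACGT", "ACGT", ">b", "ACG"])
print(length_problems(demo, expected_length=8))  # ['b']
```

To apply this to the aligned file, pass an open file object to `read_fasta`, for example `open("FINAL Aligned 544 sequences 14 May 2017.fas")`, then compare `len(records)` with `EXPECTED_COUNT` and confirm that `length_problems(records)` is empty.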
Mitochondrial cytochrome oxidase subunit one gene, partial cds, sequence data. The dataset was compiled to help researchers wanting to undertake a phylogenetic analysis of the Bemisia tabaci species complex, to assign unknowns to already identified members of the complex, or to identify putative new species belonging to the complex. All sequences are publicly available through GenBank and all accession numbers are included. The dataset contains 558 unique Bemisia tabaci plus outgroup sequences. The sequences are 657 bases in length, starting with GAAAATTAGAGGT (using Middle East-Asia Minor 1 DQ174535 as the alignment template), and were aligned using ClustalX (ver. 1.81) according to Thompson et al. (1997). None of the sequences had gaps or pseudogenes, and ambiguous bases were <0.8%.
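The ambiguity threshold and the start motif quoted above can be expressed as simple checks. This is a sketch under stated assumptions: the description does not say whether the 0.8% limit applies per sequence or to the dataset as a whole, so a per-sequence reading is assumed here, and the function names are hypothetical. For a 657-base sequence, 0.8% is 5.256 bases, so at most 5 ambiguous bases would pass.

```python
UNAMBIGUOUS = set("ACGT")
START_MOTIF = "GAAAATTAGAGGT"  # start of the alignment as quoted in the description
MAX_AMBIGUOUS = 0.008          # 0.8 %, assumed here to apply per sequence


def ambiguous_fraction(seq: str) -> float:
    """Fraction of characters that are not A, C, G or T.

    Gap characters such as '-' are counted as ambiguous by this sketch.
    """
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(1 for base in seq if base not in UNAMBIGUOUS) / len(seq)


def starts_with_motif(seq: str, motif: str = START_MOTIF) -> bool:
    """True if seq begins with the quoted start motif (case-insensitive)."""
    return seq.upper().startswith(motif)


# Made-up 657-base example: the motif (13 bases) plus 161 repeats of "ACGT".
clean = START_MOTIF + "ACGT" * 161
five_n = "N" * 5 + clean[5:]  # 5 ambiguous bases
six_n = "N" * 6 + clean[6:]   # 6 ambiguous bases

print(len(clean))                                  # 657
print(ambiguous_fraction(five_n) < MAX_AMBIGUOUS)  # True
print(ambiguous_fraction(six_n) < MAX_AMBIGUOUS)   # False
```

The motif is quoted for the alignment template (DQ174535); other sequences in the alignment may differ at those positions, so `starts_with_motif` is best read as a check on the template rather than on every record.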
Provider:
CSIRO
Created:
2013-01-07



