
CO1-haplotype frequencies of long-spined sea urchin (Diadema setosum) from the Indo-Malay archipelago

Source: DataCite Commons. Updated: 2022-10-05. Indexed: 2025-04-16.
Download link:
https://dataverse.ird.fr/citation?persistentId=doi:10.23708/ZWQEFN
Resource description:
The dataset contains three files, in text (.txt) or tab-separated values (.tsv) format, that together characterize haplotype composition at the COI locus in the long-spined sea urchin Diadema setosum from the Indo-Malay archipelago. The dataset was produced in the course of a phylogeographic study of this tropical Indo-Pacific species, itself part of IBV’s PhD project on the comparative phylogeography of Indo-Pacific sea urchins of the genera Diadema and Echinometra.

METHODS

Briefly, long-spined sea urchins were sampled throughout the Indo-Malay archipelago between July 2019 and November 2021. Gonad tissue was dissected and preserved in ethanol. Genomic DNA was extracted, and a 1157-nucleotide-long segment beginning at the 5’ end of the mitochondrial cytochrome oxidase subunit I (COI) gene was amplified by polymerase chain reaction according to Ivanova & Grainger’s (2007) protocols. Amplicons were sequenced according to the Sanger protocol. Sequence chromatograms were checked in Chromas v. 2.6.5 (Technelysium, Brisbane, Australia). All nucleotide sequences were deposited in GenBank (Clark et al. 2016) and allocated accession numbers OP310072 to OP310789. Nucleotide sequences were aligned using the ClustalW algorithm implemented in MEGA X v. 10.0.4 (Kumar et al. 2018).

DESCRIPTION OF DATASET

FILE1 is a table in .tsv format containing the sampling details of Diadema setosum in the Indo-Malay archipelago. The contents of the columns are the following:
“Sequence_ID”: a unique identifier for each nucleotide sequence (see legend to File 2);
“Organism”: here, Diadema setosum;
“Isolate”: unique identifier given to the DNA extract, identical to the sequence identifier;
“Location”: information is enclosed in inverted commas and includes country, oceanic region, and precise name of sampling site;
“lat_lon”: latitude, followed by longitude, both in decimal degrees;
“Sampling_date”: three-letter or four-letter abbreviation for month, followed by year;
“GenBank_no”: GenBank accession number (two letters immediately followed by six digits).

FILE2 is in FASTA (.txt) format. It contains the alignment of nucleotide sequences of 718 Diadema setosum individuals from the Indo-Malay archipelago, over a 1157-nucleotide-long portion of the COI gene. Sequence labels, each preceded by a greater-than sign (“>”) and ended by a line break, were constructed as follows: identifier of individual (e.g., “SBD-1”), followed by abbreviation of sampling site (“Sa1”), followed by haplotype identifier (“H1”). The line that follows each sequence label is the nucleotide sequence under that label.

FILE3, in .tsv format, is a table containing haplotype frequencies by sample. The first line contains the column headings; abbreviations for samples are as in File 1. The following 259 lines contain the haplotype counts by sample. Haplotypes are identified by the same identifiers (“Haplotype_ID”) as those used for individual labels in File 1. The last line of the table contains the sums of each column, equal to sample sizes. The last column contains the sums of each line, equal to haplotype frequencies in the total sample.

REFERENCES

Clark, K., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., Sayers, E. W. (2016). GenBank. Nucleic Acids Research 44, D67–D72. doi:10.1093/nar/gkv1276.
Ivanova, N., Grainger, C. (2007). COI amplification: Taq polymerase choice. CCDB Protocols (Can Ctr DNA Barcoding, Guelph). www.dnabarcoding.ca.
Kumar, S., Stecher, G., Li, M., Knyaz, C., Tamura, K. (2018). MEGA X: Molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution 35, 1547–1549. doi:10.1093/molbev/msy096.
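
The File 3 layout described above (a heading line, one line per haplotype, a final line of column sums, and a final column of row sums) lends itself to a simple consistency check. The following is a minimal Python sketch under stated assumptions, not part of the dataset: it assumes the first column holds the haplotype identifier, and the function name `check_haplotype_table`, the labels "Total" and "Sum", and the small demo table are all hypothetical and made up for illustration.

```python
import csv
import io


def check_haplotype_table(tsv_text):
    """Check the row and column sums of a File 3-style haplotype table.

    Assumed layout (the exact column labels are not given in the description):
      - line 1: column headings; first column is the haplotype identifier,
        last column is the per-haplotype total
      - middle lines: one haplotype per line, with its count in each sample
      - last line: per-sample sums (sample sizes)
    Returns (n_haplotypes, grand_total, problems).
    """
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    header, body, totals = rows[0], rows[1:-1], rows[-1]
    n_samples = len(header) - 2  # minus identifier column and total column
    problems = []

    col_sums = [0] * n_samples
    for row in body:
        counts = [int(x) for x in row[1:1 + n_samples]]
        if sum(counts) != int(row[-1]):
            problems.append(f"row {row[0]}: counts do not match row total")
        col_sums = [a + b for a, b in zip(col_sums, counts)]

    stated = [int(x) for x in totals[1:1 + n_samples]]
    for name, got, want in zip(header[1:1 + n_samples], col_sums, stated):
        if got != want:
            problems.append(f"sample {name}: column sum {got} != stated {want}")

    grand_total = int(totals[-1])
    if sum(col_sums) != grand_total:
        problems.append("grand total does not match the sum of all counts")
    return len(body), grand_total, problems


# Tiny made-up table for illustration only; not real data from the dataset.
demo = (
    "Haplotype_ID\tSa1\tSa2\tTotal\n"
    "H1\t3\t1\t4\n"
    "H2\t0\t2\t2\n"
    "Sum\t3\t3\t6\n"
)
n_haplotypes, grand_total, problems = check_haplotype_table(demo)
print(n_haplotypes, grand_total, problems)
```

Applied to the text of the real File 3, and if the description holds, the check would be expected to report 259 haplotype lines, a grand total of 718 individuals, and an empty list of problems.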
Provider: DataSuds
Created: 2022-09-29