
Additional file 2 of The genome of a sea spider corroborates a shared Hox cluster motif in arthropods with a reduced posterior tagma

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Additional_file_2_of_The_genome_of_a_sea_spider_corroborates_a_shared_Hox_cluster_motif_in_arthropods_with_a_reduced_posterior_tagma/29455928
Resource description:
Additional file 2.
Table 1. Overview of the sequencing data generated for the project. Technology: sequencing platform used. De novo: whether the data were used to assemble a de novo transcriptome. Annotation: whether the transcriptomic data were used to predict protein-coding genes. Accession: European Nucleotide Archive accession IDs. Developmental staging follows [37–39].
Table 2. QUAST and BUSCO Arthropoda scores that document the progress of the Flye-based assembly.
Table 3. Presence/absence overview of microRNA families predicted in the Pycnogonum litorale genome by MirMachine and supplemented with manual curation.
Table 4. Overview of gene models and the names assigned via phylogenetic analysis.
Table 5. Overview of the best hits for chelicerate abdA sequences (see also Additional File 1: Fig. S4). The underlying sequences can be found in Additional File 8 (https://explore.openaire.eu/search/dataset?pid=10.5281/zenodo.14362378).
Gene tree for the P. litorale Hox cluster. P. litorale gene models are highlighted with stars and transcripts with triangles. Colors follow the scheme from [49]. Bootstrap support values are noted on the branches.
Table 6. List of NCBI BLAST hits for gene model r2_g3735, predicted to reside in the Hox cluster of P. litorale.
Gene tree for the P. litorale HRO cluster. P. litorale gene models are highlighted with blue squares. The paraphyletic Hbn tree has not been colored. Bootstrap support values are noted on the branches.
Gene tree for the P. litorale IRX cluster. P. litorale gene models are highlighted with blue squares. Bootstrap support values are noted on the branches.
Gene tree for the P. litorale SINE cluster. P. litorale gene models are highlighted with dark blue squares. Bootstrap support values are noted on the branches.
Gene tree for the P. litorale NK/NK2 cluster. P. litorale gene models are highlighted with dark blue squares. Colors follow the scheme from [49]. Bootstrap support values are noted on the branches.
Reduced gene tree for the P. litorale NK/NK2 cluster including Dbx sequences. Numbers on the branches are distances as calculated in the neighbor-joining tree. Color is automatically applied by Jalview to visually separate clades at the chosen cut (red line). P. litorale gene models include g1744, g1756, g11364, and at_DN2391.
Comparison of COX1 (cytochrome oxidase I) sequences between the lab culture (labelled “Helgoland”), the wildtype animals used (labelled “Maine”), and the NCBI entries MG934985, MG935177, MG935394, and HM425354 (P. litorale COX1 partial CDS). Genome alignment produced by Geneious v10.2.6.
Comparison of COX1 (cytochrome oxidase I) sequence identity between the lab culture (labelled “Helgoland”), the wildtype animals used (labelled “Maine”), and the NCBI entries MG934985, MG935177, MG935394, and HM425354 (P. litorale COX1 partial CDS).
Table 7. Iso-seq mixing strategy for the various developmental stages. We aimed for an approximately equimolar mix while trying to reach the recommended concentrations for PacBio sequencing and considering the total amount of RNA extracted from each developmental stage. The RIN (RNA integrity number) is a score of RNA integrity [146] with values ranging from 10 (intact) to 1 (totally degraded); DV200 denotes the percentage of RNA fragments longer than 200 nucleotides.
Table 8. Chelicerate repeat content, broken down by common repeat families. Manually extracted from various publications.
Table 9. Genome assembly statistics for arthropod genomes. Obtained from NCBI.
Table 10. Arthropod repeat content. Chelicerate, myriapod, and hexapod data as reported by Sheffer et al. in Table 4 [47]. Crustacean data as reported by Cui et al. [147].
Table 11. List of chelicerate reference genome assemblies found on NCBI that were at least scaffold level.
Ticks and mites are overrepresented; when multiple species from the same genus were present, we chose the one with more genes, as most ticks and mites with sequenced genomes are parasitic and have reduced genomes. Among the remaining chelicerate taxa, spiders are overrepresented; here we chose the species with the fewest predicted gene models for each genus, in the hope of avoiding false positives. We excluded A. ventricosus as it contains an uncharacteristic number of predicted proteins. The number of genes corresponds to the total number of genes in the annotation, not only the protein-coding ones.
Gene-level analyses. Includes folders for the analysis of the HOX, NK/NK2, HRO, IRX, and SINE clusters, as well as the r2_g3735 gene. The Hox subfolder also contains the sequences and alignments used for the abdA transcriptome search. More details can be found in the corresponding notebooks [45] (under 07-analysis/).
Table 12. MMseqs2 alignment of the predicted P. litorale proteome against itself. The columns hold, in order: query sequence identifier, target sequence identifier, fraction of identical matches (fident), alignment length (number of aligned columns), number of mismatches, number of gap open events, alignment start position in the query, alignment end position in the query, alignment start position in the target, alignment end position in the target, e-value of the alignment, bit score of the alignment, query sequence length. Underlies the self-synteny analysis.
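The column order given for Table 12 can be read into typed records as in the minimal Python sketch below. It is not the authors' code. It assumes the table is tab-separated with no header row, which the description does not state. The short field names other than fident are borrowed from MMseqs2's output-field naming, and the helper names parse_hits and drop_self_hits as well as the two sample rows are hypothetical.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One alignment row, in the column order listed for Table 12."""
    query: str      # query sequence identifier
    target: str     # target sequence identifier
    fident: float   # fraction of identical matches
    alnlen: int     # alignment length (aligned columns)
    mismatch: int   # number of mismatches
    gapopen: int    # number of gap open events
    qstart: int     # alignment start in the query
    qend: int       # alignment end in the query
    tstart: int     # alignment start in the target
    tend: int       # alignment end in the target
    evalue: float   # e-value of the alignment
    bits: float     # bit score of the alignment
    qlen: int       # query sequence length


# One converter per column, same order as the fields above.
_CONVERTERS = (str, str, float, int, int, int, int, int, int, int, float, float, int)


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated rows (assumed: no header) into Hit records."""
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != len(_CONVERTERS):
            raise ValueError(f"expected {len(_CONVERTERS)} columns, got {len(row)}")
        yield Hit(*(conv(value) for conv, value in zip(_CONVERTERS, row)))


def drop_self_hits(hits: Iterable[Hit]) -> Iterator[Hit]:
    """Remove rows where a protein is aligned to itself."""
    return (h for h in hits if h.query != h.target)


if __name__ == "__main__":
    # Invented rows for illustration only; not taken from the dataset.
    sample = [
        "protA\tprotA\t1.000\t100\t0\t0\t1\t100\t1\t100\t1e-60\t200\t100",
        "protA\tprotB\t0.800\t90\t18\t0\t5\t94\t3\t92\t1e-30\t120\t100",
    ]
    for hit in drop_self_hits(parse_hits(sample)):
        print(hit.query, hit.target, hit.fident)
```

In an all-against-all search of a proteome against itself, every protein is expected to match itself, so filtering those rows first is a common step before looking for paralogous pairs. Whether the authors applied such a filter is not stated in the description.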
Created:
2025-07-02