Data from: Plastome sequencing of ten nonmodel crop species uncovers a large insertion of mitochondrial DNA in cashew

DataONE | Updated 2017-08-21 | Indexed 2024-06-26

Download link:

https://search.dataone.org/view/null


Resource description:

In plant evolution, intracellular gene transfer (IGT) is a prevalent, ongoing process. While nuclear and mitochondrial genomes are known to integrate foreign DNA via IGT and horizontal gene transfer (HGT), plastid genomes (plastomes) have resisted foreign DNA incorporation and only recently has IGT been uncovered in the plastomes of a few land plants. In this study, we completed plastome sequences for 10 crop species and describe a number of structural features including variation in gene and intron content, inversions, and expansion and contraction of the inverted repeat (IR). We identified a putative rpl22 in cinnamon (Cinnamomum verum J. Presl) and other sequenced Lauraceae and an apparent functional transfer of rpl23 to the nucleus of quinoa (Chenopodium quinoa Willd.). In the orchard tree cashew (Anacardium occidentale L.), we report the insertion of an ~6.7-kb fragment of mitochondrial DNA into the plastome IR. BLASTn analyses returned high identity hits to mitogenome sequences including an intact ccmB open reading frame. Using three plastome markers for five species of Anacardium, we generated a phylogeny to investigate the distribution and timing of the insertion. Four species share the insertion, suggesting that this event occurred <20 million yr ago in a single clade in the genus. Our study extends the observation of mitochondrial-to-plastome IGT to include long-lived tree species. While previous studies have suggested possible mechanisms facilitating IGT to the plastome, more examples of this phenomenon, along with more complete mitogenome sequences, will be required before a common, or variable, mechanism can be elucidated.

Created:

2017-08-21
