Data from: The pipid root | Phylogenetics dataset | Molecular systematics dataset

Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.

Phylogenetics

Molecular systematics

Download link:

https://zenodo.org/records/4960892

Description:

The estimation of phylogenetic relationships is an essential component of understanding evolution. Accurate phylogenetic estimation is difficult, however, when internodes are short and old, when genealogical discordance is common due to large ancestral effective population sizes or ancestral population structure, and when homoplasy is prevalent. Inference of divergence times is also hampered by unknown and uneven rates of evolution, the incomplete fossil record, uncertainty in relationships between fossil and extant lineages, and uncertainty in the age of fossils. Ideally, these challenges can be overcome by developing large "phylogenomic" datasets, and by analyzing them with methods that accommodate features of the evolutionary process such as genealogical discordance, recurrent substitution, recombination, ancestral population structure, gene flow after speciation among sampled and unsampled taxa, and variation in evolutionary rates. In some phylogenetic problems, it is possible to use information that is independent of fossils, such as the geological record, to identify putative triggers for diversification whose associated estimated divergence times can then be compared a posteriori to estimated relationships and ages of fossils. The history of diversification of the pipid frog genera Pipa, Hymenochirus, Silurana, and Xenopus, for instance, is characterized by many of these evolutionary and analytical challenges. These frogs diversified tens of millions of years ago, they have a relatively rich fossil record, their distributions span continental plates with a well-characterized geological record of ancient connectivity, and there is considerable disagreement across studies in estimated evolutionary relationships. We used high-throughput sequencing and public databases to generate a large phylogenomic dataset with which we estimated evolutionary relationships using multilocus coalescence methods.
We collected sequence data from Pipa, Hymenochirus, Silurana, and Xenopus and the outgroup taxon Rhinophrynus dorsalis from coding sequence of 113 autosomal regions, averaging ~300 base pairs in length (range: 102 – 1695 base pairs), and also a portion of the mitochondrial genome. Analysis of these data using multiple approaches recovers strong support for the ((Xenopus, Silurana)(Pipa, Hymenochirus)) topology, and geologically calibrated divergence time estimates that are consistent with estimated ages and phylogenetic affinities of many fossils. These results provide new insights into the biogeography and chronology of pipid diversification during the breakup of Gondwanaland, and illustrate how phylogenomic data may be necessary to tackle tough problems in molecular systematics.

Created:

2023-06-28
