Data from: Unlocking the vault: next generation museum population genomics|基因组学数据集|博物馆收藏数据集

DataONE2013-09-10 更新2024-06-27 收录

基因组学

博物馆收藏

下载链接：

https://search.dataone.org/view/null

下载链接

链接失效反馈

资源简介：

Natural history museum collections provide unique resources for understanding how species respond to environmental change, including the abrupt, anthropogenic climate change of the past century. Ideally, researchers would conduct genome-scale screening of museum specimens to explore the evolutionary consequences of environmental changes, but to date such analyses have been severely limited by the numerous challenges of working with the highly degraded DNA typical of historic samples. Here we circumvent these challenges by using custom, multiplexed, exon-capture to enrich and sequence ~11,000 exons (~4Mb) from early 20th century museum skins. We used this approach to test for changes in genomic diversity accompanying a climate-related range retraction in the alpine chipmunks (Tamias alpinus) in the high Sierra Nevada area of California, USA. We developed robust bioinformatic pipelines that rigorously detect and filter-out base misincorporations in DNA derived from skins, most of which likely resulted from post-mortem damage. Furthermore, to accommodate genotyping uncertainties associated with low-medium coverage data, we applied a recently developed probabilistic method to call SNPs and estimate allele frequencies and the joint site frequency spectrum. Our results show increased genetic subdivision following range retraction, but no change in overall genetic diversity at either non-synonymous or synonymous sites. This case study showcases the advantages of integrating emerging genomic and statistical tools in museum collection-based population genomic applications. Such technical advances greatly enhance the value of museum collections, even where a pre-existing reference is lacking, and points to a broad range of potential applications in evolutionary and conservation biology.

创建时间：

2013-09-10

用户留言

有没有相关的论文或文献参考？

这个数据集是基于什么背景创建的？

数据集的作者是谁？

能帮我联系到这个数据集的作者吗？

这个数据集如何下载？

点击留言

数据主题

具身智能

数据集 4098个

机构 8个

大模型

数据集 439个

机构 10个

无人机

数据集 37个

机构 6个

指令微调

数据集 36个

机构 6个

蛋白质结构

数据集 50个

机构 8个

空间智能

数据集 21个

机构 5个

5,000+

优质数据集

54 个

任务类型

进入经典数据集

热门数据集

中国食物成分数据库

食物成分数据比较准确而详细地描述农作物、水产类、畜禽肉类等人类赖以生存的基本食物的品质和营养成分含量。它是一个重要的我国公共卫生数据和营养信息资源，是提供人类基本需求和基本社会保障的先决条件；也是一个国家制定相关法规标准、实施有关营养政策、开展食品贸易和进行营养健康教育的基础，兼具学术、经济、社会等多种价值。本数据集收录了基于2002年食物成分表的1506条食物的31项营养成分（含胆固醇）数据，657条食物的18种氨基酸数据、441条食物的32种脂肪酸数据、130条食物的碘数据、114条食物的大豆异黄酮数据。

国家人口健康科学数据中心收录

CE-CSL

CE-CSL数据集是由哈尔滨工程大学智能科学与工程学院创建的中文连续手语数据集，旨在解决现有数据集在复杂环境下的局限性。该数据集包含5,988个从日常生活场景中收集的连续手语视频片段，涵盖超过70种不同的复杂背景，确保了数据集的代表性和泛化能力。数据集的创建过程严格遵循实际应用导向，通过收集大量真实场景下的手语视频材料，覆盖了广泛的情境变化和环境复杂性。CE-CSL数据集主要应用于连续手语识别领域，旨在提高手语识别技术在复杂环境中的准确性和效率，促进聋人与听人社区之间的无障碍沟通。

arXiv 收录

MOOCs Dataset

该数据集包含了大规模开放在线课程（MOOCs）的相关数据，包括课程信息、用户行为、学习进度等。数据主要用于研究在线教育的行为模式和学习效果。

www.kaggle.com 收录

Air Quality Data from U.S. Embassies

该存储库包含从美国驻外使领馆收集的历史空气质量数据。

github 收录

rule34lol-images-part1

该数据集包含来自rule34.lol图像板的196,000个图像文件的元数据。元数据包括URL、标签、文件信息和点赞数。实际图像文件存储在zip存档中，每个存档包含1000个图像。该数据集是更大集合的一部分，分为Part 1和Part 2。数据集采用CC0许可，允许免费使用、修改和分发，无需署名。

huggingface 收录