
Data from: Two influential primate classifications logically aligned

DataONE · Updated: 2016-03-22 · Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Classifications and phylogenies of perceived natural entities change in light of new evidence. Taxonomic changes, translated into Code-compliant names, frequently lead to name:meaning dissociations across succeeding treatments. Classification standards such as the Mammal Species of the World (MSW) may experience significant levels of taxonomic change from one edition to the next, with potential costs to long-term, large-scale information integration. This circumstance challenges the biodiversity and phylogenetic data communities to express taxonomic congruence and incongruence in ways that both humans and machines can process, i.e., to logically represent taxonomic alignments across multiple classifications. We demonstrate that such alignments are feasible for two classifications of primates corresponding to the second and third MSW editions. Our approach has three main components: (1) use of taxonomic concept labels, i.e. name sec. author (where sec. means according to), to assemble each concept hierarchy separately via parent/child relationships; (2) articulation of select concepts across the two hierarchies with user-provided Region Connection Calculus (RCC-5) relationships; and (3) the use of an Answer Set Programming toolkit to infer and visualize logically consistent alignments of these input constraints. Our use case entails the Primates sec. Groves (1993; MSW2 - 317 taxonomic concepts; 233 at the species level) and Primates sec. Groves (2005; MSW3 - 483 taxonomic concepts; 376 at the species level). Using 402 RCC-5 input articulations, the reasoning process yields a single, consistent alignment and 153,111 Maximally Informative Relations that constitute a comprehensive meaning resolution map for every concept pair in the Primates sec. MSW2/MSW3. 
The complete alignment, and various partitions thereof, facilitate quantitative analyses of name:meaning dissociation, revealing that nearly one in three taxonomic names are not reliable across treatments - in the sense of the same name identifying congruent taxonomic meanings. The RCC-5 alignment approach is potentially widely applicable in systematics and can achieve scalable, precise resolution of semantically evolving name usages in synthetic, next-generation biodiversity and phylogeny data platforms.
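
The five RCC-5 relations used for the articulations in component (2) can be illustrated with a minimal sketch. It assumes each taxonomic concept can be modeled as a non-empty set of member identifiers, which is a simplification: the work described above infers alignments with an Answer Set Programming toolkit, not by direct set comparison. The function name `rcc5`, the concept labels, and the member identifiers are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class RCC5(Enum):
    """The five base relations of Region Connection Calculus (RCC-5)."""
    CONGRUENT = "=="      # both concepts have exactly the same extent
    INCLUDES = ">"        # first concept properly includes the second
    INCLUDED_IN = "<"     # first concept is properly included in the second
    OVERLAPS = "><"       # extents share some, but not all, members
    DISJOINT = "!"        # extents share no members


def rcc5(a: frozenset, b: frozenset) -> RCC5:
    """Return the RCC-5 relation between two non-empty member sets."""
    if not a or not b:
        raise ValueError("RCC-5 is defined here only for non-empty concepts")
    if a == b:
        return RCC5.CONGRUENT
    if a > b:             # proper superset
        return RCC5.INCLUDES
    if a < b:             # proper subset
        return RCC5.INCLUDED_IN
    if a & b:             # partial intersection
        return RCC5.OVERLAPS
    return RCC5.DISJOINT


# Hypothetical concepts labeled "name sec. author"; members are placeholders.
older = {
    "Genus_A sec. 1993": frozenset({"s1", "s2", "s3"}),
    "Genus_B sec. 1993": frozenset({"s4"}),
}
newer = {
    "Genus_A sec. 2005": frozenset({"s1", "s2"}),
    "Genus_C sec. 2005": frozenset({"s3", "s4"}),
}

# One relation per concept pair across the two hierarchies.
for name_old, members_old in older.items():
    for name_new, members_new in newer.items():
        relation = rcc5(members_old, members_new)
        print(f"{name_old} {relation.value} {name_new}")
```

In this toy case the name Genus_A is shared by both treatments while the extents differ, which is the kind of name:meaning dissociation the abstract describes. The reported 153,111 Maximally Informative Relations are consistent with one relation per cross-hierarchy concept pair, since 317 × 483 = 153,111.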

Created:
2016-03-22