
Data from: Two influential primate classifications logically aligned

Source: DataONE · Updated: 2016-03-22 · Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Resource description:
Classifications and phylogenies of perceived natural entities change in the light of new evidence. Taxonomic changes, translated into Code-compliant names, frequently lead to name:meaning dissociations across succeeding treatments. Classification standards such as the Mammal Species of the World (MSW) may experience significant levels of taxonomic change from one edition to the next, with potential costs to long-term, large-scale information integration. This circumstance challenges the biodiversity and phylogenetic data communities to express taxonomic congruence and incongruence in ways that both humans and machines can process, that is, to logically represent taxonomic alignments across multiple classifications. We demonstrate that such alignments are feasible for two classifications of primates corresponding to the second and third MSW editions. Our approach has three main components: (i) use of taxonomic concept labels of the form "name sec. author" (where "sec." means "according to") to assemble each concept hierarchy separately via parent/child relationships; (ii) articulation of select concepts across the two hierarchies with user-provided Region Connection Calculus (RCC-5) relationships; and (iii) use of an Answer Set Programming toolkit to infer and visualize logically consistent alignments of these input constraints. Our use case entails the Primates sec. Groves (1993; MSW2, with 317 taxonomic concepts, 233 of them at the species level) and the Primates sec. Groves (2005; MSW3, with 483 taxonomic concepts, 376 of them at the species level). Using 402 RCC-5 input articulations, the reasoning process yields a single, consistent alignment and 153,111 Maximally Informative Relations that constitute a comprehensive meaning resolution map for every concept pair in the Primates sec. MSW2/MSW3.
The complete alignment, and various partitions thereof, facilitate quantitative analyses of name:meaning dissociation, revealing that nearly one in three taxonomic names is not reliable across treatments, in the sense of the same name identifying congruent taxonomic meanings. The RCC-5 alignment approach is potentially widely applicable in systematics and can achieve scalable, precise resolution of semantically evolving name usages in synthetic, next-generation biodiversity and phylogeny data platforms.
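
To make the RCC-5 vocabulary concrete, the sketch below classifies pairs of taxonomic concepts into the five base relations. It is only an illustration and not the authors' Answer Set Programming toolkit: it assumes each concept's extent is known as an explicit set of members, whereas the approach described above infers the alignment from parent/child relationships and user-provided articulations. The concept labels (`Aus bus sec. Old` and so on), the member IDs, the relation symbols, and the helper names `rcc5` and `align` are all hypothetical.

```python
from enum import Enum


class RCC5(Enum):
    """The five base relations of RCC-5 (symbols are an assumed notation)."""
    CONGRUENT = "=="            # both concepts have the same extent
    PROPER_PART = "<"           # left concept lies strictly inside the right one
    INVERSE_PROPER_PART = ">"   # left concept strictly contains the right one
    OVERLAP = "><"              # shared members, but neither contains the other
    DISJOINT = "!"              # no shared members


def rcc5(left: frozenset, right: frozenset) -> RCC5:
    """Classify two non-empty member sets into exactly one RCC-5 relation."""
    if not left or not right:
        raise ValueError("RCC-5 is defined here for non-empty extents only")
    if left == right:
        return RCC5.CONGRUENT
    if left < right:            # proper subset
        return RCC5.PROPER_PART
    if left > right:            # proper superset
        return RCC5.INVERSE_PROPER_PART
    if left & right:            # non-empty intersection
        return RCC5.OVERLAP
    return RCC5.DISJOINT


def align(old: dict, new: dict) -> dict:
    """Return one relation for every (old concept, new concept) pair."""
    return {
        (a, b): rcc5(members_a, members_b)
        for a, members_a in old.items()
        for b, members_b in new.items()
    }


# Hypothetical toy classifications; labels follow the "name sec. author" form.
old = {
    "Aus bus sec. Old": frozenset({"s1", "s2", "s3"}),
    "Aus cus sec. Old": frozenset({"s4"}),
}
new = {
    "Aus bus sec. New": frozenset({"s1", "s2"}),
    "Aus dus sec. New": frozenset({"s3"}),
    "Aus cus sec. New": frozenset({"s4"}),
}

alignment = align(old, new)

if __name__ == "__main__":
    for (a, b), relation in alignment.items():
        print(f"{a} {relation.value} {b}")
```

In this toy case the name "Aus bus" persists across both treatments, yet the older concept strictly contains the newer one, which is the kind of name:meaning dissociation the abstract quantifies. "Aus cus" keeps both its name and its meaning. The pairwise table also suggests how to read the reported total: 317 × 483 = 153,111, which matches one Maximally Informative Relation per MSW2/MSW3 concept pair.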

Created:
2016-03-22