Languages of Caquetá-Putumayo database

Source: DataCite Commons | Updated: 2025-03-18 | Indexed: 2025-04-17
Download link:
https://danebadawcze.uw.edu.pl/citation?persistentId=doi:10.58132/Y2C5V1
Description:
This dataset is based on grammatical data from seven Indigenous languages spoken in the Caquetá-Putumayo River Basins of Colombia. The languages belong to three language families, Witotoan (Murui-Muina, Ocaina, Nonuya), Boran (Bora, Muinane), and Arawak (Resígaro), as well as one linguistic isolate, Andoke. Initially based on data from the Hunter-Gatherer Language Database (HGDL), the dataset has since been expanded and restructured using a set of grammatical categories specifically designed to capture key morphosyntactic features of languages from Northwest Amazonia.

The dataset was collected and verified by linguist Dr. Katarzyna I. Wojtylak through a rigorous process of cross-checking against existing linguistic literature, her own field expertise, and consultations with specialists in these languages. While the initial data drew from the HGDL framework, it became evident that additional grammatical categories were necessary to more accurately represent the structures of the region’s languages. As a result, a new typological framework was developed, incorporating features particularly relevant to the grammatical description of Northwest Amazonian languages, such as classifier systems, evidentiality, switch reference, and complex predicates.

The dataset is organized into a hierarchical structure reflecting these expanded grammatical categories. Each feature is systematically coded as present, absent, or unknown, allowing for comparative analysis both within and across language families. By distinguishing inherited traits from contact-induced features, this dataset provides insights into the historical and contemporary dynamics of grammatical structures in this multilingual region.

Through both quantitative and qualitative analyses, the dataset contributes to the study of areal patterns, grammatical diffusion, and the interplay between genealogical inheritance and language contact. It offers a valuable resource for typologists, historical linguists, and researchers focusing on Amazonian languages.

The dataset is structured into CSV files, categorizing each language’s grammatical features in a format suitable for comparative linguistic research. Available under a CC BY license, this dataset represents a significant contribution to the understanding of grammatical diversity in Northwest Amazonia.
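
As an illustration of the kind of comparison the present/absent/unknown coding allows, the following is a minimal Python sketch of pairwise feature agreement between languages. The CSV layout (one row per feature, one column per language), the column name `feature`, the labels `lang_a` to `lang_c`, and all values are placeholders invented for this example; they are not taken from the dataset, whose actual file structure may differ.

```python
import csv
import io
from itertools import combinations

# Placeholder data: layout, labels, and values are invented for illustration
# and do not reflect the actual contents or structure of the dataset files.
SAMPLE_CSV = """feature,lang_a,lang_b,lang_c
classifier_system,present,present,absent
evidentiality,present,unknown,present
switch_reference,absent,absent,present
complex_predicates,present,present,unknown
"""


def load_features(text):
    """Return {language: {feature: value}} from a feature-by-language CSV."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        feature = row.pop("feature")
        for lang, value in row.items():
            table.setdefault(lang, {})[feature] = value.strip().lower()
    return table


def agreement(a, b):
    """Share of features with the same known value in both languages.

    Features coded 'unknown' in either language are skipped.
    Returns None when no feature is known for both languages.
    """
    shared = [f for f in a if f in b and "unknown" not in (a[f], b[f])]
    if not shared:
        return None
    return sum(a[f] == b[f] for f in shared) / len(shared)


table = load_features(SAMPLE_CSV)
scores = {
    pair: agreement(table[pair[0]], table[pair[1]])
    for pair in combinations(sorted(table), 2)
}
for (x, y), score in scores.items():
    print(x, y, score)
```

Skipping "unknown" values rather than counting them as mismatches is one possible design choice; it keeps gaps in documentation from being read as grammatical differences, at the cost of comparing each pair over a different number of features.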
Provider:
Dane Badawcze UW
Created:
2025-03-18