
CLDF dataset of the Enggano word list from 1895 in Stokhof and Almanar's (1987) Holle List

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/8038974
Resource description:
This is the repository for the digitised Enggano word list from 1895 (see Stokhof and Almanar 1987 for the original source), which has been matched with the digitised Holle List (Rajeg 2023a; cf. Stokhof 1980) to provide English and Indonesian glosses for the Enggano forms. The data set conforms to the Wordlist module of the Cross-Linguistic Data Format (CLDF) (Forkel et al. 2018).

The work in this repository is part of the AHRC-funded research on Lexical resources for Enggano, a threatened language of Indonesia (see the central webpage of the Enggano research, the specific repository of the Lexical Resources for Enggano project, and the main Enggano repository on the University of Oxford's Sustainable Digital Scholarship (SDS)).

Updates in version 2.0.0

The following items summarise the major updates in version 2.0.0:

- Adding MediaTable to accommodate images in/for note ID <26> (commits dab9540 & a004003).
- Splitting multiple forms in a cell into their own rows, both for the original list and the forms in the Notes (commits 39cdc66 & a004003).
- Orthography transliteration into Enggano's common orthography and IPA (across several commits and closed issues #1, #3, #4, #5 and #7). The repository code covers retrieving and editing the existing orthography profile, and running the transliteration with the qlcData R package (Moran & Cysouw 2018; Cysouw 2024).

In the FormTable, the Form column contains the Enggano forms in their common orthography. The Value column contains their original transcription/orthography, with the tokenised/segmented format available under the Graphemes column. The Segments column contains the segmented IPA transliteration of the Enggano forms (cf. #6).

The Comment column is derived from the contents of the Notes. It includes, if any, Enggano forms in their original transcription followed by their segmented/tokenised forms in IPA in square brackets, their glosses in English (EN) and/or Indonesian (ID) inside the bracket, and finally the ID of the Notes in the original document inside angle brackets.

The English and Indonesian columns contain the English and Indonesian glosses, respectively, from the master/main Holle List (Stokhof 1980) that has been digitised (Rajeg 2023a).

The output files of the orthography profiling and transliteration (commit 2aab3ab) are available in data-raw, with file names prefixed with ortho-...

References

Cysouw, Michael. 2024. qlcData: Processing Data for Quantitative Language Comparison. Version 0.3. https://cran.r-project.org/web/packages/qlcData/index.html. (25 December, 2024).

Forkel, Robert, Johann-Mattis List, Simon J. Greenhill, Christoph Rzymski, Sebastian Bank, Michael Cysouw, Harald Hammarström, Martin Haspelmath, Gereon A. Kaiping & Russell D. Gray. 2018. Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics. Scientific Data. Nature Publishing Group 5(1). 180205. https://doi.org/10.1038/sdata.2018.205.

Moran, Steven & Michael Cysouw. 2018. The Unicode cookbook for linguists: Managing writing systems using orthography profiles (Translation and Multilingual Natural Language Processing 10). Berlin: Language Science Press. https://doi.org/10.5281/zenodo.1296780.

Rajeg, Gede Primahadi Wijaya. 2023a. Digitised, Searchable Holle List in Stokhof (1980) [Data set]. (1.3.0). Zenodo. https://doi.org/10.5281/ZENODO.7972273. https://engganolang.github.io/digitised-holle-list/. https://ora.ox.ac.uk/objects/uuid:a511951b-86fb-4019-94d4-280efa83de02

Rajeg, Gede Primahadi Wijaya. 2023b. CLDF dataset of the Enggano word list from 1895 in Stokhof and Almanar's (1987) Holle List [Data set]. https://github.com/engganolang/holle-list-enggano-1895 https://doi.org/10.25446/oxford.23515788

Stokhof, W. A. L., ed. 1980. Holle Lists, Vocabularies in Languages of Indonesia, Vol. 1: Introductory Volume. Vol. Materials in Languages of Indonesia. Canberra, A.C.T., Australia: Dept. of Linguistics, Research School of Pacific Studies, The Australian National University. https://core.ac.uk/reader/159464813.

Stokhof, W. A. L., and Alma E. Almanar. 1987. Holle Lists, Vocabularies in Languages of Indonesia, Vol. 10/3: Islands Off the West Coast of Sumatra. Vol. Materials in Languages of Indonesia. Pacific Linguistics (Series d) 76. Canberra, A.C.T., Australia: Dept. of Linguistics, Research School of Pacific Studies, The Australian National University. http://hdl.handle.net/1885/144589.
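As a rough illustration of how the FormTable columns described in the resource description relate to one another, the following Python sketch reads a small inline CSV with the same column names (Form, Value, Graphemes, Segments, Comment). The rows are invented placeholders, not real Enggano data. The ID column, the space-separated layout of Graphemes and Segments, and the helper name read_forms are assumptions made for this sketch rather than details confirmed by the dataset.

```python
import csv
import io

# Invented placeholder rows, NOT real Enggano data. The real values live in
# the dataset's FormTable; the exact column set there may differ.
SAMPLE_FORMS_CSV = """ID,Form,Value,Graphemes,Segments,Comment
f1,ab,AB,A B,a b,
f2,cde,CDE,C D E,c d e,placeholder note <1>
"""


def read_forms(handle):
    """Read FormTable-like rows from an open CSV handle (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "id": row["ID"],
            # Form: the common-orthography spelling
            "form": row["Form"],
            # Value: the original transcription from the 1895 list
            "value": row["Value"],
            # Graphemes: tokenised original transcription (assumed space-separated)
            "graphemes": row["Graphemes"].split(),
            # Segments: segmented IPA transliteration (assumed space-separated)
            "segments": row["Segments"].split(),
            # Comment: derived from the Notes; empty cells become None
            "comment": row["Comment"] or None,
        })
    return rows


forms = read_forms(io.StringIO(SAMPLE_FORMS_CSV))
for entry in forms:
    print(entry["id"], entry["form"], entry["value"], entry["segments"])
```

To try this against the actual data, the inline string could be swapped for an open file handle on the dataset's forms table, after checking the real column names and separators in the CLDF metadata.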
Created:
2025-04-04