
SEACrowd/unimorph_id

Source: Hugging Face. Updated: 2024-06-24. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/SEACrowd/unimorph_id
Description:
The Unimorph Id dataset is the Indonesian portion of the UniMorph project and is intended for morphological inflection. Because the original UniMorph parsing is sparse, the dataset is built from the raw source data instead. It can be loaded with either the `datasets` library or the `seacrowd` library. Versions: Source 1.0.0, SEACrowd 2024.06.20. License: Creative Commons Attribution Share Alike 3.0 (cc-by-sa-3.0).
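Loading through the `datasets` library can be sketched as follows. This is a minimal sketch, not the maintainers' reference snippet: the helper name `load_unimorph_id` is hypothetical, and passing `trust_remote_code=True` assumes the repository relies on a loading script and that the installed `datasets` release still accepts that flag (newer releases may not).

```python
# Hugging Face repository id, taken from the page title.
REPO_ID = "SEACrowd/unimorph_id"


def load_unimorph_id():
    """Load the dataset via the Hugging Face `datasets` library (hypothetical helper)."""
    # Imported lazily so this module can be imported without the third-party package.
    from datasets import load_dataset

    # Assumption: the repository ships a loading script, hence trust_remote_code=True.
    return load_dataset(REPO_ID, trust_remote_code=True)
```

Calling `load_unimorph_id()` needs network access and the `datasets` package installed; printing the returned object shows the available splits and columns, which this page does not document.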
Provider:
SEACrowd
Summary of original information

Dataset overview

Dataset name

Unimorph Id

Language

Indonesian (ind)

Supported tasks

Morphological Inflection

Dataset version

  • Source version: 1.0.0
  • SEACrowd version: 2024.06.20

Dataset license

Creative Commons Attribution Share Alike 3.0 (cc-by-sa-3.0)

Citation

If you use this dataset, please cite the following:

@inproceedings{pimentel-ryskina-etal-2021-sigmorphon,
  title = "SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages",
  author = "Pimentel, Tiago and Ryskina, Maria and Mielke, Sabrina J. and Wu, Shijie and
    Chodroff, Eleanor and Leonard, Brian and Nicolai, Garrett and Ghanggo Ate, Yustinus and
    Khalifa, Salam and Habash, Nizar and El-Khaissi, Charbel and Goldman, Omer and
    Gasser, Michael and Lane, William and Coler, Matt and Oncevay, Arturo and
    Montoya Samame, Jaime Rafael and Silva Villegas, Gema Celeste and Ek, Adam and
    Bernardy, Jean-Philippe and Shcherbakov, Andrey and Bayyr-ool, Aziyana and
    Sheifer, Karina and Ganieva, Sofya and Plugaryov, Matvey and Klyachko, Elena and
    Salehi, Ali and Krizhanovsky, Andrew and Krizhanovsky, Natalia and Vania, Clara and
    Ivanova, Sardana and Salchak, Aelita and Straughn, Christopher and Liu, Zoey and
    Washington, Jonathan North and Ataman, Duygu and Kiera{\'s}, Witold and
    Woli{\'n}ski, Marcin and Suhardijanto, Totok and Stoehr, Niklas and Nuriah, Zahroh and
    Ratan, Shyam and Tyers, Francis M. and Ponti, Edoardo M. and Aiton, Grant and
    Hatcher, Richard J. and Prudhommeaux, Emily and Kumar, Ritesh and Hulden, Mans and
    Barta, Botond and Lakatos, Dorina and Szolnok, G{\'a}bor and {\'A}cs, Judit and
    Raj, Mohit and Yarowsky, David and Cotterell, Ryan and Ambridge, Ben and
    Vylomova, Ekaterina",
  booktitle = "Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology",
  month = aug,
  year = "2021",
  address = "Online",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2021.sigmorphon-1.25",
  doi = "10.18653/v1/2021.sigmorphon-1.25",
  pages = "229--259"
}

@article{lovenia2024seacrowd,
  title={SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages},
  author={Holy Lovenia and Rahmad Mahendra and Salsabil Maulana Akbar and
    Lester James V. Miranda and Jennifer Santoso and Elyanah Aco and Akhdan Fadhilah and
    Jonibek Mansurov and Joseph Marvin Imperial and Onno P. Kampman and
    Joel Ruben Antony Moniz and Muhammad Ravi Shulthan Habibi and Frederikus Hudi and
    Railey Montalan and Ryan Ignatius and Joanito Agili Lopo and William Nixon and
    Börje F. Karlsson and James Jaya and Ryandito Diandaru and Yuze Gao and
    Patrick Amadeus and Bin Wang and Jan Christian Blaise Cruz and Chenxi Whitehouse and
    Ivan Halim Parmonangan and Maria Khelli and Wenyu Zhang and Lucky Susanto and
    Reynard Adha Ryanda and Sonny Lazuardi Hermawan and Dan John Velasco and
    Muhammad Dehan Al Kautsar and Willy Fitra Hendria and Yasmin Moslem and Noah Flynn and
    Muhammad Farid Adilazuarda and Haochen Li and Johanes Lee and R. Damanhuri and
    Shuo Sun and Muhammad Reza Qorib and Amirbek Djanibekov and Wei Qi Leong and
    Quyet V. Do and Niklas Muennighoff and Tanrada Pansuwan and Ilham Firdausi Putra and
    Yan Xu and Ngee Chia Tai and Ayu Purwarianti and Sebastian Ruder and William Tjhi and
    Peerat Limkonchotiwat and Alham Fikri Aji and Sedrick Keh and Genta Indra Winata and
    Ruochen Zhang and Fajri Koto and Zheng-Xin Yong and Samuel Cahyawijaya},
  year={2024},
  eprint={2406.10118},
  journal={arXiv preprint arXiv: 2406.10118}
}
