
BLM-CausI-Gen (Blackbird Language Matrices Causative/Inchoative Alternation in Italian)

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/records/14011546
Description:
BLM-CausI-Gen is an Italian dataset for learning the underlying rules of the causative/inchoative alternation in sentences, developed in the Blackbird Language Matrices (BLM) framework. It is a subset of the training data of https://www.idiap.ch/dataset/BLM-CausI.

In this task, an instance consists of a sequence of sentences with specific attributes. To predict the correct answer as the next element of the sequence, a model must detect the underlying set of generative rules used to produce the dataset. An instance represents the causative/inchoative alternation, in which the object of the transitive verb bears the same semantic role (Patient) as the subject of the intransitive verb. The transitive form of the verb has a causative meaning.

Blackbird Language Matrices (BLMs) are multiple-choice problems. The input is a sequence of sentences built using specific generating rules. The answer set consists of one correct answer that continues the input sequence and several incorrect contrastive options, built by violating the generating rules of the sentences. In a BLM matrix, all sentences share the targeted linguistic phenomenon (here, the causative/inchoative alternation) but differ in other aspects relevant to that phenomenon.

BLM-CausI-Gen is one of the six sub-tasks of the BLM-It challenge. All sub-tasks are instances of the general BLM task, but they differ along two dimensions: the linguistic problem defined (Agr, Caus, Od) and the lexical complexity of the data (II, III).

The data is grouped by lexical variation (type II or type III), and each subset is split into train and test. The statistics of the current iteration of the dataset (train:test) are:
type II: 80:2080
type III: 80:2080

Reference:
If you use this dataset, please cite the following publication:
Jiang, Chunyang & Samo, Giuseppe & Nastase, Vivi & Merlo, Paola. (2024). BLM-It - Blackbird Language Matrices for Italian: A CALAMITA Challenge. (TO APPEAR)
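The multiple-choice structure described above can be sketched in Python as follows. This is only an illustration: the class name `BLMInstance`, its field names, and the `accuracy` helper are assumptions, not the released file format or the official evaluation code, and the strings are placeholders rather than rows from the dataset.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class BLMInstance:
    """Hypothetical container for one BLM problem (field names are assumptions)."""
    context: List[str]     # input sequence of sentences built by the generating rules
    candidates: List[str]  # answer set: one correct answer plus contrastive options
    correct_index: int     # position of the correct answer within `candidates`


def accuracy(instances: Sequence[BLMInstance],
             predict: Callable[[BLMInstance], int]) -> float:
    """Return the fraction of instances for which `predict` picks the correct candidate."""
    if not instances:
        return 0.0
    hits = sum(1 for inst in instances if predict(inst) == inst.correct_index)
    return hits / len(instances)


# Placeholder instances, not real dataset rows.
toy = [
    BLMInstance(["s1", "s2", "s3"], ["a", "b", "c"], correct_index=1),
    BLMInstance(["s1", "s2", "s3"], ["a", "b", "c"], correct_index=0),
]


def always_first(inst: BLMInstance) -> int:
    """Trivial baseline that always selects the first candidate."""
    return 0


score = accuracy(toy, always_first)
```

A real model would replace `always_first` with a function that scores each candidate against the context and returns the index of the best one.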
Created:
2024-10-30