
BLM-OdI (Blackbird Language Matrices Object Drop verb alternations in Italian)

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/records/14011394
Resource description:
Description: BLM-OdI is an Object-Drop (OD) alternation dataset for testing lexical semantic properties of verbs, specifically whether or not they can enter a causative alternation. The subject in OD bears the same semantic role (Agent) in both the transitive and intransitive forms (L'artista dipingeva la finestra / L'artista dipingeva, 'the artist painted the window' / 'the artist painted'), and the verb does not have a causative meaning.

Blackbird Language Matrices (BLMs) are multiple-choice problems, where the input is a sequence of sentences built using specific generative rules, and the answer set consists of a correct answer that continues the input sequence and several incorrect contrastive options. The contrastive options are built by violating the underlying generating rules of the sentences. In a BLM matrix, all sentences share the targeted linguistic phenomenon (in this case verb alternations), but differ in other aspects relevant to the phenomenon in question.

BLM datasets also have a lexical variation dimension, to explore the impact of lexical variation on detecting relevant structures: type I has minimal lexical variation for sentences within an instance; type II has a one-word difference across the sentences within an instance; type III has maximal lexical variation within an instance.

The data comes grouped by lexical variation (i.e. type I/II/III), and each subset is split into train/test, with 2140 training and 240 testing instances.

Reference: If you use this dataset, please cite the following publication: Nastase, Vivi & Samo, Giuseppe & Jiang, Chunyang & Merlo, Paola. (2024). Exploring Italian sentence embeddings properties through multi-tasking. DOI: 10.48550/arXiv.2409.06622.
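The multiple-choice structure and the train/test grouping described above can be sketched in Python as follows. This is only an illustrative sketch: the `BLMInstance` class, its field names, and the `accuracy` helper are hypothetical and do not reflect the actual file format of the Zenodo record, which is not specified here. The total count assumes the stated 2140/240 sizes apply to each of the three lexical variation types. The toy instance reuses the example sentence pair from the description together with a placeholder string; it is not an actual dataset instance.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Sizes taken from the dataset description; assumed to apply per lexical type.
LEXICAL_TYPES = ("I", "II", "III")
TRAIN_PER_TYPE = 2140
TEST_PER_TYPE = 240


@dataclass(frozen=True)
class BLMInstance:
    """Hypothetical in-memory form of one BLM problem (not the file format)."""
    context: tuple[str, ...]     # input sequence of sentences
    candidates: tuple[str, ...]  # answer set: one correct, the rest contrastive
    correct_index: int           # position of the correct answer in candidates


def accuracy(instances: Sequence[BLMInstance],
             choose: Callable[[BLMInstance], int]) -> float:
    """Fraction of instances where choose() returns the correct index."""
    if not instances:
        return 0.0
    hits = sum(1 for inst in instances if choose(inst) == inst.correct_index)
    return hits / len(instances)


def total_instances() -> int:
    """Total across all lexical types, under the per-type assumption above."""
    return len(LEXICAL_TYPES) * (TRAIN_PER_TYPE + TEST_PER_TYPE)


# Toy placeholder instance, for illustration only.
toy = BLMInstance(
    context=("L'artista dipingeva la finestra",),
    candidates=("L'artista dipingeva", "<contrastive option>"),
    correct_index=0,
)

print(accuracy([toy], lambda inst: 0))  # 1.0
print(total_instances())                # 7140
```

Keeping the correct answer as an index into the candidate list, rather than as a separate field, makes it straightforward to score any model that returns a choice position.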
Created:
2024-10-30