Datasets for "ESNLIR: A Spanish Multi-Genre Dataset with Causal Relationships"

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/15002370

Resource description:

These are the datasets for the paper "ESNLIR: A Spanish Multi-Genre Dataset with Causal Relationships".

Dataset dictionary

This repository contains the splits that resulted from the research project "ESNLIR: A Spanish Multi-Genre Dataset with Causal Relationships". All the splits are in JSONL format and have the same fields per example:

sentence_1: First sentence of the pair.
sentence_2: Second sentence of the pair.
connector: Linking phrase used to extract the pair.
connector_type: NLI label, one of "contrasting", "entailment", "reasoning" or "neutral".
extraction_strategy: "linking_phrase" for "contrasting", "entailment" and "reasoning"; "none" for "neutral".
distance: How many sentences before the connector sentence_1 appears.
sentence_1_position: Sentence number of sentence_1 in the source document.
sentence_1_paragraph: Paragraph number of sentence_1 in the source document.
sentence_2_position: Sentence number of sentence_2 in the source document.
sentence_2_paragraph: Paragraph number of sentence_2 in the source document.
id: Unique identifier for the example.
dataset: Source corpus of the pair. Corpus metadata, including the source, can be found in dataset_metadata.xlsx.
genre: Writing genre of the dataset.
domain: Domain genre of the dataset.
Example:

{"sentence_1":"sefior Bcajavides no es moderado, tampoco lo convertirse e\u00f1 declarada divergencia de miras polileido en griego","sentence_2":"era mayor claricomentarios, as\u00ed de los peri\u00f3dicos como de los homes dado \u00e1 la voluntad de los hombres, sin que sobreticas","connector":"por consiguiente,","connector_type":"reasoning","extraction_strategy":"linking_phrase","distance":1.0,"sentence_1_paragraph":4,"sentence_1_position":86,"sentence_2_paragraph":4,"sentence_2_position":87,"id":"esnews__spanish_pd_news__531537","dataset":"esnews__spanish_pd_news","genre":"news","domain":"spanish_public_domain_news"}

Dataset files

ESNLIR_datasets.zip: Contains the splits used for BERT-based model training, validation and testing, including stress test splits.
labeled_final_dataset.jsonl: The final test dataset, with 974 examples selected because the human majority label matches the original linking phrase label.
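
The field dictionary above can be exercised with a short script. The following is a minimal sketch, not part of the original repository: it assumes a split has been downloaded locally, and the helper names (parse_jsonl, check_example, label_counts) are hypothetical. It parses JSONL lines, checks that connector_type and extraction_strategy agree with the rules stated in the dictionary, and counts examples per label.

```python
import json
from collections import Counter
from typing import Iterable, Iterator, List

# Labels listed in the dataset dictionary for connector_type.
LABELS = {"contrasting", "entailment", "reasoning", "neutral"}


def parse_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL source."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def check_example(example: dict) -> List[str]:
    """Return a list of problems found in one example (empty if none).

    Hypothetical helper: it only encodes the two rules stated in the
    dataset dictionary, nothing more.
    """
    problems = []
    label = example.get("connector_type")
    if label not in LABELS:
        problems.append(f"unexpected connector_type: {label!r}")
        return problems
    # Per the dictionary: "none" for neutral, "linking_phrase" otherwise.
    expected = "none" if label == "neutral" else "linking_phrase"
    strategy = example.get("extraction_strategy")
    if strategy != expected:
        problems.append(
            f"extraction_strategy {strategy!r} does not match label {label!r}"
        )
    return problems


def label_counts(examples: Iterable[dict]) -> Counter:
    """Count examples per connector_type."""
    return Counter(e["connector_type"] for e in examples)
```

To apply it to a downloaded file, one would open the file with encoding="utf-8" and pass the file object to parse_jsonl, for instance with labeled_final_dataset.jsonl; the local path is an assumption and depends on where the file was saved.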

Created:

2025-03-13