sapienzanlp/wic

Name: sapienzanlp/wic
Creator: sapienzanlp
Published: 2024-09-22 18:25:55
License: No description available

Hugging Face | Updated 2024-09-22 | Indexed 2025-04-12

Download link:

https://hf-mirror.com/datasets/sapienzanlp/wic

Resource description:

---
dataset_info:
  features:
  - name: lemma
    dtype: string
  - name: sentence1
    dtype: string
  - name: sentence2
    dtype: string
  - name: start1
    dtype: int64
  - name: end1
    dtype: int64
  - name: start2
    dtype: int64
  - name: end2
    dtype: int64
  - name: label
    dtype: int64
  splits:
  - name: train
    num_bytes: 1128581
    num_examples: 2805
  - name: validation
    num_bytes: 198885
    num_examples: 500
  - name: test
    num_bytes: 199696
    num_examples: 500
  download_size: 1012507
  dataset_size: 1527162
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
---

# Word in Context (WIC)

Original Paper: https://wic-ita.github.io/

This dataset comes from EVALITA-2023.

The Word in Context task consists of establishing whether a word *w* occurring in two different sentences *s1* and *s2* has the same meaning in both. We repurpose this task to test generative LLMs, defining a prompting strategy that compares the perplexities of the possible continuations to gauge each model's capabilities.

## Example

Here you can see the structure of a single sample in the dataset. The released files also carry the integer fields `start1`, `end1`, `start2`, and `end2` listed in the metadata above.

```json
{
  "sentence1": string, # text of sentence 1
  "sentence2": string, # text of sentence 2
  "lemma": string,     # text of the word present in both sentences
  "label": int,        # 0: Different Meaning, 1: Same Meaning
}
```

## Statistics

| WIC | 0 | 1 |
| :--------: | :----: | :----: |
| Training | 806 | 1999 |
| Validation | 250 | 250 |
| Test | 250 | 250 |

## Proposed Prompts

Here we describe the prompts given to the model, over which we compute the perplexity score; the model's answer is the prompt with the lower perplexity. Moreover, for each subtask we define a description that is prepended to the prompts so that the model can understand the task.

Description of the task: "Date due frasi, che contengono un lemma in comune, indica se tale lemma ha lo stesso significato in entrambe le frasi.\n\n"

### Cloze Style:

Label (**Different Meaning**): "Frase 1: {{sentence1}}\nFrase 2: {{sentence2}}\nLa parola '{{lemma}}' nelle due frasi precedenti ha un significato differente tra le due frasi"

Label (**Same Meaning**): "Frase 1: {{sentence1}}\nFrase 2: {{sentence2}}\nLa parola '{{lemma}}' nelle due frasi precedenti ha lo stesso significato in entrambe le frasi"

### MCQA Style:

```txt
Frase 1: {{sentence1}}\nFrase 2: {{sentence2}}\nDomanda: La parola \"{{lemma}}\" ha lo stesso signicato nelle due frasi precedenti? Rispondi sì o no:
```

## Results

The following results were obtained with Cloze-style prompting on several English and Italian-adapted LLMs.

| WIC | Accuracy (5-shot) |
| :-----: | :--: |
| Gemma-2B | 48.2 |
| QWEN2-1.5B | 50.4 |
| Mistral-7B | 53.4 |
| ZEFIRO | 54.6 |
| Llama-3-8B | 54.6 |
| Llama-3-8B-IT | 62.8 |
| ANITA | 69.2 |

## Acknowledgements

We would like to thank the authors of this resource for publicly releasing such an intriguing benchmark.

Additionally, we extend our gratitude to the students of the [MNLP-2024 course](https://naviglinlp.blogspot.com/), whose first homework explored various interesting prompting strategies.

The original dataset is freely available for download at [this link](https://github.com/wic-ita/data).

## License

Original data license not found.
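The cloze-style selection rule from the Proposed Prompts section can be sketched in Python as follows. This is a minimal illustration, not the authors' evaluation code: the task description and templates are copied from the card (rewritten with Python `str.format` placeholders instead of `{{...}}`), while `perplexity`, `build_prompts`, `predict`, and the `score` callable are hypothetical names introduced here. How a real model turns a prompt into token log-probabilities is left to the caller, and the 5-shot demonstrations behind the reported results are not reproduced.

```python
import math
from typing import Callable, Dict, Sequence

# Task description and cloze templates copied from the dataset card.
DESCRIPTION = (
    "Date due frasi, che contengono un lemma in comune, indica se tale lemma "
    "ha lo stesso significato in entrambe le frasi.\n\n"
)
CLOZE_TEMPLATES = {
    0: (  # Different Meaning
        "Frase 1: {sentence1}\nFrase 2: {sentence2}\n"
        "La parola '{lemma}' nelle due frasi precedenti ha un significato "
        "differente tra le due frasi"
    ),
    1: (  # Same Meaning
        "Frase 1: {sentence1}\nFrase 2: {sentence2}\n"
        "La parola '{lemma}' nelle due frasi precedenti ha lo stesso "
        "significato in entrambe le frasi"
    ),
}


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log token probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def build_prompts(sample: Dict[str, object]) -> Dict[int, str]:
    """Fill both cloze templates for one sample, with the description prepended."""
    return {
        label: DESCRIPTION + template.format(**sample)
        for label, template in CLOZE_TEMPLATES.items()
    }


def predict(sample: Dict[str, object], score: Callable[[str], float]) -> int:
    """Return the label whose filled prompt gets the lower score (perplexity)."""
    prompts = build_prompts(sample)
    return min(prompts, key=lambda label: score(prompts[label]))


# Toy sample and toy scorer, both invented for illustration only.
toy_sample = {"lemma": "x", "sentence1": "a x", "sentence2": "b x"}
toy_score = lambda text: 1.0 if "differente" in text else 2.0
print(predict(toy_sample, toy_score))  # prints 0
```

In practice `score` would wrap a language model: tokenize the prompt, collect the per-token log-probabilities, and pass them to `perplexity`. Keeping the scorer as a plain callable lets the selection logic be checked without loading a model.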

Provided by:

sapienzanlp
