swap-uniba/EVWSD-ITA-eval
Source: Hugging Face. Updated 2026-03-27; indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/swap-uniba/EVWSD-ITA-eval
Description:
This is the evaluation set for the EVWSD-ITA EVALITA 2026 task.
Each instance contains a "query" (written manually following the procedure outlined on the main site) and a list of candidate images (paths into the images directory). The images are also provided as a zip file.
Some notes:
- Queries and images have been manually checked to increase the robustness of the data
- For synsets where the co-hyponyms plus the synsets sharing the same lemma numbered fewer than 9, we sampled additional candidates from other instances in the test set, so that each instance has 10 candidate images in total. Sampling used cosine similarity of vision embeddings to choose images that remain challenging to disambiguate from the target image
- The test set consists of 222 instances in total
- Baseline results, reported in the Baseline Results section, were obtained with a [Multilingual CLIP](https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1) model
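The similarity-based sampling mentioned in the notes can be pictured with a minimal sketch. It is not the authors' actual procedure: the function names, the `embeddings` dictionary, and the toy vectors are all hypothetical, and the only idea taken from the description is that missing candidates are filled with the images whose vision embeddings are most cosine-similar to the target.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def sample_hard_candidates(target_id, embeddings, existing, total=10):
    """Pad `existing` up to `total` candidates with the images most similar to the target.

    `embeddings` is a hypothetical mapping from image ID to its vision embedding.
    """
    needed = max(total - len(existing), 0)
    pool = [image_id for image_id in embeddings if image_id not in existing]
    # Most similar first, so the added candidates are hard to tell apart from the target.
    pool.sort(
        key=lambda image_id: cosine_similarity(embeddings[target_id], embeddings[image_id]),
        reverse=True,
    )
    return list(existing) + pool[:needed]

# Toy two-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.7, 0.7],
}
padded = sample_hard_candidates("a", toy_embeddings, ["a", "c"], total=3)
```

Here `"b"` is added rather than `"d"` because its embedding points in nearly the same direction as the target's.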
# UPDATE
The dataset has been updated following a warning regarding the quality of queries with compound words as lemmas (shuffled in a non-meaningful way). Please refer to the latest version of the dataset.
# Submission Guidelines
EVALITA 2026 has concluded. Submission guidelines have been removed. We have also uploaded an additional file to this repo (`ds_test_anon_with_labels.json`) which contains the labels for the instances.
# Dataset Format
Example:
{ "query": "patologia calcolo organo", "label": "1504", "candidates": ["1133.jpg", "850.jpg", "743.jpg", "1266.jpg", "1367.jpg", "948.jpg", "549.jpg", "695.jpg", "1504.jpg", "1148.jpg"]}
Fields:
- "query" contains the textual query constructed following the methodology explained in the paper;
- "label" contains the ID of the image that represents the query. Append ".jpg" to obtain the filename;
- "candidates" contains the list of candidate image filenames to rank with respect to the query.
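As a small illustration of these fields, the sketch below parses the example instance and turns its label into a filename. It uses only the standard `json` module; the variable names are illustrative, and how instances are stored on disk (one JSON object per line, a single array, and so on) is not assumed here.

```python
import json

# The example instance from this card, as a JSON string.
raw_instance = (
    '{ "query": "patologia calcolo organo", "label": "1504", '
    '"candidates": ["1133.jpg", "850.jpg", "743.jpg", "1266.jpg", "1367.jpg", '
    '"948.jpg", "549.jpg", "695.jpg", "1504.jpg", "1148.jpg"]}'
)

instance = json.loads(raw_instance)

# The label is an image ID; appending ".jpg" gives the filename.
label_file = instance["label"] + ".jpg"

# Position of the correct image within the candidate list (0-based).
gold_index = instance["candidates"].index(label_file)

print(instance["query"], label_file, gold_index)
```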
# Baseline Results
The baseline is a [Multilingual CLIP](https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1) model. We used the pre-trained model without further fine-tuning.
```
HIT@1: 0.4505
MRR: 0.6244
```
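For readers unfamiliar with the two metrics, here is a minimal sketch of how HIT@1 and MRR are conventionally computed from ranked candidate lists. The official evaluation script is not shown in this card, so the function names and toy rankings below are assumptions for illustration, not the organizers' code.

```python
def hit_at_1(rankings, gold_files):
    """Fraction of instances whose top-ranked candidate is the correct image."""
    hits = sum(1 for ranked, gold in zip(rankings, gold_files) if ranked[0] == gold)
    return hits / len(gold_files)

def mean_reciprocal_rank(rankings, gold_files):
    """Average of 1 / rank of the correct image, with ranks starting at 1."""
    total = 0.0
    for ranked, gold in zip(rankings, gold_files):
        total += 1.0 / (ranked.index(gold) + 1)
    return total / len(gold_files)

# Toy rankings (best candidate first), invented for illustration only.
toy_rankings = [
    ["1504.jpg", "850.jpg", "743.jpg"],  # correct image ranked first
    ["850.jpg", "1504.jpg", "743.jpg"],  # correct image ranked second
]
toy_gold = ["1504.jpg", "1504.jpg"]

toy_hit = hit_at_1(toy_rankings, toy_gold)
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_gold)
```

In this toy case the first instance contributes 1 to both metrics, while the second contributes 0 to HIT@1 and 1/2 to MRR.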
Provider:
swap-uniba



