sts17-crosslingual-sts

Name: sts17-crosslingual-sts
Creator: maas
Published: 2025-11-14 21:09:33
License: No description available

ModelScope community, updated 2025-11-14, indexed 2025-05-10

Download link:

https://modelscope.cn/datasets/MTEB/sts17-crosslingual-sts

Resource description:

<div align="center" style="padding: 40px 20px; background-color: white; border-radius: 12px; box-shadow: 0 2px 10px rgba(0, 0, 0, 0.05); max-width: 600px; margin: 0 auto;">
  <h1 style="font-size: 3.5rem; color: #1a1a1a; margin: 0 0 20px 0; letter-spacing: 2px; font-weight: 700;">STS17</h1>
  <div style="font-size: 1.5rem; color: #4a4a4a; margin-bottom: 5px; font-weight: 300;">An <a href="https://github.com/embeddings-benchmark/mteb" style="color: #2c5282; font-weight: 600; text-decoration: none;" onmouseover="this.style.textDecoration='underline'" onmouseout="this.style.textDecoration='none'">MTEB</a> dataset</div>
  <div style="font-size: 0.9rem; color: #2c5282; margin-top: 10px;">Massive Text Embedding Benchmark</div>
</div>

SemEval-2017 Task 1: Semantic textual similarity, multilingual and cross-lingual focused evaluation

|               |                                             |
|---------------|---------------------------------------------|
| Task category | t2t                                         |
| Domains       | News, Web, Written                          |
| Reference     | https://alt.qcri.org/semeval2017/task1/     |

## How to evaluate on this task

You can evaluate an embedding model on this dataset using the following code:

```python
import mteb

# Select the task by name; the keyword avoids passing the name list as another filter
task = mteb.get_tasks(tasks=["STS17"])
evaluator = mteb.MTEB(task)

# YOUR_MODEL is a placeholder for the name of the model to evaluate
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)
```

To learn more about how to run models on `mteb` tasks, check out the [GitHub repository](https://github.com/embeddings-benchmark/mteb).

## Citation

If you use this dataset, please cite the dataset as well as [mteb](https://github.com/embeddings-benchmark/mteb), as this dataset likely includes additional processing as a part of the [MMTEB Contribution](https://github.com/embeddings-benchmark/mteb/tree/main/docs/mmteb).

```bibtex
@inproceedings{cer-etal-2017-semeval,
  abstract = {Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in \textit{all language tracks}. We summarize performance and review a selection of well performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the \textit{STS Benchmark} is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).},
  address = {Vancouver, Canada},
  author = {Cer, Daniel and Diab, Mona and Agirre, Eneko and Lopez-Gazpio, I{\~n}igo and Specia, Lucia},
  booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation ({S}em{E}val-2017)},
  doi = {10.18653/v1/S17-2001},
  editor = {Bethard, Steven and Carpuat, Marine and Apidianaki, Marianna and Mohammad, Saif M. and Cer, Daniel and Jurgens, David},
  month = aug,
  pages = {1--14},
  publisher = {Association for Computational Linguistics},
  title = {{S}em{E}val-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation},
  url = {https://aclanthology.org/S17-2001},
  year = {2017},
}

@article{enevoldsen2025mmtebmassivemultilingualtext,
  title={MMTEB: Massive Multilingual Text Embedding Benchmark},
  author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2502.13595},
  year={2025},
  url={https://arxiv.org/abs/2502.13595},
  doi = {10.48550/arXiv.2502.13595},
}

@article{muennighoff2022mteb,
  author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo{\"\i}c and Reimers, Nils},
  title = {MTEB: Massive Text Embedding Benchmark},
  publisher = {arXiv},
  journal={arXiv preprint arXiv:2210.07316},
  year = {2022},
  url = {https://arxiv.org/abs/2210.07316},
  doi = {10.48550/ARXIV.2210.07316},
}
```

# Dataset Statistics

<details>
  <summary> Dataset Statistics</summary>

The following JSON contains the descriptive statistics for the task. These can also be obtained using:

```python
import mteb

task = mteb.get_task("STS17")

desc_stats = task.metadata.descriptive_stats
```

```json
{
    "test": {
        "num_samples": 5346,
        "number_of_characters": 400264,
        "min_sentence1_length": 6,
        "average_sentence1_len": 38.14665170220726,
        "max_sentence1_length": 976,
        "unique_sentence1": 4900,
        "min_sentence2_length": 6,
        "average_sentence2_len": 36.72502805836139,
        "max_sentence2_length": 1007,
        "unique_sentence2": 4470,
        "min_score": 0.0,
        "avg_score": 2.3554804214989464,
        "max_score": 5.0
    }
}
```

</details>

---

*This dataset card was automatically generated using [MTEB](https://github.com/embeddings-benchmark/mteb)*
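The descriptive statistics reported in the card can be cross-checked without downloading the dataset. The sketch below is illustrative only: it copies a subset of the `test` split values from the card into a string, and `summarize` is a hypothetical helper name, not part of `mteb`. It checks that the two per-sentence average lengths reproduce the reported total character count, then derives the ratio of distinct sentences to samples and the mean gold score rescaled to the 0-1 range.

```python
import json

# Values copied from the "test" split statistics in the dataset card.
STATS_JSON = """
{
  "test": {
    "num_samples": 5346,
    "number_of_characters": 400264,
    "average_sentence1_len": 38.14665170220726,
    "average_sentence2_len": 36.72502805836139,
    "unique_sentence1": 4900,
    "unique_sentence2": 4470,
    "avg_score": 2.3554804214989464,
    "max_score": 5.0
  }
}
"""


def summarize(stats: dict) -> dict:
    """Derive a few ratios from one split's statistics (hypothetical helper)."""
    n = stats["num_samples"]
    return {
        # Total characters implied by the two per-sentence averages.
        "implied_characters": round(
            n * (stats["average_sentence1_len"] + stats["average_sentence2_len"])
        ),
        # Number of distinct sentence values divided by the number of pairs.
        "unique_sentence1_ratio": stats["unique_sentence1"] / n,
        "unique_sentence2_ratio": stats["unique_sentence2"] / n,
        # Mean gold score divided by the maximum score (min_score is 0.0).
        "normalized_avg_score": stats["avg_score"] / stats["max_score"],
    }


test_stats = json.loads(STATS_JSON)["test"]
summary = summarize(test_stats)

# The averages should reproduce the reported character count.
assert summary["implied_characters"] == test_stats["number_of_characters"]

for name, value in summary.items():
    print(f"{name}: {value}")
```

Both ratios fall below 1, which means some sentences recur across pairs. One possible explanation is that the task's language-pair subsets share source sentences, but the card itself does not say so.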


Provider:

maas

Created:

2024-09-06

Collection summary

Dataset introduction