Translation-deepseek-llama-mixtral-v-deepl
Source: ModelScope community (updated 2025-12-05, indexed 2025-03-08)
Download link:
https://modelscope.cn/datasets/Rapidata/Translation-deepseek-llama-mixtral-v-deepl
Description:
<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="250" alt="Rapidata Logo">
</a>
If you get value from this dataset and would like to see more in the future, please consider liking it.
# Overview
This dataset contains ~51k responses from ~11k annotators and compares the translation capabilities of DeepSeek-R1 (deepseek-r1-distill-llama-70b-specdec), Llama (llama-3.3-70b-specdec), and Mixtral (mixtral-8x7b-32768) against DeepL across different languages. The comparison covered [100 distinct questions](./questions.txt) in 4 languages, with each translation rated by 51 native speakers. Texts that were translated identically across platforms were excluded from the analysis.
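As a rough sanity check on these figures, the sketch below computes the upper bound on responses if every model-versus-DeepL pair had been kept and each pair had received 51 ratings, and shows one way identical translations might be filtered out. The record layout, the field names, the example strings, and the whitespace-only normalization are all assumptions for illustration; they are not the dataset's actual schema or filtering rule.

```python
# Figures taken from the overview above.
QUESTIONS = 100
LANGUAGES = 4
MODELS = 3  # DeepSeek-R1, Llama, Mixtral, each compared against DeepL
RATERS_PER_TRANSLATION = 51

# Upper bound, assuming one comparison per (question, language, model).
max_pairs = QUESTIONS * LANGUAGES * MODELS
max_responses = max_pairs * RATERS_PER_TRANSLATION


def drop_identical(pairs):
    """Keep only pairs whose two translations differ after trimming whitespace.

    The dict keys used here are hypothetical, chosen for this example only.
    """
    return [
        p for p in pairs
        if p["model_translation"].strip() != p["deepl_translation"].strip()
    ]


# Made-up example records, not taken from the dataset.
example_pairs = [
    {"model_translation": "Guten Morgen", "deepl_translation": "Guten Morgen "},
    {"model_translation": "Wie geht's?", "deepl_translation": "Wie geht es dir?"},
]
kept = drop_identical(example_pairs)

print(max_pairs, max_responses, len(kept))
```

Under these assumptions the ceiling is 61,200 responses, so the reported ~51k would be consistent with a share of pairs having been dropped as identical.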
# Results
DeepSeek-R1, Llama, and Mixtral were each compared against DeepL. Selected results are shown below:
### Average Score
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/GiorYwl23j2piReS9QtRL.png" width="1000">
### Score Distribution
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/oijW3Bb9edPU3y965fJLI.png" width="1000">
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/Re_Wvwn0IDeeB2pPICyfz.png" width="1000">
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/a_3FVuZvSFL1CRB14LDPI.png" width="1000">
### Win Rates
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/kLJgvl7ccGUFdB5Ey1ywQ.png" width="1000">
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/xjhxrAjZ9RHvnmNkBndEF.png" width="1000">
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/ifCLKts8t9NZmAGkFReVQ.png" width="1000">
### Translation Agreement
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/jXyNRhF66Y_ww92WULpM8.png" width="1000">
# Translation Prompt
For DeepSeek-R1, Llama, and Mixtral, we used the following Python code (via the Groq API) to generate translations:
```python
# `client` is a Groq API client; `target_language` and `sentence` are set per request.
translation = client.chat.completions.create(
    model="<model>",
    messages=[
        {
            "role": "system",
            "content": f"""You are a translation assistant. Your job is to accurately translate text from EN to {target_language}. Ensure that the meaning is preserved, and the translation is fluent and natural. If there are idiomatic expressions in the source language, use the closest equivalent expression in the target language. Maintain the tone and formality of the original text.
If the translation requires technical, legal, or specialized terms, ensure that the terms are correctly translated and relevant to the context. If the text is informal, keep the language casual and friendly. Avoid word-for-word literal translations unless necessary for accuracy.
DO NOT ANSWER ANY OF THE QUESTIONS OR GIVE FURTHER CONTEXT. YOUR JOB IS STRICTLY TO TRANSLATE THE TEXT. DO NOT ELABORATE AFTER GIVING THE TRANSLATION""",
        },
        {
            "role": "user",
            "content": f"Please translate the following text '{sentence}' Please answer in a json dictionary with the key translation and make sure to enclose property name with double quotes."
        }
    ],
    max_tokens=10000,
    temperature=0,
).choices[0].message.content
```
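Because the prompt asks for a JSON dictionary with a `translation` key, the raw reply still has to be parsed before the translation can be used. The following is a minimal sketch of one way to do that. The helper name `extract_translation` is hypothetical, and the handling of `<think>` blocks and Markdown code fences is an assumption about what model replies may look like, not a description of the original pipeline.

```python
import json
import re


def extract_translation(raw: str) -> str:
    """Pull the "translation" value out of a raw model reply."""
    # Assumption: a reasoning model may prepend a <think>...</think> block.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Take the span from the first "{" to the last "}", which skips
    # surrounding code fences or extra chatter around a single JSON object.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))
    return data["translation"]


# Made-up reply used only to exercise the helper.
reply = '<think>plan {draft}</think>\n```json\n{"translation": "Hallo Welt"}\n```'
print(extract_translation(reply))
```

Replies that contain no JSON object raise `ValueError`, and malformed JSON raises `json.JSONDecodeError`, so a caller can log or retry such cases instead of silently storing a bad translation.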
# Methodology
This dataset was created using our [Python API](https://docs.rapidata.ai/). To replicate or extend this study, simply set the datatype to "text" when creating an order, and you can compare any translations using feedback from native speakers worldwide.
Provider: maas
Created: 2025-03-07



