
nlp-vtcc/codex-math-en

Hugging Face | Updated 2023-11-20 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/nlp-vtcc/codex-math-en
Resource description:
```py
import g4f
from copy import deepcopy
from datasets import load_dataset

translate_prompt = (
    "Translate the following python snippet code into Vietnamese language (tiếng Việt). "
    "Only translate the comments while preserving the name of functions, variables and other code. "
    "Your translations must convey all the content in the original text and cannot involve explanations or other unnecessary information. "
    "Please ensure that the translated text is natural for native speakers with correct grammar and proper word choices. "
    "Your translation must also use exact terminology to provide accurate information even for the experts in the related fields. "
    "Your output must only contain the code with translated comments and cannot include explanations or other information. "
    "NOTE: Only translate the comments and DO NOT translate the name of functions, variables, arguments and other code. "
    "Python code:\n"
)


def translate_response(example):
    reply = example["reply"]
    text = f"{translate_prompt}{reply}"
    success = False
    # try:
    response = g4f.ChatCompletion.create(
        model="gpt-3.5-turbo",
        provider=g4f.Provider.GPTalk,
        messages=[{"role": "user", "content": text}],
        stream=False,
    )
    success = True
    # except:
    #     response = text
    #     success = False
    #     print(f">>> Fail at {text}")

    new_example = deepcopy(example)
    new_example["reply"] = response
    new_example["success"] = success
    return new_example


## USAGE
dataset = load_dataset("json", data_files="codex00", split="train")
example = dataset[32]
new_example = translate_response(example)
print(new_example)
```
Provider:
nlp-vtcc
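Summary of original information

Dataset overview

Dataset loading

  • Loading method: the dataset is loaded from a "json" file with the load_dataset function.
  • File name: "codex00"
  • Split: train (split="train")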
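
The layout of codex00 is not documented on this page. As a rough sketch, assuming it is a JSON Lines file in which each record carries a `reply` string (the only field the page's code reads), the snippet below writes a tiny stand-in file and reads it back. It deliberately uses the standard-library `json` module rather than `load_dataset` so that it runs without extra dependencies; the sample records are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records; only the "reply" field is confirmed by the page's code.
records = [
    {"reply": "# add two numbers\nprint(1 + 2)"},
    {"reply": "# square a number\nprint(3 ** 2)"},
]

with tempfile.TemporaryDirectory() as tmp:
    # Assumed file name, taken from the page; the JSON Lines layout is a guess.
    path = Path(tmp) / "codex00"
    # Write one JSON object per line.
    path.write_text(
        "\n".join(json.dumps(r) for r in records) + "\n", encoding="utf-8"
    )
    # Read the file back, one record per non-empty line.
    loaded = [
        json.loads(line)
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]

print(len(loaded), sorted(loaded[0]))
```

If the real file is a single JSON array instead, the reading step would differ, although `load_dataset("json", ...)` accepts both layouts.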

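Data processing function

  • Function name: translate_response
  • Purpose: translate the comments in Python code into Vietnamese (tiếng Việt), leaving function names, variables, and other code unchanged.
  • Input: one example from the dataset (example).
  • Output: a copy of the example (new_example) whose reply field holds the code with translated comments, plus a success field.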

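Steps

  1. Build the request text: prepend the translation prompt (translate_prompt) to the example's reply field. The whole reply is sent, code included; the prompt instructs the model to translate only the comments.
  2. Call the translation API: send the text as a single user message via g4f.ChatCompletion.create (model "gpt-3.5-turbo", provider g4f.Provider.GPTalk, stream=False).
  3. Handle the response: copy the example, overwrite its reply field with the model's response, and add a success flag. Because the try/except is commented out in the published code, success is always True on return, and a failed call raises an exception instead.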
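
The three steps above can be sketched without network access by passing the translator in as a function. This is a minimal illustration, not the dataset authors' implementation: `translate_example`, `fake_translate`, `failing_translate`, and the shortened `TRANSLATE_PROMPT` are hypothetical names introduced here. It also enables the error handling that is commented out in the published code, with one labeled difference: on failure it keeps the original reply rather than storing the prompt text.

```python
from copy import deepcopy
from typing import Callable

# Shortened, hypothetical stand-in for the page's translate_prompt.
TRANSLATE_PROMPT = (
    "Translate only the comments in the following Python code into Vietnamese. "
    "Python code:\n"
)


def translate_example(example: dict, translate_fn: Callable[[str], str]) -> dict:
    """Return a copy of `example` whose "reply" holds the translator's output."""
    # Step 1: prepend the instruction prompt to the whole reply, code included.
    text = f"{TRANSLATE_PROMPT}{example['reply']}"
    # Step 2: call the translator; any exception counts as a failed translation.
    try:
        response = translate_fn(text)
        success = True
    except Exception:
        # Assumption: keep the original reply on failure (the page's
        # commented-out handler stores the prompt text instead).
        response = example["reply"]
        success = False
    # Step 3: store the result on a copy so the input record is untouched.
    new_example = deepcopy(example)
    new_example["reply"] = response
    new_example["success"] = success
    return new_example


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for the g4f call: echoes the code after the prompt."""
    return text[len(TRANSLATE_PROMPT):]


def failing_translate(text: str) -> str:
    """Hypothetical translator that always fails, to exercise the fallback."""
    raise RuntimeError("provider unavailable")


sample = {"reply": "# add two numbers\nprint(1 + 2)"}
ok = translate_example(sample, fake_translate)
failed = translate_example(sample, failing_translate)
print(ok["success"], failed["success"])
```

Injecting `translate_fn` keeps the record handling testable on its own; in real use it would wrap the `g4f.ChatCompletion.create` call shown in the resource description.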

Example code

    dataset = load_dataset("json", data_files="codex00", split="train")
    example = dataset[32]
    new_example = translate_response(example)
    print(new_example)
