
liaad/translation_sample_lid

Source: Hugging Face. Updated 2024-01-15; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/liaad/translation_sample_lid
Resource description:
---
dataset_info:
- config_name: ai2_arc
  features:
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: choices
    sequence: string
  - name: choices_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 809
    num_examples: 1
  download_size: 11996
  dataset_size: 809
- config_name: boolq
  features:
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: passage
    dtype: string
  - name: passage_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 1386
    num_examples: 1
  download_size: 17972
  dataset_size: 1386
- config_name: gsm8k
  features:
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: answer
    dtype: string
  - name: answer_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 2297
    num_examples: 1
  download_size: 24008
  dataset_size: 2297
- config_name: mbpp
  features:
  - name: text
    dtype: string
  - name: text_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 382
    num_examples: 1
  download_size: 6927
  dataset_size: 382
- config_name: natural_questions_parsed
  features:
  - name: document
    dtype: string
  - name: document_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: candidates
    sequence: string
  - name: candidates_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: long_answer
    dtype: string
  - name: long_answer_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 5543
    num_examples: 1
  download_size: 47553
  dataset_size: 5543
- config_name: openbookqa
  features:
  - name: question_stem
    dtype: string
  - name: question_stem_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: choices
    sequence: string
  - name: choices_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: fact1
    dtype: string
  - name: fact1_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 920
    num_examples: 1
  download_size: 16942
  dataset_size: 920
- config_name: quac
  features:
  - name: background
    dtype: string
  - name: background_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: context
    dtype: string
  - name: context_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: questions
    sequence: string
  - name: questions_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: orig_answers
    sequence: string
  - name: orig_answers_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 11406
    num_examples: 1
  download_size: 85011
  dataset_size: 11406
- config_name: social_i_qa
  features:
  - name: context
    dtype: string
  - name: context_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: answerA
    dtype: string
  - name: answerA_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: answerB
    dtype: string
  - name: answerB_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: answerC
    dtype: string
  - name: answerC_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 797
    num_examples: 1
  download_size: 25730
  dataset_size: 797
- config_name: squad_v1_pt
  features:
  - name: context
    dtype: string
  - name: context_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: question
    dtype: string
  - name: question_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: answers
    sequence: string
  - name: answers_translated
    list:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 1659
    num_examples: 1
  download_size: 24226
  dataset_size: 1659
- config_name: winogrande
  features:
  - name: sentence
    dtype: string
  - name: sentence_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: option1
    dtype: string
  - name: option1_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  - name: option2
    dtype: string
  - name: option2_translated
    struct:
    - name: Helsinki-NLP/opus-mt-tc-big-en-pt
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: google_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
    - name: libre_translation
      struct:
      - name: prediction
        dtype: float64
      - name: text
        dtype: string
  splits:
  - name: train
    num_bytes: 749
    num_examples: 1
  download_size: 17465
  dataset_size: 749
configs:
- config_name: ai2_arc
  data_files:
  - split: train
    path: ai2_arc/train-*
- config_name: boolq
  data_files:
  - split: train
    path: boolq/train-*
- config_name: gsm8k
  data_files:
  - split: train
    path: gsm8k/train-*
- config_name: mbpp
  data_files:
  - split: train
    path: mbpp/train-*
- config_name: natural_questions_parsed
  data_files:
  - split: train
    path: natural_questions_parsed/train-*
- config_name: openbookqa
  data_files:
  - split: train
    path: openbookqa/train-*
- config_name: quac
  data_files:
  - split: train
    path: quac/train-*
- config_name: social_i_qa
  data_files:
  - split: train
    path: social_i_qa/train-*
- config_name: squad_v1_pt
  data_files:
  - split: train
    path: squad_v1_pt/train-*
- config_name: winogrande
  data_files:
  - split: train
    path: winogrande/train-*
---
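The metadata above names ten configs, each with a single `train` split stored under a `<config>/train-*` file pattern. A minimal Python sketch of how those names might be used follows. It assumes the Hugging Face `datasets` library is installed and the repository is reachable; the helper names `data_glob` and `load_config` are illustrative and do not come from the dataset card.

```python
# Config names copied from the metadata's `configs` block.
CONFIG_NAMES = [
    "ai2_arc", "boolq", "gsm8k", "mbpp", "natural_questions_parsed",
    "openbookqa", "quac", "social_i_qa", "squad_v1_pt", "winogrande",
]

REPO_ID = "liaad/translation_sample_lid"


def data_glob(config: str, split: str = "train") -> str:
    """Return the file pattern the metadata lists for a config and split."""
    return f"{config}/{split}-*"


def load_config(config: str):
    """Load one config's train split (needs network access and `datasets`)."""
    if config not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config!r}")
    # Imported lazily so the rest of the sketch runs without the library.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config, split="train")


if __name__ == "__main__":
    for name in CONFIG_NAMES:
        print(name, data_glob(name))
```

Calling `load_config("boolq")` would then return the one-example `train` split of that config, provided the download succeeds.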
Provider:
liaad
Summary of original information

Dataset overview

Dataset configurations

ai2_arc

  • Features:
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • choices: sequence of strings
    • choices_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 809 bytes, 1 example
  • Download size: 11996 bytes
  • Dataset size: 809 bytes
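
To make the nested layout concrete, here is a hedged Python sketch of a record shaped like the `ai2_arc` config, plus a helper that picks one translation per field. The card does not say what `prediction` measures, so the sketch assumes a higher value is better. The helper name `best_translation` and every value in `record` are invented for illustration.

```python
from typing import Dict, Tuple

# Model keys as they appear in the dataset card.
MODELS = (
    "Helsinki-NLP/opus-mt-tc-big-en-pt",
    "google_translation",
    "libre_translation",
)


def best_translation(field: Dict[str, dict]) -> Tuple[str, str]:
    """Return (model, text) for the entry with the highest prediction.

    Assumption: a higher `prediction` is better; the card does not say so.
    """
    model = max(MODELS, key=lambda m: field[m]["prediction"])
    return model, field[model]["text"]


# Hypothetical record shaped like the ai2_arc config; all values are invented.
record = {
    "question": "Which gas do plants absorb?",
    "question_translated": {  # struct: one entry per model
        "Helsinki-NLP/opus-mt-tc-big-en-pt": {"prediction": 0.97, "text": "Que gás as plantas absorvem?"},
        "google_translation": {"prediction": 0.99, "text": "Qual gás as plantas absorvem?"},
        "libre_translation": {"prediction": 0.91, "text": "Que gás plantas absorvem?"},
    },
    "choices": ["oxygen"],
    "choices_translated": [  # list: one struct per entry in `choices`
        {
            "Helsinki-NLP/opus-mt-tc-big-en-pt": {"prediction": 0.80, "text": "oxigénio"},
            "google_translation": {"prediction": 0.85, "text": "oxigênio"},
            "libre_translation": {"prediction": 0.70, "text": "oxigênio"},
        }
    ],
}

model, text = best_translation(record["question_translated"])
print(model, "->", text)

# List-typed fields are handled item by item.
per_choice = [best_translation(item) for item in record["choices_translated"]]
print(per_choice)
```

The same helper applies to any `*_translated` field in the other configs, since they all share the three model keys.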

boolq

  • Features:
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • passage: string
    • passage_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 1386 bytes, 1 example
  • Download size: 17972 bytes
  • Dataset size: 1386 bytes

gsm8k

  • Features:
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • answer: string
    • answer_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 2297 bytes, 1 example
  • Download size: 24008 bytes
  • Dataset size: 2297 bytes

mbpp

  • Features:
    • text: string
    • text_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 382 bytes, 1 example
  • Download size: 6927 bytes
  • Dataset size: 382 bytes

natural_questions_parsed

  • Features:
    • document: string
    • document_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • candidates: sequence of strings
    • candidates_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • long_answer: string
    • long_answer_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 5543 bytes, 1 example
  • Download size: 47553 bytes
  • Dataset size: 5543 bytes

openbookqa

  • Features:
    • question_stem: string
    • question_stem_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • choices: sequence of strings
    • choices_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • fact1: string
    • fact1_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 920 bytes, 1 example
  • Download size: 16942 bytes
  • Dataset size: 920 bytes

quac

  • Features:
    • background: string
    • background_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • context: string
    • context_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • questions: sequence of strings
    • questions_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • orig_answers: sequence of strings
    • orig_answers_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 11406 bytes, 1 example
  • Download size: 85011 bytes
  • Dataset size: 11406 bytes
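
The `quac` config pairs two sequence fields (`questions`, `orig_answers`) with list-typed translations. The Python sketch below lines up each source string with one model's translation. It assumes the translated list has one entry per source item, in the same order, which the card does not state explicitly. The helper name `pair_with_translations` and all sample values are invented.

```python
from typing import Dict, List, Tuple


def pair_with_translations(
    sources: List[str],
    translated: List[Dict[str, dict]],
    model: str,
) -> List[Tuple[str, str]]:
    """Zip source strings with one model's translated text.

    Assumption: `translated[i]` corresponds to `sources[i]`.
    """
    if len(sources) != len(translated):
        raise ValueError("source and translated lists differ in length")
    return [(src, item[model]["text"]) for src, item in zip(sources, translated)]


# Hypothetical values shaped like the quac `questions` fields.
questions = ["Where was she born?", "When did she retire?"]
questions_translated = [
    {
        "Helsinki-NLP/opus-mt-tc-big-en-pt": {"prediction": 0.96, "text": "Onde é que ela nasceu?"},
        "google_translation": {"prediction": 0.99, "text": "Onde ela nasceu?"},
        "libre_translation": {"prediction": 0.93, "text": "Onde ela nasceu?"},
    },
    {
        "Helsinki-NLP/opus-mt-tc-big-en-pt": {"prediction": 0.95, "text": "Quando é que ela se reformou?"},
        "google_translation": {"prediction": 0.98, "text": "Quando ela se aposentou?"},
        "libre_translation": {"prediction": 0.90, "text": "Quando ela se aposentou?"},
    },
]

pairs = pair_with_translations(questions, questions_translated, "google_translation")
for src, tgt in pairs:
    print(f"{src} -> {tgt}")
```

The length check is a design choice: failing loudly on a mismatch is safer than letting `zip` silently drop trailing items.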

social_i_qa

  • Features:
    • context: string
    • context_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • answerA: string
    • answerA_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • answerB: string
    • answerB_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • answerC: string
    • answerC_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 797 bytes, 1 example
  • Download size: 25730 bytes
  • Dataset size: 797 bytes

squad_v1_pt

  • Features:
    • context: string
    • context_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • question: string
    • question_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • answers: sequence of strings
    • answers_translated: list of results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 1659 bytes, 1 example
  • Download size: 24226 bytes
  • Dataset size: 1659 bytes

winogrande

  • Features:
    • sentence: string
    • sentence_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • option1: string
    • option1_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
    • option2: string
    • option2_translated: results from multiple translation models; each model's result holds a prediction (float64) and a text (string)
  • Splits:
    • train: 749 bytes, 1 example
  • Download size: 17465 bytes
  • Dataset size: 749 bytes
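
As a quick cross-check of the per-config figures listed above, the following Python sketch totals the dataset and download sizes. The numbers are copied from the listing; the variable names are illustrative.

```python
# (dataset_size, download_size) in bytes per config, copied from the listing.
SIZES = {
    "ai2_arc": (809, 11996),
    "boolq": (1386, 17972),
    "gsm8k": (2297, 24008),
    "mbpp": (382, 6927),
    "natural_questions_parsed": (5543, 47553),
    "openbookqa": (920, 16942),
    "quac": (11406, 85011),
    "social_i_qa": (797, 25730),
    "squad_v1_pt": (1659, 24226),
    "winogrande": (749, 17465),
}

total_dataset = sum(dataset for dataset, _ in SIZES.values())
total_download = sum(download for _, download in SIZES.values())
largest = max(SIZES, key=lambda name: SIZES[name][0])

print(total_dataset)   # 25948
print(total_download)  # 277830
print(largest)         # quac
```

Every config holds a single example, so the ten configs together amount to ten sample records.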