
anotherdev/testing-datasets

Source: Hugging Face | Updated: 2024-01-24 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/anotherdev/testing-datasets
Resource description:
---
license:
- other
pretty_name: >-
  testing datasets in a sandbox this is not a real dataset it is sandbox for testing
size_categories:
- 0<n<1k
tags:
- other
# supported task_categories
# text-classification, token-classification, table-question-answering, question-answering, zero-shot-classification, translation, summarization, conversational, feature-extraction, text-generation, text2text-generation, fill-mask, sentence-similarity, text-to-speech, text-to-audio, automatic-speech-recognition, audio-to-audio, audio-classification, voice-activity-detection, depth-estimation, image-classification, object-detection, image-segmentation, text-to-image, image-to-text, image-to-image, image-to-video, unconditional-image-generation, video-classification, reinforcement-learning, robotics, tabular-classification, tabular-regression, tabular-to-text, table-to-text, multiple-choice, text-retrieval, time-series-forecasting, text-to-video, visual-question-answering, document-question-answering, zero-shot-image-classification, graph-ml, mask-generation, zero-shot-object-detection, text-to-3d, image-to-3d, other
task_categories:
- other
# supported task_ids
# acceptability-classification, entity-linking-classification, fact-checking, intent-classification, language-identification, multi-class-classification, multi-label-classification, multi-input-text-classification, natural-language-inference, semantic-similarity-classification, sentiment-classification, topic-classification, semantic-similarity-scoring, sentiment-scoring, sentiment-analysis, hate-speech-detection, text-scoring, named-entity-recognition, part-of-speech, parsing, lemmatization, word-sense-disambiguation, coreference-resolution, extractive-qa, open-domain-qa, closed-domain-qa, news-articles-summarization, news-articles-headline-generation, dialogue-generation, dialogue-modeling, language-modeling, text-simplification, explanation-generation, abstractive-qa, open-domain-abstractive-qa, closed-domain-qa, open-book-qa, closed-book-qa, slot-filling, masked-language-modeling, keyword-spotting, speaker-identification, audio-intent-classification, audio-emotion-recognition, audio-language-identification, multi-label-image-classification, multi-class-image-classification, face-detection, vehicle-detection, instance-segmentation, semantic-segmentation, panoptic-segmentation, image-captioning, image-inpainting, image-colorization, super-resolution, grasping, task-planning, tabular-multi-class-classification, tabular-multi-label-classification, tabular-single-column-regression, rdf-to-text, multiple-choice-qa, multiple-choice-coreference-resolution, document-retrieval, utterance-retrieval, entity-linking-retrieval, fact-checking-retrieval, univariate-time-series-forecasting, multivariate-time-series-forecasting, visual-question-answering, document-question-answering
task_ids:
- parsing
dataset_info:
- config_name: audio_class
  features:
  - name: file_path
    dtype: string
  - name: audio_path
    dtype: string
  - name: lang
    dtype: string
  - name: dbytes_len
    dtype: int64
  - name: dbytes
    dtype: binary
  splits:
  - name: audio_class
  # download_size: 1
  # dataset_size: 1
- config_name: audio_base
  features:
  - name: file_path
    dtype: string
  - name: audio_path
    dtype: string
  - name: lang
    dtype: string
  - name: dbytes_len
    dtype: int64
  - name: dbytes
    dtype: binary
  splits:
  - name: audio_base
  # download_size: 1
  # dataset_size: 1
- config_name: audio_import
  features:
  - name: file_path
    dtype: string
  - name: audio_path
    dtype: string
  - name: lang
    dtype: string
  - name: dbytes_len
    dtype: int64
  - name: dbytes
    dtype: binary
  splits:
  - name: audio_import
  # download_size: 1
  # dataset_size: 1
- config_name: audio_function
  features:
  - name: file_path
    dtype: string
  - name: audio_path
    dtype: string
  - name: lang
    dtype: string
  - name: dbytes_len
    dtype: int64
  - name: dbytes
    dtype: binary
  splits:
  - name: audio_function
  # download_size: 1
  # dataset_size: 1
- config_name: image_base
  features:
  - name: filename
    dtype: string
  - name: repo
    dtype: string
  - name: path
    dtype: string
  - name: dbytes
    dtype: binary
  - name: dbytes_len
    dtype: int64
  - name: dbytes_mb
    dtype: string
  - name: type
    dtype: string
  splits:
  - name: image_base
  # download_size: 1
  # dataset_size: 1
- config_name: image_import
  features:
  - name: filename
    dtype: string
  - name: repo
    dtype: string
  - name: path
    dtype: string
  - name: dbytes
    dtype: binary
  - name: dbytes_len
    dtype: int64
  - name: dbytes_mb
    dtype: string
  - name: type
    dtype: string
  splits:
  - name: image_import
  # download_size: 1
  # dataset_size: 1
- config_name: image_function
  features:
  - name: filename
    dtype: string
  - name: repo
    dtype: string
  - name: path
    dtype: string
  - name: dbytes
    dtype: binary
  - name: dbytes_len
    dtype: int64
  - name: dbytes_mb
    dtype: string
  - name: type
    dtype: string
  splits:
  - name: image_function
  # download_size: 1
  # dataset_size: 1
- config_name: image_class
  features:
  - name: filename
    dtype: string
  - name: repo
    dtype: string
  - name: path
    dtype: string
  - name: dbytes
    dtype: binary
  - name: dbytes_len
    dtype: int64
  - name: dbytes_mb
    dtype: string
  - name: type
    dtype: string
  splits:
  - name: image_class
  # download_size: 1
  # dataset_size: 1
- config_name: text_instruct
  # features:
  #- name: filename
  #  dtype: string
  #- name: repo
  #  dtype: string
  #- name: path
  #  dtype: string
  #- name: dbytes
  #  dtype: binary
  #- name: dbytes_len
  #  dtype: int64
  #- name: dbytes_mb
  #  dtype: string
  #- name: type
  #  dtype: string
  splits:
  - name: text_instruct
  # download_size: 1
  # dataset_size: 1
- config_name: text_python
  splits:
  - name: text_python_ai_research
  - name: text_python_many_repos
  # download_size: 1
  # dataset_size: 1
configs:
- config_name: audio_class
  data_files:
  - split: audio_class
    path: files/audio/test-audio-class.parquet
- config_name: audio_base
  data_files:
  - split: audio_base
    path: files/audio/test-audio-base.parquet
- config_name: audio_import
  data_files:
  - split: audio_import
    path: files/audio/test-audio-import.parquet
- config_name: audio_function
  data_files:
  - split: audio_function
    path: files/audio/test-audio-function.parquet
- config_name: image_base
  data_files:
  - split: image_base
    path: files/image/test-image-base.parquet
- config_name: image_import
  data_files:
  - split: image_import
    path: files/image/test-image-import.parquet
- config_name: image-function
  data_files:
  - split: image_function
    path: files/image/test-image-function.parquet
- config_name: image-class
  data_files:
  - split: image_class
    path: files/image/test-image-class.parquet
- config_name: text_instruct
  data_files:
  - split: text_instruct
    path: files/instruct/test-text-instruct.parquet
- config_name: text_python
  data_files:
  - split: text_python_ai_research
    path: files/text/test-text-python-ai-research.parquet
  - split: text_python_many_repos
    path: files/text/test-text-python-many-repos.parquet
---

# Testing Datasets

### How to use the dataset

```python
from datasets import load_dataset

# load audio
print("loading audio")
ds_audio = load_dataset("anotherdev/testing-datasets", data_dir="files/audio")
print(ds_audio)

# load image
print("loading images")
ds_image = load_dataset("anotherdev/testing-datasets", data_dir="files/image")
print(ds_image)

# load text
print("loading text")
ds_text = load_dataset("anotherdev/testing-datasets", data_dir="files/text")
print(ds_text)

# load instruct
print("loading instruct")
ds_instr = load_dataset("anotherdev/testing-datasets", data_dir="files/instruct")
print(ds_instr)
```
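
The snippet above loads whole directories through `data_dir`. Because the card's YAML also declares named configurations, a single configuration (and optionally a single split) can presumably be requested by name instead. The following is a minimal sketch under that assumption, not something the card itself documents. `CONFIG_SPLITS` and `load_config` are hypothetical helper names, and the config names are copied as spelled in the `configs` block, including the hyphenated `image-function` and `image-class`.

```python
REPO_ID = "anotherdev/testing-datasets"

# Config name -> split names, copied from the `configs` block of the card.
CONFIG_SPLITS = {
    "audio_class": ["audio_class"],
    "audio_base": ["audio_base"],
    "audio_import": ["audio_import"],
    "audio_function": ["audio_function"],
    "image_base": ["image_base"],
    "image_import": ["image_import"],
    "image-function": ["image_function"],
    "image-class": ["image_class"],
    "text_instruct": ["text_instruct"],
    "text_python": ["text_python_ai_research", "text_python_many_repos"],
}


def load_config(config_name, split=None):
    """Load one named config (optionally one split) from the Hub."""
    if config_name not in CONFIG_SPLITS:
        raise KeyError(f"unknown config: {config_name!r}")
    if split is not None and split not in CONFIG_SPLITS[config_name]:
        raise ValueError(f"{config_name!r} has no split {split!r}")
    # Imported lazily so the table above can be inspected
    # without the `datasets` package installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)
```

A call such as `load_config("text_python", split="text_python_many_repos")` needs network access and the `datasets` package. Validating the names first keeps typos from turning into slower remote errors.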
Provider:
anotherdev
Summary of original information

Dataset overview

Basic information

  • Name: testing datasets in a sandbox
  • Description: This is not a real dataset; it is a sandbox for testing.
  • Size category: 0<n<1k
  • Tags: other
  • Task category: other
  • Task ID: parsing

Configuration details

Audio configurations

  • config_name: audio_class

    • Features:
      • file_path: string
      • audio_path: string
      • lang: string
      • dbytes_len: int64
      • dbytes: binary
    • Split: audio_class
    • Data files:
      • split: audio_class
      • path: files/audio/test-audio-class.parquet
  • config_name: audio_base

    • Features:
      • file_path: string
      • audio_path: string
      • lang: string
      • dbytes_len: int64
      • dbytes: binary
    • Split: audio_base
    • Data files:
      • split: audio_base
      • path: files/audio/test-audio-base.parquet
  • config_name: audio_import

    • Features:
      • file_path: string
      • audio_path: string
      • lang: string
      • dbytes_len: int64
      • dbytes: binary
    • Split: audio_import
    • Data files:
      • split: audio_import
      • path: files/audio/test-audio-import.parquet
  • config_name: audio_function

    • Features:
      • file_path: string
      • audio_path: string
      • lang: string
      • dbytes_len: int64
      • dbytes: binary
    • Split: audio_function
    • Data files:
      • split: audio_function
      • path: files/audio/test-audio-function.parquet
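
All four audio configurations share the same five fields, so a loaded record can be sanity-checked against that schema. The sketch below is one way to do it in plain Python. `AUDIO_FEATURES` and `check_audio_record` are hypothetical names, the sample record is invented for illustration, and the check that `dbytes_len` equals the length of `dbytes` is an assumption that the listing does not confirm.

```python
# Feature name -> expected Python type, mirroring the audio configs above
# (string -> str, int64 -> int, binary -> bytes).
AUDIO_FEATURES = {
    "file_path": str,
    "audio_path": str,
    "lang": str,
    "dbytes_len": int,
    "dbytes": bytes,
}


def check_audio_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, expected in AUDIO_FEATURES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    # ASSUMPTION: dbytes_len is the byte length of dbytes; the card does not say so.
    if not problems and record["dbytes_len"] != len(record["dbytes"]):
        problems.append("dbytes_len does not match len(dbytes)")
    return problems


# Hypothetical record, for illustration only (not taken from the dataset).
sample = {
    "file_path": "example/module.py",
    "audio_path": "example/module.mp3",
    "lang": "en",
    "dbytes_len": 4,
    "dbytes": b"\x00\x01\x02\x03",
}
print(check_audio_record(sample))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a record at once.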

Image configurations

  • config_name: image_base

    • Features:
      • filename: string
      • repo: string
      • path: string
      • dbytes: binary
      • dbytes_len: int64
      • dbytes_mb: string
      • type: string
    • Split: image_base
    • Data files:
      • split: image_base
      • path: files/image/test-image-base.parquet
  • config_name: image_import

    • Features:
      • filename: string
      • repo: string
      • path: string
      • dbytes: binary
      • dbytes_len: int64
      • dbytes_mb: string
      • type: string
    • Split: image_import
    • Data files:
      • split: image_import
      • path: files/image/test-image-import.parquet
  • config_name: image_function

    • Features:
      • filename: string
      • repo: string
      • path: string
      • dbytes: binary
      • dbytes_len: int64
      • dbytes_mb: string
      • type: string
    • Split: image_function
    • Data files:
      • split: image_function
      • path: files/image/test-image-function.parquet
  • config_name: image_class

    • Features:
      • filename: string
      • repo: string
      • path: string
      • dbytes: binary
      • dbytes_len: int64
      • dbytes_mb: string
      • type: string
    • Split: image_class
    • Data files:
      • split: image_class
      • path: files/image/test-image-class.parquet
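
Each image record carries its payload in the binary `dbytes` field next to a `filename`. A rough sketch of writing one record back to disk follows. It assumes that `dbytes` holds the raw file contents and that `filename` is usable as a file name, neither of which the listing states; `save_image_record` is a hypothetical helper.

```python
from pathlib import Path


def save_image_record(record, out_dir):
    """Write a record's `dbytes` payload into `out_dir` and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a record cannot write outside out_dir.
    name = Path(record["filename"]).name
    if not name:
        raise ValueError("record has an empty filename")
    target = out_dir / name
    # ASSUMPTION: dbytes is the raw file content.
    target.write_bytes(record["dbytes"])
    return target
```

With a record `rec` from one of the image splits, `save_image_record(rec, "out")` would create `out/<filename>`.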

Text configurations

  • config_name: text_instruct

    • Split: text_instruct
    • Data files:
      • split: text_instruct
      • path: files/instruct/test-text-instruct.parquet
  • config_name: text_python

    • Splits:
      • text_python_ai_research
      • text_python_many_repos
    • Data files:
      • split: text_python_ai_research
      • path: files/text/test-text-python-ai-research.parquet
      • split: text_python_many_repos
      • path: files/text/test-text-python-many-repos.parquet
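
The `text_python` configuration maps two splits to two parquet files, which makes it possible to fetch a single file directly instead of loading the whole dataset. The sketch below builds such a URL. It assumes the usual Hugging Face Hub `resolve/<revision>/<path>` layout and a `main` branch, and it assumes that a mirror such as the one in the download link follows the same layout. `TEXT_PYTHON_FILES` and `parquet_url` are hypothetical names.

```python
REPO_ID = "anotherdev/testing-datasets"

# Split name -> parquet path inside the repository, from the text_python config above.
TEXT_PYTHON_FILES = {
    "text_python_ai_research": "files/text/test-text-python-ai-research.parquet",
    "text_python_many_repos": "files/text/test-text-python-many-repos.parquet",
}


def parquet_url(split, endpoint="https://huggingface.co", revision="main"):
    """Build a direct-download URL for one split's parquet file."""
    path = TEXT_PYTHON_FILES[split]  # raises KeyError for an unknown split
    # ASSUMPTION: the endpoint serves files under /datasets/<repo>/resolve/<revision>/.
    return f"{endpoint.rstrip('/')}/datasets/{REPO_ID}/resolve/{revision}/{path}"


print(parquet_url("text_python_ai_research"))
```

Passing `endpoint="https://hf-mirror.com"` would point the same path at the mirror, provided it mirrors the layout as assumed.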