ftopal/huggingface-models-embeddings
Source: Hugging Face. Updated: 2024-04-23. Indexed: 2024-06-11.
Download link:
https://hf-mirror.com/datasets/ftopal/huggingface-models-embeddings
Resource description:
---
dataset_info:
  features:
  - name: sha
    dtype: 'null'
  - name: last_modified
    dtype: 'null'
  - name: library_name
    dtype: string
  - name: text
    dtype: string
  - name: metadata
    dtype: string
  - name: pipeline_tag
    dtype: string
  - name: id
    dtype: string
  - name: tags
    sequence: string
  - name: created_at
    dtype: string
  - name: arxiv
    sequence: string
  - name: languages
    sequence: string
  - name: tags_str
    dtype: string
  - name: text_str
    dtype: string
  - name: text_lists
    sequence: string
  - name: processed_texts
    sequence: string
  - name: tokens_length
    sequence: int64
  - name: input_texts
    sequence: string
  - name: embeddings
    sequence: float32
  splits:
  - name: train
    num_bytes: 2528620129
    num_examples: 240530
  download_size: 1308461835
  dataset_size: 2528620129
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
Provider:
ftopal
Summary of original information
Dataset overview
Dataset features (field: data type)
- sha: null
- last_modified: null
- library_name: string
- text: string
- metadata: string
- pipeline_tag: string
- id: string
- tags: sequence: string
- created_at: string
- arxiv: sequence: string
- languages: sequence: string
- tags_str: string
- text_str: string
- text_lists: sequence: string
- processed_texts: sequence: string
- tokens_length: sequence: int64
- input_texts: sequence: string
- embeddings: sequence: float32
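Since each row carries an `id` and an `embeddings` field typed as a sequence of float32, one plausible use is nearest-neighbour lookup over model cards. The sketch below ranks rows by cosine similarity using only the standard library. It assumes each row holds a single flat vector; the listing does not state which model produced the embeddings, their dimensionality, or whether they are normalized. The rows in `TOY_ROWS` are invented for illustration and are not taken from the dataset.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(rows: List[Dict], query: Sequence[float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs whose embeddings are closest to the query."""
    scored = [(row["id"], cosine_similarity(row["embeddings"], query)) for row in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical rows: the ids and 3-dimensional vectors are made up for the demo.
TOY_ROWS = [
    {"id": "toy/model-a", "embeddings": [1.0, 0.0, 0.0]},
    {"id": "toy/model-b", "embeddings": [0.0, 1.0, 0.0]},
    {"id": "toy/model-c", "embeddings": [0.7, 0.7, 0.0]},
]

if __name__ == "__main__":
    for model_id, score in top_k(TOY_ROWS, [1.0, 0.1, 0.0], k=2):
        print(f"{model_id}\t{score:.3f}")
```

For the full 240530 rows, a vectorized library or an approximate nearest-neighbour index would likely be a better fit than this pure-Python loop; the query vector would also need to come from the same embedding model as the stored vectors.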
Dataset splits
- train:
  - Size: 2528620129 bytes
  - Number of examples: 240530
Dataset size
- Download size: 1308461835 bytes
- Dataset size: 2528620129 bytes
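The byte counts above are easier to judge once converted. The following sketch derives the average size per example, the ratio of download size to dataset size, and the sizes in GiB. The three constants are copied from the listing; the helper names are illustrative only.

```python
# Figures copied from the dataset listing.
NUM_BYTES = 2_528_620_129      # dataset_size / num_bytes of the train split
NUM_EXAMPLES = 240_530         # num_examples of the train split
DOWNLOAD_SIZE = 1_308_461_835  # download_size


def bytes_per_example(total_bytes: int, examples: int) -> float:
    """Average size of one row in bytes."""
    return total_bytes / examples


def download_ratio(download_bytes: int, dataset_bytes: int) -> float:
    """Fraction of the dataset size that has to be downloaded."""
    return download_bytes / dataset_bytes


def to_gib(n_bytes: int) -> float:
    """Convert bytes to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    print(f"avg bytes per example: {bytes_per_example(NUM_BYTES, NUM_EXAMPLES):.1f}")
    print(f"download / dataset:    {download_ratio(DOWNLOAD_SIZE, NUM_BYTES):.3f}")
    print(f"dataset size:          {to_gib(NUM_BYTES):.2f} GiB")
    print(f"download size:         {to_gib(DOWNLOAD_SIZE):.2f} GiB")
```

With these numbers the average row is roughly 10.5 kB and the download is about 52% of the dataset size, which is worth knowing before choosing between a full download and streaming.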
Configuration
- config_name: default
  - data_files:
    - split: train
    - path: data/train-*
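The configuration above maps the `train` split of the `default` config to files matching `data/train-*`. A minimal sketch of how that might be consumed with the Hugging Face `datasets` library follows. It assumes `datasets` is installed and the repository is reachable; the library is imported lazily so the rest of the snippet runs without it. The `matches_train_glob` helper uses `fnmatch` as an approximation of the pattern matching and is not the resolver the Hub itself uses, and the shard file name in the demo is hypothetical.

```python
from fnmatch import fnmatch

REPO_ID = "ftopal/huggingface-models-embeddings"
CONFIG_NAME = "default"
SPLIT = "train"
DATA_GLOB = "data/train-*"  # pattern from the `configs` section


def matches_train_glob(path: str) -> bool:
    """Approximate check of whether a repository file belongs to the train split."""
    return fnmatch(path, DATA_GLOB)


def load_train_split(streaming: bool = True):
    """Load the train split; streaming avoids downloading everything up front."""
    # Third-party dependency (pip install datasets), imported lazily on purpose.
    from datasets import load_dataset

    return load_dataset(REPO_ID, CONFIG_NAME, split=SPLIT, streaming=streaming)


if __name__ == "__main__":
    # Hypothetical shard name, used only to exercise the glob helper.
    print(matches_train_glob("data/train-00000-of-00006.parquet"))

    # Requires network access and the `datasets` package:
    # first_row = next(iter(load_train_split()))
    # print(first_row["id"], len(first_row["embeddings"]))
```

Streaming is used as the default here because the download is over 1.3 GB; passing `streaming=False` would instead fetch and cache the full split locally.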



