DjSteker/alpaca-es-auto-filter
Source: Hugging Face. Updated 2024-02-05. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/DjSteker/alpaca-es-auto-filter
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: 'null'
  - name: inputs
    struct:
    - name: 1-instruction
      dtype: string
    - name: 2-input
      dtype: string
    - name: 3-output
      dtype: string
  - name: prediction
    dtype: 'null'
  - name: prediction_agent
    dtype: 'null'
  - name: annotation
    dtype: string
  - name: annotation_agent
    dtype: string
  - name: vectors
    struct:
    - name: input
      sequence: float64
    - name: instruction
      sequence: float64
    - name: output
      sequence: float64
  - name: multi_label
    dtype: bool
  - name: explanation
    dtype: 'null'
  - name: id
    dtype: string
  - name: metadata
    struct:
    - name: bias_score.label
      dtype: string
    - name: bias_score.score
      dtype: float64
    - name: en_index
      dtype: int64
    - name: hate_score.label
      dtype: string
    - name: hate_score.score
      dtype: float64
    - name: sf-multi-unprocessable-score
      dtype: float64
    - name: sf-unprocessable-score
      dtype: float64
    - name: tr-flag-1-instruction
      dtype: bool
    - name: tr-flag-2-input
      dtype: bool
    - name: tr-flag-3-output
      dtype: bool
  - name: status
    dtype: string
  - name: event_timestamp
    dtype: timestamp[us]
  - name: metrics
    struct:
    - name: text_length
      dtype: int64
  splits:
  - name: train
    num_bytes: 986677202
    num_examples: 51942
  download_size: 653488377
  dataset_size: 986677202
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
Provider:
DjSteker
Summary of original information
Dataset overview
Dataset information
Features
- text: data type null
- inputs: struct
  - 1-instruction: data type string
  - 2-input: data type string
  - 3-output: data type string
- prediction: data type null
- prediction_agent: data type null
- annotation: data type string
- annotation_agent: data type string
- vectors: struct
  - input: sequence of float64
  - instruction: sequence of float64
  - output: sequence of float64
- multi_label: data type bool
- explanation: data type null
- id: data type string
- metadata: struct
  - bias_score.label: data type string
  - bias_score.score: data type float64
  - en_index: data type int64
  - hate_score.label: data type string
  - hate_score.score: data type float64
  - sf-multi-unprocessable-score: data type float64
  - sf-unprocessable-score: data type float64
  - tr-flag-1-instruction: data type bool
  - tr-flag-2-input: data type bool
  - tr-flag-3-output: data type bool
- status: data type string
- event_timestamp: data type timestamp[us]
- metrics: struct
  - text_length: data type int64
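To make the nested layout concrete, the following sketch builds one record shaped like the feature list above and reads it back. The sample values are invented for illustration, and the helper names `to_prompt` and `is_unflagged` are hypothetical. The card does not document what the `tr-flag-*` fields mean; the sketch assumes that `True` marks a field as flagged.

```python
from typing import Any, Dict

# Hypothetical record following the card's schema; the values are invented.
SAMPLE: Dict[str, Any] = {
    "inputs": {
        "1-instruction": "Traduce la frase al inglés.",
        "2-input": "Buenos días.",
        "3-output": "Good morning.",
    },
    "metadata": {
        "tr-flag-1-instruction": False,
        "tr-flag-2-input": False,
        "tr-flag-3-output": False,
    },
}

# Flag columns listed under `metadata` in the card.
FLAG_KEYS = ("tr-flag-1-instruction", "tr-flag-2-input", "tr-flag-3-output")


def to_prompt(record: Dict[str, Any]) -> str:
    """Join the instruction and the optional input into a single prompt string."""
    fields = record["inputs"]
    instruction = fields["1-instruction"]
    context = fields.get("2-input") or ""
    return f"{instruction}\n\n{context}" if context else instruction


def is_unflagged(record: Dict[str, Any]) -> bool:
    """Return True when none of the tr-flag-* columns is set (assumed meaning)."""
    meta = record.get("metadata") or {}
    return not any(meta.get(key) for key in FLAG_KEYS)


print(to_prompt(SAMPLE))
print(is_unflagged(SAMPLE))
```

The instruction, input, and output texts sit under `inputs`, while the per-field flags and scores sit under `metadata`, so any filtering has to reach into both structs.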
Data splits
- train:
  - Size: 986677202 bytes
  - Examples: 51942
Dataset size
- Download size: 653488377 bytes
- Dataset size: 986677202 bytes
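The raw byte counts are easier to interpret after a little arithmetic. This sketch uses only the three figures reported above; the variable names are chosen for illustration.

```python
# Figures copied from the split and size information above.
NUM_BYTES = 986_677_202       # dataset size in bytes
NUM_EXAMPLES = 51_942         # rows in the train split
DOWNLOAD_SIZE = 653_488_377   # download size in bytes

MIB = 1024 ** 2

avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"dataset size:      {NUM_BYTES / MIB:.1f} MiB")
print(f"download size:     {DOWNLOAD_SIZE / MIB:.1f} MiB")
print(f"bytes per example: {avg_bytes_per_example:.0f}")
print(f"download / size:   {download_ratio:.1%}")
```

In round numbers, the dataset is about 941 MiB, the download is about 623 MiB (roughly 66% of the dataset size), and each example averages about 19 kB.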
Configurations
- default:
  - Data file path: data/train-*
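Because the default configuration exposes a single `train` split, one plausible way to read it is through the Hugging Face `datasets` library. This is a sketch under assumptions: that `datasets` is installed, that the repository is still publicly accessible under the ID shown at the top of this page, and that streaming is acceptable. The helper names `stream_train` and `first_instructions` are hypothetical.

```python
from itertools import islice
from typing import List

REPO_ID = "DjSteker/alpaca-es-auto-filter"


def stream_train(repo_id: str = REPO_ID):
    """Return the train split as a lazily streamed, iterable dataset."""
    # Imported here so this module still loads when `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True avoids downloading the full dataset up front.
    return load_dataset(repo_id, split="train", streaming=True)


def first_instructions(n: int = 3) -> List[str]:
    """Collect the instruction text of the first n streamed records."""
    return [row["inputs"]["1-instruction"] for row in islice(stream_train(), n)]


# Example call (requires network access):
# print(first_instructions(3))
```

Streaming is used here because the full download is several hundred megabytes; dropping `streaming=True` would download and cache the whole split instead.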



