maartensap/ToxicityPrompts
Source: Hugging Face. Updated 2024-05-07; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/maartensap/ToxicityPrompts
Resource overview:
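The metadata that follows describes each record as nested structs: `perspective`, `prompt_perspective`, and `continuation_perspective` each hold `attributeScores`, keyed by attribute name (for example `TOXICITY`), with a `summaryScore` whose `value` is a float. As a minimal sketch, assuming a record has already been loaded as a plain Python dict that mirrors this schema, a summary score could be read as shown here. The helper name `summary_score` and the sample record are hypothetical and are not part of the dataset.

```python
from typing import Any, Dict, Optional


def summary_score(
    record: Dict[str, Any],
    field: str = "perspective",
    attribute: str = "TOXICITY",
) -> Optional[float]:
    """Return record[field].attributeScores[attribute].summaryScore.value.

    Returns None if any level of the nested path is missing.
    """
    node: Any = record.get(field)
    for key in ("attributeScores", attribute, "summaryScore", "value"):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


# Hypothetical record: the field names follow the schema, the values are made up.
example = {
    "text": "example text",
    "perspective": {
        "attributeScores": {
            "TOXICITY": {
                "spanScores": [
                    {"begin": 0, "end": 12, "score": {"type": "PROBABILITY", "value": 0.12}}
                ],
                "summaryScore": {"type": "PROBABILITY", "value": 0.12},
            }
        },
        "detectedLanguages": ["en"],
        "languages": ["en"],
    },
}

print(summary_score(example))                      # score for TOXICITY
print(summary_score(example, attribute="THREAT"))  # attribute absent in this sample
```

Returning `None` for a missing level keeps the helper usable across the three `*_perspective` fields without raising on records where a field or attribute is absent.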
---
dataset_info:
- config_name: ptp-de
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 202660324
num_examples: 25000
- name: small
num_bytes: 40535712
num_examples: 5000
download_size: 139277097
dataset_size: 243196036
- config_name: ptp-en
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 148678660
num_examples: 25000
- name: small
num_bytes: 29175571
num_examples: 5000
download_size: 94883333
dataset_size: 177854231
- config_name: ptp-es
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 208900890
num_examples: 25000
- name: small
num_bytes: 42751527
num_examples: 5000
download_size: 144897319
dataset_size: 251652417
- config_name: ptp-fr
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 134923345
num_examples: 25000
- name: small
num_bytes: 26919093
num_examples: 5000
download_size: 83396202
dataset_size: 161842438
- config_name: ptp-hi
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 442698792
num_examples: 25000
- name: small
num_bytes: 88719102
num_examples: 5000
download_size: 218202314
dataset_size: 531417894
- config_name: ptp-id
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 154608924
num_examples: 25000
- name: small
num_bytes: 31481713
num_examples: 5000
download_size: 92762065
dataset_size: 186090637
- config_name: ptp-it
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 159696494
num_examples: 25000
- name: small
num_bytes: 30569163
num_examples: 5000
download_size: 105951666
dataset_size: 190265657
- config_name: ptp-ja
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 537800495
num_examples: 25000
- name: small
num_bytes: 108629073
num_examples: 5000
download_size: 323201834
dataset_size: 646429568
- config_name: ptp-ko
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: pile_set_name
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 291242641
num_examples: 25000
- name: small
num_bytes: 51527340
num_examples: 5000
download_size: 167777973
dataset_size: 342769981
- config_name: ptp-zh
features:
- name: text
dtype: string
- name: meta_data
struct:
- name: data_category
dtype: string
- name: lang
dtype: string
- name: timestamp
dtype: string
- name: url
dtype: string
- name: perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity
dtype: float64
- name: toxicity_bucket
dtype: float64
- name: prompt
dtype: string
- name: continuation
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: continuation_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
splits:
- name: full
num_bytes: 118090292
num_examples: 25000
- name: small
num_bytes: 23406534
num_examples: 5000
download_size: 66624961
dataset_size: 141496826
- config_name: wildchat-de
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 2133022
num_examples: 1000
download_size: 998038
dataset_size: 2133022
- config_name: wildchat-en
features:
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 1089060
num_examples: 1000
download_size: 496179
dataset_size: 1089060
- config_name: wildchat-es
features:
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 816804
num_examples: 1000
download_size: 309448
dataset_size: 816804
- config_name: wildchat-fr
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 900990
num_examples: 1000
download_size: 370191
dataset_size: 900990
- config_name: wildchat-hi
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 1102065
num_examples: 1000
download_size: 357991
dataset_size: 1102065
- config_name: wildchat-id
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 746462
num_examples: 1000
download_size: 271844
dataset_size: 746462
- config_name: wildchat-it
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 1943698
num_examples: 1000
download_size: 816781
dataset_size: 1943698
- config_name: wildchat-ja
features:
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 1347147
num_examples: 1000
download_size: 684773
dataset_size: 1347147
- config_name: wildchat-ko
features:
- name: index
dtype: float64
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 989429
num_examples: 1000
download_size: 383613
dataset_size: 989429
- config_name: wildchat-zh
features:
- name: prompt
dtype: string
- name: prompt_perspective
struct:
- name: attributeScores
struct:
- name: IDENTITY_ATTACK
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: INSULT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: PROFANITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: SEVERE_TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: THREAT
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: TOXICITY
struct:
- name: spanScores
list:
- name: begin
dtype: int64
- name: end
dtype: int64
- name: score
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: summaryScore
struct:
- name: type
dtype: string
- name: value
dtype: float64
- name: detectedLanguages
sequence: string
- name: languages
sequence: string
- name: toxicity_bucket
dtype: float64
- name: meta_data
struct:
- name: data_category
dtype: string
- name: id
dtype: string
- name: prompt_toxicity
dtype: float64
splits:
- name: wildchat
num_bytes: 648763
num_examples: 1000
download_size: 246903
dataset_size: 648763
configs:
- config_name: ptp-de
data_files:
- split: full
path: ptp-de/full-*
- split: small
path: ptp-de/small-*
- config_name: ptp-en
data_files:
- split: full
path: ptp-en/full-*
- split: small
path: ptp-en/small-*
- config_name: ptp-es
data_files:
- split: full
path: ptp-es/full-*
- split: small
path: ptp-es/small-*
- config_name: ptp-fr
data_files:
- split: full
path: ptp-fr/full-*
- split: small
path: ptp-fr/small-*
- config_name: ptp-hi
data_files:
- split: full
path: ptp-hi/full-*
- split: small
path: ptp-hi/small-*
- config_name: ptp-id
data_files:
- split: full
path: ptp-id/full-*
- split: small
path: ptp-id/small-*
- config_name: ptp-it
data_files:
- split: full
path: ptp-it/full-*
- split: small
path: ptp-it/small-*
- config_name: ptp-ja
data_files:
- split: full
path: ptp-ja/full-*
- split: small
path: ptp-ja/small-*
- config_name: ptp-ko
data_files:
- split: full
path: ptp-ko/full-*
- split: small
path: ptp-ko/small-*
- config_name: ptp-zh
data_files:
- split: full
path: ptp-zh/full-*
- split: small
path: ptp-zh/small-*
- config_name: wildchat-de
data_files:
- split: wildchat
path: wildchat-de/wildchat-*
- config_name: wildchat-en
data_files:
- split: wildchat
path: wildchat-en/wildchat-*
- config_name: wildchat-es
data_files:
- split: wildchat
path: wildchat-es/wildchat-*
- config_name: wildchat-fr
data_files:
- split: wildchat
path: wildchat-fr/wildchat-*
- config_name: wildchat-hi
data_files:
- split: wildchat
path: wildchat-hi/wildchat-*
- config_name: wildchat-id
data_files:
- split: wildchat
path: wildchat-id/wildchat-*
- config_name: wildchat-it
data_files:
- split: wildchat
path: wildchat-it/wildchat-*
- config_name: wildchat-ja
data_files:
- split: wildchat
path: wildchat-ja/wildchat-*
- config_name: wildchat-ko
data_files:
- split: wildchat
path: wildchat-ko/wildchat-*
- config_name: wildchat-zh
data_files:
- split: wildchat
path: wildchat-zh/wildchat-*
---
Provider:
maartensap
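Summary of original information
Dataset overview
Dataset configuration names (the first three of the 20 configurations declared in the metadata)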
- ptp-de
- ptp-en
- ptp-es
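The configuration names in the metadata follow a `<source>-<language code>` pattern: the `ptp` configurations are paired with `full` and `small` splits, and the `wildchat` configurations with a single `wildchat` split. A minimal sketch of selecting one of them, assuming the Hugging Face `datasets` library is installed and the Hub is reachable. The helper names `config_name`, `splits_for`, and `load_config` are hypothetical and exist only for this illustration:

```python
# Language codes and split names as listed in the `configs` metadata of this card.
LANGS = ["de", "en", "es", "fr", "hi", "id", "it", "ja", "ko", "zh"]
SOURCES = {"ptp": ["full", "small"], "wildchat": ["wildchat"]}


def config_name(source: str, lang: str) -> str:
    """Build a configuration name such as 'ptp-de' (hypothetical helper)."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if lang not in LANGS:
        raise ValueError(f"unknown language code: {lang!r}")
    return f"{source}-{lang}"


def splits_for(source: str) -> list:
    """Return the split names the metadata declares for a source."""
    return list(SOURCES[source])


def load_config(source: str, lang: str, split: str):
    """Load one split of one configuration from the Hub (needs network access)."""
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    if split not in SOURCES[source]:
        raise ValueError(f"{source!r} has no split {split!r}")
    return load_dataset(
        "maartensap/ToxicityPrompts", config_name(source, lang), split=split
    )
```

For example, `load_config("ptp", "de", "small")` would request the smaller German split, while `load_config("wildchat", "es", "wildchat")` would request the Spanish WildChat prompts.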
Main features
1. ptp-de features:
- text: string.
- meta_data: struct with the following fields:
  - data_category: string.
  - lang: string.
  - timestamp: string.
  - url: string.
- perspective: struct with the following fields:
  - attributeScores: struct of scoring attributes (IDENTITY_ATTACK, INSULT, PROFANITY, SEVERE_TOXICITY, THREAT, TOXICITY), each containing spanScores and summaryScore.
- toxicity: float64.
- toxicity_bucket: float64.
- prompt: string.
- continuation: string.
- prompt_perspective and continuation_perspective: structs with a layout similar to perspective.
2. ptp-en features:
- text: string.
- meta_data: struct with the following fields:
  - data_category: string.
  - lang: string.
  - pile_set_name: string.
  - timestamp: string.
  - url: string.
- perspective: struct with the following fields:
  - attributeScores: struct of scoring attributes (IDENTITY_ATTACK, INSULT, PROFANITY, SEVERE_TOXICITY, THREAT, TOXICITY), each containing spanScores and summaryScore.
- toxicity: float64.
- toxicity_bucket: float64.
- prompt: string.
- continuation: string.
- prompt_perspective and continuation_perspective: structs with a layout similar to perspective.
3. ptp-es features:
- text: string.
- meta_data: struct with the following fields:
  - data_category: string.
  - lang: string.
  - pile_set_name: string.
  - timestamp: string.
  - url: string.
- perspective: struct with the following fields:
  - attributeScores: struct of scoring attributes (IDENTITY_ATTACK, INSULT, PROFANITY, SEVERE_TOXICITY, THREAT, TOXICITY), each containing spanScores and summaryScore.
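The nested score layout described above (attributeScores, then one entry per attribute holding spanScores and summaryScore) can be flattened with a few dictionary lookups. The sketch below assumes a record has already been converted to plain Python dictionaries. The function names `summary_scores` and `spans_above` are hypothetical, and the `example` record is made up for illustration, not taken from the dataset:

```python
# The six attributes named in the schema.
ATTRIBUTES = [
    "IDENTITY_ATTACK", "INSULT", "PROFANITY",
    "SEVERE_TOXICITY", "THREAT", "TOXICITY",
]


def summary_scores(perspective: dict) -> dict:
    """Map each attribute present in the struct to its summaryScore value."""
    attrs = perspective.get("attributeScores") or {}
    scores = {}
    for name in ATTRIBUTES:
        entry = attrs.get(name)
        if entry is None:
            continue  # attribute missing from this record
        scores[name] = entry["summaryScore"]["value"]
    return scores


def spans_above(perspective: dict, attribute: str, threshold: float) -> list:
    """Return (begin, end) offsets of spans scoring at or above the threshold."""
    attrs = perspective.get("attributeScores") or {}
    entry = attrs.get(attribute) or {}
    return [
        (span["begin"], span["end"])
        for span in entry.get("spanScores") or []
        if span["score"]["value"] >= threshold
    ]


# Made-up record that mirrors the schema; the values are illustrative only.
example = {
    "attributeScores": {
        "TOXICITY": {
            "spanScores": [
                {"begin": 0, "end": 12,
                 "score": {"type": "PROBABILITY", "value": 0.25}},
            ],
            "summaryScore": {"type": "PROBABILITY", "value": 0.25},
        },
        "INSULT": {
            "spanScores": [],
            "summaryScore": {"type": "PROBABILITY", "value": 0.5},
        },
    }
}

print(summary_scores(example))
print(spans_above(example, "TOXICITY", 0.2))
```

Attributes absent from a record are skipped rather than reported as zero, so a missing score is not mistaken for a low one.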
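Dataset splits
ptp-de splits:
- full: 25000 examples, 202660324 bytes in total.
- small: 5000 examples, 40535712 bytes in total.
ptp-en splits:
- full: 25000 examples, 148678660 bytes in total.
- small: 5000 examples, 29175571 bytes in total.
ptp-es splits:
- Information incomplete; detailed split data cannot be provided.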
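The split figures above lend themselves to a quick sanity check: the example count of `small` relative to `full`, and the average record size in each split. A minimal sketch using only the numbers listed here; the names `SPLITS`, `bytes_per_example`, and `small_fraction` are hypothetical:

```python
# (examples, total bytes) per configuration and split, copied from the list above.
SPLITS = {
    ("ptp-de", "full"): (25000, 202660324),
    ("ptp-de", "small"): (5000, 40535712),
    ("ptp-en", "full"): (25000, 148678660),
    ("ptp-en", "small"): (5000, 29175571),
}


def bytes_per_example(config: str, split: str) -> float:
    """Average size of one example in bytes."""
    examples, total_bytes = SPLITS[(config, split)]
    return total_bytes / examples


def small_fraction(config: str) -> float:
    """Example count of `small` divided by the example count of `full`."""
    return SPLITS[(config, "small")][0] / SPLITS[(config, "full")][0]


for (config, split) in SPLITS:
    print(config, split, round(bytes_per_example(config, split)))
print("ptp-de small/full:", small_fraction("ptp-de"))
```

In both configurations `small` holds one fifth as many examples as `full`, and the average record size is close between the two splits of the same configuration.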



