hsiung/llm-similarity-risk
收藏Hugging Face2026-04-18 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/hsiung/llm-similarity-risk
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: prompt
dtype: string
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: train
num_bytes: 355835764
num_examples: 59766
- name: test
num_bytes: 39616427
num_examples: 8304
- name: train_random_5k
num_bytes: 352570387
num_examples: 57000
- name: train_random_1k
num_bytes: 347865006
num_examples: 53000
- name: train_pure_bad_high_sim_5k
num_bytes: 352282888
num_examples: 57000
- name: train_pure_bad_low_sim_5k
num_bytes: 352908141
num_examples: 57000
- name: train_pure_bad_high_sim_1k
num_bytes: 347727585
num_examples: 53000
- name: train_pure_bad_low_sim_1k
num_bytes: 348070253
num_examples: 53000
- name: train_list_high_sim_5k
num_bytes: 352588208
num_examples: 57000
- name: train_list_low_sim_5k
num_bytes: 352538289
num_examples: 57000
- name: train_list_high_sim_1k
num_bytes: 347857708
num_examples: 53000
- name: train_list_low_sim_1k
num_bytes: 347998423
num_examples: 53000
- name: train_samsum_high_sim_5k
num_bytes: 351978835
num_examples: 57000
- name: train_samsum_low_sim_5k
num_bytes: 353148699
num_examples: 57000
- name: train_samsum_high_sim_1k
num_bytes: 347601579
num_examples: 53000
- name: train_samsum_low_sim_1k
num_bytes: 348287170
num_examples: 53000
- name: train_alpaca_high_sim_5k
num_bytes: 352417167
num_examples: 57000
- name: train_alpaca_low_sim_5k
num_bytes: 352734748
num_examples: 57000
- name: train_alpaca_high_sim_1k
num_bytes: 347761783
num_examples: 53000
- name: train_alpaca_low_sim_1k
num_bytes: 348039401
num_examples: 53000
download_size: 6355867991
dataset_size: 6699828461
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
- split: test
path: data/test-*
- split: train_random_5k
path: data/train_random_5k-*
- split: train_random_1k
path: data/train_random_1k-*
- split: train_pure_bad_high_sim_5k
path: data/train_pure_bad_high_sim_5k-*
- split: train_pure_bad_low_sim_5k
path: data/train_pure_bad_low_sim_5k-*
- split: train_pure_bad_high_sim_1k
path: data/train_pure_bad_high_sim_1k-*
- split: train_pure_bad_low_sim_1k
path: data/train_pure_bad_low_sim_1k-*
- split: train_list_high_sim_5k
path: data/train_list_high_sim_5k-*
- split: train_list_low_sim_5k
path: data/train_list_low_sim_5k-*
- split: train_list_high_sim_1k
path: data/train_list_high_sim_1k-*
- split: train_list_low_sim_1k
path: data/train_list_low_sim_1k-*
- split: train_samsum_high_sim_5k
path: data/train_samsum_high_sim_5k-*
- split: train_samsum_low_sim_5k
path: data/train_samsum_low_sim_5k-*
- split: train_samsum_high_sim_1k
path: data/train_samsum_high_sim_1k-*
- split: train_samsum_low_sim_1k
path: data/train_samsum_low_sim_1k-*
- split: train_alpaca_high_sim_5k
path: data/train_alpaca_high_sim_5k-*
- split: train_alpaca_low_sim_5k
path: data/train_alpaca_low_sim_5k-*
- split: train_alpaca_high_sim_1k
path: data/train_alpaca_high_sim_1k-*
- split: train_alpaca_low_sim_1k
path: data/train_alpaca_low_sim_1k-*
---
提供机构:
hsiung



