PJMixers-Dev/ytz20_LMSYS-Chat-GPT-5-Chat-Response-ShareGPT
Source: Hugging Face. Updated 2025-12-06; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/PJMixers-Dev/ytz20_LMSYS-Chat-GPT-5-Chat-Response-ShareGPT
Description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: category
    dtype: string
  - name: grounded
    dtype: bool
  - name: flaw
    dtype: string
  - name: agreement
    dtype: bool
  - name: conversations
    list:
    - name: from
      dtype: string
    - name: value
      dtype: string
  splits:
  - name: train
    num_bytes: 333333184
    num_examples: 192014
  download_size: 193099874
  dataset_size: 333333184
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
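To make the declared schema concrete, here is a minimal sketch that checks whether a Python dict has the fields and types listed in the `dataset_info` block above. The helper `matches_schema` and every value in `example_record` are hypothetical and invented for illustration; the sketch also assumes no field is null, which the metadata does not confirm.

```py
# Field names and types copied from the dataset_info block above.
EXPECTED_TYPES = {
    "id": str,
    "category": str,
    "grounded": bool,
    "flaw": str,
    "agreement": bool,
    "conversations": list,
}


def matches_schema(record):
    """Return True when a record has exactly the declared fields and types."""
    if set(record) != set(EXPECTED_TYPES):
        return False
    if not all(isinstance(record[key], typ) for key, typ in EXPECTED_TYPES.items()):
        return False
    # Each conversation turn is a {"from": str, "value": str} mapping.
    return all(
        isinstance(turn, dict)
        and set(turn) == {"from", "value"}
        and isinstance(turn["from"], str)
        and isinstance(turn["value"], str)
        for turn in record["conversations"]
    )


# Hypothetical record: every value here is a made-up placeholder.
example_record = {
    "id": "example-0",
    "category": "placeholder-category",
    "grounded": True,
    "flaw": "placeholder-flaw",
    "agreement": False,
    "conversations": [
        {"from": "human", "value": "Name three primary colors."},
        {"from": "gpt", "value": "Red, yellow, and blue."},
    ],
}

print(matches_schema(example_record))
```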
```py
import re

from datasets import Dataset, load_dataset
from tqdm import tqdm

dataset = load_dataset("ytz20/LMSYS-Chat-GPT-5-Chat-Response")

new_dataset = []

for sample in tqdm(dataset["train"]):
    # Pull the raw instruction out of the Alpaca-style prompt wrapper.
    match = re.search(
        r"Below is an instruction that describes a task.\nWrite a response that appropriately completes the request.\n### Instruction:\n\s*(.*?)\s*\n### Response:\n",
        sample["content"][1]["content"],
        re.DOTALL,
    )

    # Samples whose prompt does not follow the template are skipped.
    if match:
        new_dataset.append(
            {
                "id": sample["id"],
                "category": sample["category"],
                "grounded": sample["grounded"],
                "flaw": sample["flaw"],
                "agreement": sample["agreement"],
                "conversations": [
                    {
                        "from": "human",
                        "value": match.group(1).strip(),
                    },
                    {
                        "from": "gpt",
                        "value": sample["teacher_response"].strip(),
                    },
                ],
            }
        )

# Build the ShareGPT-format dataset, shuffle it, and publish it as one shard.
new_dataset = Dataset.from_list(new_dataset)
new_dataset = new_dataset.shuffle(seed=42)
new_dataset.push_to_hub(
    "PJMixers-Dev/ytz20_LMSYS-Chat-GPT-5-Chat-Response-ShareGPT",
    num_shards=1,
    private=False,
)
```
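The extraction step in the conversion script above can be tried offline. The following sketch reuses the same regular expression on a prompt string; the helper name `extract_instruction` and the contents of `example_prompt` are hypothetical and not taken from the source dataset.

```py
import re

# Same pattern as the conversion script: capture the text between the
# "### Instruction:" and "### Response:" markers of the prompt template.
PROMPT_PATTERN = re.compile(
    r"Below is an instruction that describes a task.\n"
    r"Write a response that appropriately completes the request.\n"
    r"### Instruction:\n\s*(.*?)\s*\n### Response:\n",
    re.DOTALL,
)


def extract_instruction(prompt):
    """Return the instruction text, or None when the template does not match."""
    match = PROMPT_PATTERN.search(prompt)
    return match.group(1).strip() if match else None


# Hypothetical prompt, made up for illustration (not a real dataset row).
example_prompt = (
    "Below is an instruction that describes a task.\n"
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "  Name three primary colors.\n"
    "Answer briefly.  \n"
    "### Response:\n"
)

print(repr(extract_instruction(example_prompt)))
print(extract_instruction("no template here"))
```

Because of `re.DOTALL`, the lazy group spans multi-line instructions, and the surrounding `\s*` drops leading and trailing whitespace. A prompt that does not follow the template yields `None`, which corresponds to the samples the script skips.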
Provider:
PJMixers-Dev



