
Peutlefaire/rephrased-wildjailbreak

Source: Hugging Face. Updated: 2025-12-08. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Peutlefaire/rephrased-wildjailbreak
Description:
Rephrased version of [allenai/wildjailbreak](https://huggingface.co/datasets/allenai/wildjailbreak) in which five different rephrasings of each sample are generated with [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct). The generation script is as follows:

```python
import torch
from tqdm import tqdm
from datasets import load_dataset, Dataset
from vllm import LLM, SamplingParams

PROMPT_TEMPLATE = """
You act as a rephraser. Given an input prompt, your task is to generate 5 different rephrasings of the same prompt while preserving its original meaning. You only output the rephrasings without any additional text.

For example:
Request: "What's the capital of France?"
Rephrasing: "Can you tell me the capital city of France?"
Rephrasing: "I would like to know the capital of France."
Rephrasing: "Could you please provide the name of France's capital?"
Rephrasing: "What city serves as the capital of France?"
Rephrasing: "Please tell me which city is the capital of France."

Make sure you format your output exactly as in the example above and never deviate from it.

Now, here's the request you need to rephrase:
Request: "{input_prompt}"
"""


def main():
    # Program parameters
    effective_batch_size = 16

    # Reference dataset: keep only the samples with label == 1
    unsafe_dataset = load_dataset(
        "allenai/wildjailbreak", name="eval", split="train"
    ).filter(lambda x: x["label"] == 1)

    # Loading the model (temperature=0.0 gives greedy decoding)
    sampling_params = SamplingParams(max_tokens=4096, temperature=0.0)
    llm = LLM(
        "Qwen/Qwen2.5-7B-Instruct",
        gpu_memory_utilization=0.95,
        max_model_len=8192,
        max_num_seqs=effective_batch_size,
        trust_remote_code=True,
        tensor_parallel_size=torch.cuda.device_count(),
    )
    tokenizer = llm.get_tokenizer()

    # Running inference and storing the generated rephrasings
    all_rephrasings = []
    for i in tqdm(range(0, len(unsafe_dataset), effective_batch_size)):
        # Getting texts
        messages = [
            [
                {
                    "role": "user",
                    "content": PROMPT_TEMPLATE.format(
                        input_prompt=unsafe_dataset[j]["adversarial"]
                    ),
                }
            ]
            for j in range(i, min(i + effective_batch_size, len(unsafe_dataset)))
        ]
        texts = [
            tokenizer.apply_chat_template(m, tokenize=False, add_generation_prompt=True)
            for m in messages
        ]

        # Generating outputs
        outputs = llm.generate(texts, sampling_params=sampling_params)

        # Parsing outputs: keep only lines that start with "Rephrasing:"
        for output in outputs:
            generated_text = output.outputs[0].text
            lines = generated_text.strip().split("\n")
            for line in lines:
                if line.startswith("Rephrasing:"):
                    rephrasing = line.replace("Rephrasing:", "").strip().strip('"')
                    all_rephrasings.append(rephrasing)

    # Creating dataset with single column
    dataset = Dataset.from_dict({"prompt": all_rephrasings})

    # Uploading to Hugging Face Hub as private dataset
    dataset.push_to_hub(
        "Peutlefaire/rephrased-wildjailbreak",  # Replace with your HF username
        private=True,
    )
    print(
        f"Successfully uploaded {len(all_rephrasings)} rephrased prompts to Hugging Face Hub"
    )


if __name__ == "__main__":
    main()
```

---
dataset_info:
  features:
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 4258602
    num_examples: 8767
  download_size: 1275205
  dataset_size: 4258602
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
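The parsing step of the script can be tried in isolation, without a GPU or the model. The following is a minimal sketch that reuses the same line filter as the script; the function name `parse_rephrasings` and the `sample_output` string are hypothetical and introduced only for illustration. It shows that only lines beginning exactly with `Rephrasing:` are kept, so a completion that deviates from the format contributes fewer than five rows. This is consistent with the reported `num_examples` of 8767, which is not a multiple of 5.

```python
def parse_rephrasings(generated_text: str) -> list[str]:
    """Extract rephrasings from one completion (hypothetical helper mirroring the script's loop)."""
    rephrasings = []
    for line in generated_text.strip().split("\n"):
        # Same filter as the script: the line must start with the literal prefix.
        if line.startswith("Rephrasing:"):
            rephrasings.append(line.replace("Rephrasing:", "").strip().strip('"'))
    return rephrasings


# Made-up completion for illustration only (not taken from the dataset).
sample_output = (
    'Rephrasing: "Can you tell me the capital city of France?"\n'
    'Rephrasing: "I would like to know the capital of France."\n'
    ' Rephrasing: "This line is indented, so startswith() skips it."\n'
    "Some stray commentary from the model.\n"
)

parsed = parse_rephrasings(sample_output)
print(parsed)
```

Because rows from all samples are appended to a single `prompt` column, the published dataset does not record which original sample a given rephrasing came from.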
Provider:
Peutlefaire