
RLHFlow/CodeUltraFeedback-standard

Hugging Face | Updated 2024-04-27 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/RLHFlow/CodeUltraFeedback-standard
Resource description:
---
dataset_info:
  features:
  - name: rejected
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  - name: chosen
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  - name: chosen_score
    dtype: string
  - name: rejected_score
    dtype: string
  splits:
  - name: train
    num_bytes: 245682332
    num_examples: 50156
  download_size: 47661064
  dataset_size: 245682332
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---

The original dataset is coseal/CodeUltraFeedback. We provide the processing script below. In summary, for each prompt we include all possible pairwise comparisons, except those between responses with the same rating.

```python
import itertools

from datasets import load_dataset

ds = load_dataset("coseal/CodeUltraFeedback", split="train")

data = []
for example in ds:
    prompt = example['instruction']

    # Map each model name to its response and to its rating.
    responses = {}
    for row in example['responses']:
        model_name, response = row['model'], row['response']
        responses[model_name] = response
    annotations = {}
    for row in example['annotations']:
        model_name, rating = row['model'], row['rating']
        annotations[model_name] = rating

    preference = 'code-' + example['preference']  # not used below

    # Skip models whose rating is 'N/A'.
    all_models = []
    for model_name in responses.keys():
        if annotations[model_name] == 'N/A':
            continue
        all_models.append(model_name)

    # Every unordered pair of the remaining models.
    all_combinations = list(itertools.combinations(all_models, 2))
    assert len(all_combinations) == len(all_models) * (len(all_models) - 1) // 2

    for combination in all_combinations:
        response1 = responses[combination[0]]
        rating1 = annotations[combination[0]]
        response2 = responses[combination[1]]
        rating2 = annotations[combination[1]]

        # Ties carry no preference, so they are dropped.
        if rating1 == rating2:
            continue

        # Scores are stored as strings (see chosen_score / rejected_score),
        # so `>` compares them as strings.
        if rating1 > rating2:
            chosen_response, rejected_response = response1, response2
            chosen_rating, rejected_rating = rating1, rating2
        else:
            chosen_response, rejected_response = response2, response1
            chosen_rating, rejected_rating = rating2, rating1

        chosen_message = [
            {"content": prompt, "role": "user"},
            {"content": chosen_response, "role": "assistant"},
        ]
        rejected_message = [
            {"content": prompt, "role": "user"},
            {"content": rejected_response, "role": "assistant"},
        ]
        data.append({"rejected": rejected_message, "chosen": chosen_message,
                     "rejected_score": rejected_rating, "chosen_score": chosen_rating})
```
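To see the pairing rule without downloading anything, the sketch below applies the same idea to a single made-up prompt. The helper name `build_pairs`, the model names, the responses, and the ratings are all hypothetical, and the sketch assumes ratings are integer-like strings, comparing them as integers rather than as strings.

```python
import itertools


def build_pairs(prompt, responses, ratings):
    """Return one chosen/rejected record per pair of responses with different ratings."""
    # Drop models whose rating is missing or marked 'N/A'.
    models = [m for m in responses if ratings.get(m, "N/A") != "N/A"]
    pairs = []
    for a, b in itertools.combinations(models, 2):
        if ratings[a] == ratings[b]:
            continue  # a tie carries no preference signal
        # Assumption: ratings are integer-like strings such as "4".
        winner, loser = (a, b) if int(ratings[a]) > int(ratings[b]) else (b, a)
        pairs.append({
            "chosen": [
                {"content": prompt, "role": "user"},
                {"content": responses[winner], "role": "assistant"},
            ],
            "rejected": [
                {"content": prompt, "role": "user"},
                {"content": responses[loser], "role": "assistant"},
            ],
            "chosen_score": ratings[winner],
            "rejected_score": ratings[loser],
        })
    return pairs


# Toy input: every name and value here is made up for illustration.
toy_responses = {
    "model-a": "def f(): return 1",
    "model-b": "def f(): return 2",
    "model-c": "def f(): pass",
    "model-d": "no answer",
}
toy_ratings = {"model-a": "4", "model-b": "2", "model-c": "4", "model-d": "N/A"}

pairs = build_pairs("Write f.", toy_responses, toy_ratings)
print(len(pairs))  # 2: (a over b) and (c over b); a vs c is a tie, d is unrated
```

With three rated responses there are three possible pairs; one is a tie, so two records remain. This is why the number of examples per prompt varies in the processed dataset.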
Provider:
RLHFlow
Summary of original information

Dataset overview

Dataset features

  • rejected
    • content: data type string
    • role: data type string
  • chosen
    • content: data type string
    • role: data type string
  • chosen_score: data type string
  • rejected_score: data type string
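The feature layout listed above can be checked on a single record with a few lines of Python. This is a minimal sketch: the helper `is_valid_record` and the example record are hypothetical and only mirror the field names and string types given in the feature list.

```python
def is_valid_record(record):
    """Check that a record matches the listed features (all leaf values are strings)."""

    def is_message_list(value):
        # `chosen` and `rejected` are lists of {"content": str, "role": str} items.
        return isinstance(value, list) and all(
            isinstance(m, dict)
            and isinstance(m.get("content"), str)
            and isinstance(m.get("role"), str)
            for m in value
        )

    return (
        isinstance(record, dict)
        and is_message_list(record.get("chosen"))
        and is_message_list(record.get("rejected"))
        and isinstance(record.get("chosen_score"), str)
        and isinstance(record.get("rejected_score"), str)
    )


# Made-up record, for illustration only.
example_record = {
    "chosen": [
        {"content": "Write f.", "role": "user"},
        {"content": "def f(): return 1", "role": "assistant"},
    ],
    "rejected": [
        {"content": "Write f.", "role": "user"},
        {"content": "def f(): pass", "role": "assistant"},
    ],
    "chosen_score": "4",
    "rejected_score": "2",
}

print(is_valid_record(example_record))  # True
```

Note that the scores are strings, not numbers, so a consumer that needs numeric margins has to convert them first.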

Dataset splits

  • train
    • num_bytes: 245682332
    • num_examples: 50156

Dataset size

  • download_size: 47661064
  • dataset_size: 245682332

Configuration

  • config_name: default
    • data_files
      • split: train
        • path: data/train-*
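As a rough illustration of how this configuration is used, the sketch below loads the `train` split of the `default` config with the Hugging Face `datasets` library and shows which file paths the `data/train-*` pattern would cover. The helper names and the shard file names are hypothetical, and `fnmatch` only approximates the pattern matching that the Hub performs.

```python
from fnmatch import fnmatch

DATASET_ID = "RLHFlow/CodeUltraFeedback-standard"
CONFIG_NAME = "default"
TRAIN_PATTERN = "data/train-*"


def load_train_split():
    """Download and return the train split (needs network access and `datasets`)."""
    from datasets import load_dataset  # imported lazily so the rest runs offline

    return load_dataset(DATASET_ID, CONFIG_NAME, split="train")


def matches_train_pattern(path):
    """Return True if a repository file path is covered by the train data_files glob."""
    return fnmatch(path, TRAIN_PATTERN)


# Hypothetical shard names, for illustration only.
print(matches_train_pattern("data/train-00000-of-00001.parquet"))
print(matches_train_pattern("data/test-00000-of-00001.parquet"))
```

Calling `load_train_split()` should return a dataset whose length matches the `num_examples` value listed above, provided the hosted files have not changed since the card was written.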