akshayg08/sherlock_preference_dataset
Source: Hugging Face | Updated: 2025-02-01 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/akshayg08/sherlock_preference_dataset
Resource description:
This dataset contains preference data for tuning vision-language models on the Sherlock dataset for abductive reasoning. It is intended for evaluating how effective fine-tuning is with either Supervised Fine-Tuning (SFT) or preference optimization. The preferences are generated using four models: mistralai/Pixtral-12B-2409, Qwen/Qwen2-VL-7B-Instruct, google/paligemma2-3b-ft-docci-448, and google/paligemma2-10b-ft-docci-448. The preference data is designed for optimizing PaLI-Gemma models: it uses on-policy generated data, and preferences are constructed by comparing CLIP scores. Because the data is synthetically generated, it may contain noise, so label smoothing is recommended during optimization.
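The two mechanisms named in the description, building preferences from CLIP score comparisons and applying label smoothing during optimization, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the description does not state the exact pairing rule, so the sketch simply takes the highest-scoring candidate as "chosen" and the lowest-scoring as "rejected"; it also does not name a specific objective, so a DPO-style loss with label smoothing is assumed. The function names, the `prompt`/`chosen`/`rejected` keys, and the score values are hypothetical and are not taken from the dataset.

```python
import math


def build_preference_pair(prompt, candidates):
    """Build one preference pair from (text, clip_score) candidates.

    Assumed rule (not confirmed by the dataset card): the candidate with the
    highest CLIP score is "chosen", the one with the lowest is "rejected".
    Returns None when no strict preference can be formed.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best, worst = ranked[0], ranked[-1]
    if best[1] == worst[1]:
        return None  # tied scores carry no preference signal
    return {"prompt": prompt, "chosen": best[0], "rejected": worst[0]}


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def smoothed_dpo_loss(margin, beta=0.1, label_smoothing=0.1):
    """DPO-style loss with label smoothing for a single pair.

    margin is the policy-vs-reference log-ratio of the chosen response minus
    that of the rejected response. label_smoothing is the assumed probability
    that the preference label is flipped; 0.0 recovers the unsmoothed loss.
    """
    z = beta * margin
    return (-(1.0 - label_smoothing) * log_sigmoid(z)
            - label_smoothing * log_sigmoid(-z))


# Hypothetical candidates with made-up CLIP scores.
pair = build_preference_pair(
    "What might have happened here?",
    [("caption A", 0.21), ("caption B", 0.34), ("caption C", 0.18)],
)
print(pair["chosen"], "|", pair["rejected"])
print(smoothed_dpo_loss(margin=2.0))
```

With smoothing enabled, the loss no longer keeps shrinking as the margin grows: a very confident model is penalized on the fraction of pairs assumed to be mislabeled, which is why smoothing is a reasonable hedge against noisy synthetic preferences.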
Provider:
akshayg08



