Reverse-Text-SFT
ModelScope community. Updated 2025-12-05, indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/PrimeIntellect/Reverse-Text-SFT
Resource description:
# Reverse-Text-SFT
A small, scrappy SFT dataset for warming up a small model (e.g. `Qwen/Qwen3-0.6B`) before RL training. Each example is in `prompt`-`completion` chat format and asks the model to reverse a sentence of 5 to 20 words character by character. The raw sentences were extracted from `willcb/R1-reverse-wikipedia-paragraphs-v1-1000`.
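To make the record layout concrete, here is a minimal sketch of a single example. The sentence is made up for illustration and is not taken from the dataset, and `make_example` is a hypothetical helper name; the system prompt and field layout mirror the generation script on this card.

```python
# System prompt copied from the generation script on this card.
SYSTEM_PROMPT = "Reverse the text character-by-character. Put your answer in <reversed_text> tags."

# Hypothetical sentence for illustration only; not taken from the dataset.
sentence = "The quick brown fox jumps over the lazy dog"


def make_example(sentence: str) -> dict:
    """Build one prompt-completion record (hypothetical helper)."""
    reversed_text = sentence[::-1]  # character-by-character reversal
    return {
        "prompt": [
            {"content": SYSTEM_PROMPT, "role": "system"},
            {"content": sentence, "role": "user"},
        ],
        "completion": [
            {"content": f"<reversed_text>{reversed_text}</reversed_text>", "role": "assistant"},
        ],
    }


example = make_example(sentence)
print(example["completion"][0]["content"])
# <reversed_text>god yzal eht revo spmuj xof nworb kciuq ehT</reversed_text>
```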
The following script was used to generate the dataset.
```python
from datasets import Dataset, load_dataset

# Source paragraphs; the message at index 1 of each prompt holds the paragraph text.
dataset = load_dataset("willcb/R1-reverse-wikipedia-paragraphs-v1-1000", split="train")
prompt = "Reverse the text character-by-character. Put your answer in <reversed_text> tags."

# Split each paragraph on ". " and keep sentences of 5 to 20 space-separated words.
sentences_list = dataset.map(lambda example: {"sentences": [s for s in example["prompt"][1]["content"].split(". ") if 5 <= len(s.split(" ")) <= 20]})["sentences"]
sentences = [sentence for sentences in sentences_list for sentence in sentences]  # Flatten
completions = [s[::-1] for s in sentences]  # Reverse to get ground truth

examples = []
for sentence, completion in zip(sentences, completions):
    examples.append({"prompt": [{"content": prompt, "role": "system"}, {"content": sentence, "role": "user"}], "completion": [{"content": f"<reversed_text>{completion}</reversed_text>", "role": "assistant"}]})

# Keep only the first 1,000 examples.
small_sft = Dataset.from_list(examples).select(range(1000))
```
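A quick consistency check on records of this shape can be sketched as follows. This is an assumption-laden illustration, not part of the original script: `extract_reversed` and `is_valid_example` are hypothetical helper names, and the sample record is hand-written rather than pulled from the dataset. The check extracts the text between the `<reversed_text>` tags and confirms that reversing it recovers the user message.

```python
import re
from typing import Optional

# Matches the answer between the tags; DOTALL lets the answer span newlines.
TAG_PATTERN = re.compile(r"<reversed_text>(.*?)</reversed_text>", re.DOTALL)


def extract_reversed(completion_text: str) -> Optional[str]:
    """Return the text inside <reversed_text> tags, or None if the tags are missing."""
    match = TAG_PATTERN.search(completion_text)
    return match.group(1) if match else None


def is_valid_example(example: dict) -> bool:
    """Check that the tagged answer is the exact reversal of the user message."""
    sentence = example["prompt"][-1]["content"]  # user turn is the last prompt message
    answer = extract_reversed(example["completion"][0]["content"])
    return answer is not None and answer[::-1] == sentence


# Hand-written record for illustration only; not taken from the dataset.
record = {
    "prompt": [
        {"content": "Reverse the text character-by-character. Put your answer in <reversed_text> tags.", "role": "system"},
        {"content": "Paris is the capital of France", "role": "user"},
    ],
    "completion": [
        {"content": "<reversed_text>ecnarF fo latipac eht si siraP</reversed_text>", "role": "assistant"},
    ],
}

print(is_valid_example(record))  # True
```

The same check could be run over every row of `small_sft`, for instance with `all(is_valid_example(row) for row in small_sft)`, assuming the rows keep the layout shown above.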
Provider:
maas
Created:
2025-08-13



