georgiyozhegov/stories
Source: Hugging Face | Updated: 2024-12-11 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/georgiyozhegov/stories
Description:
Stories is a synthetic dataset of generated and augmented text. Its goal is to teach language models basic writing skills in both Russian and English. Each original text was generated with OpenAI's `gpt-4o-mini` model. Each sample was then translated into English and translated back into Russian, which triples the size of the original data.
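The augmentation described above (original, English translation, Russian back-translation) can be sketched as follows. This is only an illustration under stated assumptions, not the author's actual pipeline: the function name, the record fields (`text`, `language`, `kind`), and the two translator callables are hypothetical, and the demo uses toy stand-ins instead of a real translation model.

```python
from typing import Callable, Dict, List


def augment_with_back_translation(
    originals: List[str],
    ru_to_en: Callable[[str], str],
    en_to_ru: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Return three records per original text: the original, its English
    translation, and the Russian back-translation of that English text.

    The translator callables and the record layout are assumptions made
    for this sketch; the dataset's real schema may differ.
    """
    records: List[Dict[str, str]] = []
    for text in originals:
        english = ru_to_en(text)          # Russian -> English
        back_translated = en_to_ru(english)  # English -> Russian
        records.append({"text": text, "language": "ru", "kind": "original"})
        records.append({"text": english, "language": "en", "kind": "translated"})
        records.append(
            {"text": back_translated, "language": "ru", "kind": "back_translated"}
        )
    return records


if __name__ == "__main__":
    # Toy stand-ins for a real translation model (hypothetical).
    def fake_ru_to_en(text: str) -> str:
        return "[en] " + text

    def fake_en_to_ru(text: str) -> str:
        return "[ru] " + text

    demo = augment_with_back_translation(
        ["Жил-был кот."], fake_ru_to_en, fake_en_to_ru
    )
    for record in demo:
        print(record["kind"], "->", record["text"])
```

Because every original yields exactly three records, the output is three times the size of the input, which matches the tripling stated in the description.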
Provided by:
georgiyozhegov



