styfeng/TinyDialogues
Hugging Face · Updated 2024-11-26 · Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/styfeng/TinyDialogues
Description:
The TinyDialogues dataset was created to investigate whether child-directed speech is effective training data for language models. It contains approximately 130k child-directed conversations synthesized by GPT-4, which vary by the child's age, conversation type, participants, length, and content. The dataset provides training and validation data ordered by ascending age, along with metadata files containing all examples for each age. The dataset is in English and is released under the MIT license.
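
The age-ordered layout described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names `age` and `text`, the toy records, and the helper functions are hypothetical and do not reflect the dataset's actual file format or schema.

```python
from collections import defaultdict

# Toy records standing in for conversations. The "age" and "text" field
# names and all values are hypothetical, not taken from the dataset.
records = [
    {"age": 10, "text": "conversation A"},
    {"age": 2, "text": "conversation B"},
    {"age": 5, "text": "conversation C"},
    {"age": 2, "text": "conversation D"},
]


def order_by_age(items):
    """Return items sorted by ascending age.

    sorted() is stable, so items with the same age keep their input order.
    """
    return sorted(items, key=lambda item: item["age"])


def group_by_age(items):
    """Collect all items for each age, similar in spirit to per-age metadata."""
    groups = defaultdict(list)
    for item in items:
        groups[item["age"]].append(item)
    return dict(groups)


ordered = order_by_age(records)
by_age = group_by_age(records)
```

Here `ordered` lists the youngest-age conversations first, and `by_age` maps each age to all of its examples, which mirrors the two views of the data that the description mentions.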
Provided by:
styfeng



