XiaChuFang Recipe Corpus
OpenDataLab | Updated: 2026-05-24 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/XiaChuFang_Recipe_Corpus
Resource description:
The complete corpus contains 1,520,327 Chinese recipes. Of these, 1,242,206 recipes are assigned to 30,060 distinct dishes, an average of 41.3 recipes per dish. The average recipe length is 224 characters; the longest recipe has 62,722 characters and the shortest has 10. The recipes were contributed by 415,272 authors, the most prolific of whom uploaded 5,394 recipes. De-identified author information is provided.
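The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the counts stated in the description; the constant and function names are illustrative and not part of the dataset or any official tooling.

```python
# Counts copied from the dataset description above.
TOTAL_RECIPES = 1_520_327
RECIPES_WITH_DISH = 1_242_206
DISTINCT_DISHES = 30_060
AUTHORS = 415_272


def derived_stats() -> dict:
    """Derive summary ratios from the published counts (illustrative helper)."""
    return {
        # Average number of recipes per dish (the description reports 41.3).
        "recipes_per_dish": RECIPES_WITH_DISH / DISTINCT_DISHES,
        # Fraction of all recipes that are assigned to a dish.
        "share_with_dish": RECIPES_WITH_DISH / TOTAL_RECIPES,
        # Recipes that are not assigned to any of the 30,060 dishes.
        "recipes_without_dish": TOTAL_RECIPES - RECIPES_WITH_DISH,
        # Average number of recipes per contributing author.
        "recipes_per_author": TOTAL_RECIPES / AUTHORS,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:,.3f}" if isinstance(value, float) else f"{name}: {value:,}")
```

The per-dish average agrees with the stated 41.3 after rounding. The remaining values are derived here rather than quoted from the source.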
Provider:
OpenDataLab
Created:
2022-11-18
Background overview
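XiaChuFang Recipe Corpus is a large-scale corpus of about 1.52 million Chinese recipes covering more than 30,000 dishes, with an average recipe length of 224 characters. The dataset was jointly released by Tsinghua University, Baidu, and the Beijing Institute for General Artificial Intelligence, and provides recipe text data for research.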
The above content was collected and summarized by 遇见数据集.



