Multimodal Ingredient Substitution

Source: www.kaggle.com; updated 2024-11-08; indexed 2025-01-08
Download link:
https://www.kaggle.com/kanakraj/multimodal-ingredient-substitution
Description:
# Multimodal Ingredient Substitution Knowledge Graph for Personalized Dietary Recommendations (MISKG)

Please refer to the [GitHub repo](https://github.com/kanak8278/MISKG/) for more details and exact [license information](https://github.com/kanak8278/MISKG/blob/main/LICENSE.md).

### Abstract

Ingredient substitution is essential in adapting recipes to meet individual dietary needs, preferences, and ingredient availability. We introduce a Multimodal Ingredient Substitution Knowledge Graph (MISKG) that captures a comprehensive and contextual understanding of 16,077 ingredients and 80,110 substitution pairs. The KG integrates semantic, nutritional, and flavor data, allowing both text and image-based querying for ingredient substitutions. Utilizing various sources such as ConceptNet, Wikidata, Edamam, and FlavorDB, this dataset supports personalized recipe adjustments based on dietary constraints, health labels, and sensory preferences. This work addresses gaps in existing datasets by including visual representations, nutrient information, and contextual ingredient relationships, providing a valuable resource for culinary research and digital gastronomy.

## File Descriptions and Purposes

1. **competition/original_ingredients_with_id.csv**
   - Purpose: Provides a list of original, unprocessed ingredient names.
   - Columns: id (original_id), original
   - Example: 59e6c716,ababai
2. **competition/processed_ingredients_with_id.csv**
   - Purpose: Offers a cleaned, standardized list of ingredient names.
   - Columns: id (processed_id), processed
   - Example: d5a8268e,ababai
3. **competition/original_to_processed_mapping.csv**
   - Purpose: Maps original ingredients to their processed counterparts.
   - Columns: original_id, processed_id, original, processed
   - Example: 59e6c716,d5a8268e,ababai,ababai
   - Note: Some original ingredients may map to an empty string in processed form.
4. **competition/edamam.json**
   - Purpose: Provides nutritional information for ingredients.
   - Key: processed_id
   - Example fields: ingredient_name, food_id, category, nutrients, label, weight, uri
   - Example:
     ```json
     "b8b2e121": {
       "ingredient_name": "fish sauce",
       "food_id": "food_ahlu6u3ab8bu1wap7cbqua3s1quk",
       "category": "Generic foods",
       "nutrients": {
         "ENERC_KCAL": 35.0,
         "PROCNT": 5.06,
         "FAT": 0.01,
         "CHOCDF": 3.64,
         "FIBTG": 0.0
       },
       "label": "Serving",
       "weight": "5.0",
       "uri": "http://www.edamam.com/ontologies/edamam.owl#Measure_serving"
     }
     ```
5. **competition/ingredient_to_flavordb.json**
   - Purpose: Maps ingredients to flavor profiles from FlavorDB.
   - Key: processed_id
   - Example fields: ingredient, flavordb_id, flavordb, cosine_similarity
   - Example:
     ```json
     "a5bd8077": {
       "ingredient": "abalone",
       "flavordb_id": 38,
       "flavordb": "abalone",
       "cosine_similarity": 1.0
     }
     ```
6. **competition/processed_id_recipe1m_map.json**
   - Purpose: Links ingredients to recipes in the Recipe1M dataset.
   - Key: processed_id
   - Example fields: name, original_id, recipe_ids (list of recipe IDs using this ingredient)
   - Example:
     ```json
     "76b8b630": {
       "name": "penne",
       "original_id": "558b7f02",
       "recipe_ids": ["000018c8a5", "006a7c00c4", "00ab15a16a", ...]
     }
     ```
7. **competition/substitution_pairs.json**
   - Purpose: Defines substitution relationships between ingredients.
   - Fields: ingredient, substitution, ingredient_original_id, substitution_original_id, ingredient_processed_id, substitution_processed_id
   - Example:
     ```json
     {
       "ingredient": "arrowroot",
       "substitution": "flour",
       "ingredient_original_id": "7c369462",
       "substitution_original_id": "27450ef2",
       "ingredient_processed_id": "f41030c9",
       "substitution_processed_id": "4cf0bf0f"
     }
     ```
8. **competition/wikidata_data.json**
   - Purpose: Offers additional encyclopedic information about ingredients.
   - Key: processed_id
   - Example fields: type, id, labels, descriptions, claims
   - Example: (Abbreviated for brevity)
     ```json
     "d5a8268e": {
       "type": "item",
       "id": "Q31780235",
       "labels": { "en": {"language": "en", "value": "Abābāi"} },
       "descriptions": { "en": {"language": "en", "value": "mountain in Pakistan"} },
       "claims": { ... }
     }
     ```
9. **competition/conceptnet.json**
   - Purpose: Provides semantic relationships and common-sense knowledge about ingredients.
   - Key: processed_id
   - Example fields: @context, @id, edges (list of related concepts and relationships)
   - Example: (Abbreviated for brevity)
     ```json
     "a5bd8077": {
       "@context": ["http://api.conceptnet.io/ld/conceptnet5.7/context.ld.json"],
       "@id": "/c/en/abalone",
       "edges": [
         {
           "@id": "/a/[/r/Synonym/,/c/fr/haliotis/n/wn/animal/,/c/en/abalone/n/wn/animal/]",
           "rel": {"@id": "/r/Synonym", "label": "Synonym"},
           "start": {"@id": "/c/fr/haliotis/n/wn/animal", "label": "Haliotis"},
           "end": {"@id": "/c/en/abalone/n/wn/animal", "label": "abalone"}
         },
         ...
       ]
     }
     ```

## Connecting the Data

To fully utilize this dataset, you'll need to connect information across multiple files. Here's a step-by-step guide:

1. Start with an ingredient (either original or processed):
   - If starting with an original ingredient, use `original_ingredients_with_id.csv` to find its original_id.
   - Use `original_to_processed_mapping.csv` to find the corresponding processed_id.
2. With the processed_id, you can now access:
   - Nutritional information from `edamam.json`
   - Flavor profiles from `ingredient_to_flavordb.json`
   - Recipe connections from `processed_id_recipe1m_map.json`
   - Encyclopedic information from `wikidata_data.json`
   - Semantic relationships from `conceptnet.json`
3. For substitutions:
   - Use `substitution_pairs.json` to find potential substitutes for your ingredient.
   - Each substitute will have its own processed_id, which you can use to gather its information following steps 1-2.
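
The three steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' code: the helper names (`original_to_processed`, `lookup`, `substitutes`) are hypothetical, and the sample data is copied (abbreviated) from the examples in the file descriptions rather than read from disk. It also assumes that the mapping CSV has a header row with the listed column names, that each JSON source is a single object keyed by processed_id, and that `substitution_pairs.json` is a list of pair objects; check these against the real files before relying on them.

```python
import csv
import io
import json

# Sample row copied from the mapping-file example above.
# Assumption: the real CSV has a header row with these column names.
MAPPING_CSV = (
    "original_id,processed_id,original,processed\n"
    "59e6c716,d5a8268e,ababai,ababai\n"
)

# Assumption: each JSON source is one object keyed by processed_id.
# Entries are abbreviated copies of the examples above.
SOURCES = {
    "wikidata": json.loads('{"d5a8268e": {"type": "item", "id": "Q31780235"}}'),
    "flavordb": json.loads('{"a5bd8077": {"ingredient": "abalone", "flavordb_id": 38}}'),
}

# Assumption: substitution_pairs.json is a list of pair objects.
PAIRS = json.loads("""[
  {"ingredient": "arrowroot", "substitution": "flour",
   "ingredient_processed_id": "f41030c9",
   "substitution_processed_id": "4cf0bf0f"}
]""")


def original_to_processed(csv_text):
    """Step 1: map each original ingredient name to its processed_id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows whose processed form is empty (the mapping file may contain them).
    return {row["original"]: row["processed_id"] for row in reader if row["processed"]}


def lookup(processed_id, sources):
    """Step 2: collect the entry from every source that contains this id."""
    return {name: data[processed_id] for name, data in sources.items()
            if processed_id in data}


def substitutes(processed_id, pairs):
    """Step 3: return (name, processed_id) for each listed substitute."""
    return [(p["substitution"], p["substitution_processed_id"])
            for p in pairs if p["ingredient_processed_id"] == processed_id]


name_to_pid = original_to_processed(MAPPING_CSV)
pid = name_to_pid["ababai"]
print(pid, lookup(pid, SOURCES))
print(substitutes("f41030c9", PAIRS))
```

With the real dataset, the in-memory samples would be replaced by the corresponding files under `competition/`, and each substitute's processed_id can be passed back into `lookup` to gather its own nutritional, flavor, and encyclopedic entries.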

## Complete Data Relationship Diagram

![datasetgraph](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4503546%2F3b7552c87036edd3b3e92b39e20eaf8d%2Fimage.png?generation=1730832146689152&alt=media)

## License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. See the [LICENSE & TERMS OF USE](https://github.com/kanak8278/MISKG/blob/main/LICENSE.md) for details.

Provider:
Kaggle
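Collected Summary
Background and Challenges
Background Overview
This dataset is a Multimodal Ingredient Substitution Knowledge Graph (MISKG) covering 16,077 ingredients and 80,110 substitution pairs. It integrates multi-source semantic, nutritional, and flavor data and supports both text-based and image-based queries. By providing contextual ingredient relationships, nutritional information, and visual representations, it aims to help users adjust recipes according to dietary constraints, health labels, and sensory preferences, filling gaps in existing datasets. The dataset files include ingredient mappings, nutritional data, flavor profiles, and substitution pairs, and are suited to culinary research and digital gastronomy.
The above content was collected and summarized by 遇见数据集.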