Wanderbricks Dataset (DAIS 2025)
Databricks, indexed 2025-06-21
Download link:
https://marketplace.databricks.com/details/ed6cf259-81e7-4758-94c5-b444f8a5275a/Databricks_Wanderbricks-Dataset-(DAIS-2025)
Resource description:
**Overview**
Welcome to the Wanderbricks dataset, used in the Data + AI Summit 2025 demos. This dataset can be used for any Databricks workload and helps you recreate demos and explore key Databricks Data + AI concepts. All data is synthetic, and any resemblance to real names is coincidental.
**Use cases**
Support for Data Engineering/Data Science/AI use cases, including:
- **Extract/Transform/Load (ETL):** Use the structured data in Wanderbricks to build overview dashboards of amenities, properties, users, employees, etc.
- **Sentiment Analysis:** Use the unstructured data in customer_support_logs to measure the average sentiment of customer support interactions and, from that, assess the effectiveness of customer support agents.
- **RAG development:** Use the reviews data to build a RAG application that answers natural-language questions about customer reviews.
- **Agent development:** Use all the data to build agents that automate customer response and outreach based on sentiment.
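As a rough illustration of the sentiment analysis use case, the sketch below averages a per-message score for each support agent. Everything in it is an assumption made for illustration: the `agent_id` and message fields, the sample messages, and the toy word lists are not part of the dataset, and a real workload would replace `score_message` with a trained sentiment model.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real workload would use a trained sentiment model.
POSITIVE = {"great", "thanks", "helpful", "resolved", "perfect"}
NEGATIVE = {"bad", "slow", "unhelpful", "broken", "refund"}


def score_message(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def average_sentiment_by_agent(interactions):
    """Average the per-message score for each agent.

    `interactions` is an iterable of (agent_id, customer_message) pairs.
    Both field names are assumptions, not the dataset's actual schema.
    """
    totals = defaultdict(lambda: [0, 0])  # agent_id -> [score sum, message count]
    for agent_id, message in interactions:
        totals[agent_id][0] += score_message(message)
        totals[agent_id][1] += 1
    return {agent: total / count for agent, (total, count) in totals.items()}


# Invented sample messages, not taken from customer_support_logs.
sample = [
    ("agent_1", "Thanks, that was great and helpful!"),
    ("agent_1", "Still broken."),
    ("agent_2", "Slow and unhelpful, I want a refund."),
]
averages = average_sentiment_by_agent(sample)
```

Keeping the scoring function separate from the aggregation means the lexicon can be swapped for a model call without touching the per-agent averaging.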
**What data is available?**
Examples of structured data include, but are not limited to:
- **wanderbricks.amenities:** Amenities available in the different properties, including wifi, heating, coffee maker, free parking, etc.
- **wanderbricks.bookings:** Booking data, including which user made the booking, for which property, check-in and check-out dates, total cost, etc.
- **wanderbricks.cities:** A list of cities where properties are located, including city, city_id, country, and a text description.
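To make the ETL use case concrete, here is a minimal sketch of the kind of aggregate a dashboard over these tables might show: bookings, revenue, and nights per city. It runs against an in-memory SQLite database rather than Databricks so that it is self-contained. The column names (`total_cost`, `check_in`, `check_out`, `property_id`), the `properties` table that links a property to a city, and all sample rows are assumptions; check the embedded notebooks for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (city_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
    -- Hypothetical link table: the listing does not say how properties map to cities.
    CREATE TABLE properties (property_id INTEGER PRIMARY KEY, city_id INTEGER);
    CREATE TABLE bookings (
        booking_id INTEGER PRIMARY KEY, user_id INTEGER, property_id INTEGER,
        check_in TEXT, check_out TEXT, total_cost REAL
    );
    INSERT INTO cities VALUES (1, 'Lisbon', 'Portugal'), (2, 'Kyoto', 'Japan');
    INSERT INTO properties VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO bookings VALUES
        (100, 1, 10, '2025-06-01', '2025-06-04', 300.0),
        (101, 2, 11, '2025-06-10', '2025-06-12', 250.0),
        (102, 1, 12, '2025-07-01', '2025-07-08', 700.0);
""")

# Join bookings to cities through properties, then aggregate per city.
rows = conn.execute("""
    SELECT c.city,
           COUNT(*) AS bookings,
           SUM(b.total_cost) AS revenue,
           SUM(julianday(b.check_out) - julianday(b.check_in)) AS nights
    FROM bookings AS b
    JOIN properties AS p ON p.property_id = b.property_id
    JOIN cities AS c ON c.city_id = p.city_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()
conn.close()
```

The query itself is plain SQL apart from the SQLite-specific `julianday` date arithmetic, so the join and grouping should carry over to a Databricks SQL query with a different date function.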
Examples of unstructured data include, but are not limited to:
- **booking_updates:** Any modifications made to bookings, in JSON format.
- **customer_support_logs:** Messages between customers and customer support agents in English, in XML format.
- **reviews:** Property reviews in English, in JSON format.
- **payments:** Payment data, including payment method, amount, and payment date, in CSV format.
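The three file formats listed above can each be read with the Python standard library. The sketch below parses one invented record per format. The tag, attribute, and field names (`sender`, `rating`, `payment_method`, and so on) are assumptions made for illustration and may not match the actual files.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# All three payloads are invented; field and tag names are assumptions.
review_json = '{"property_id": 10, "rating": 4, "text": "Quiet and clean."}'
support_xml = """
<conversation id="c1">
  <message sender="customer">My booking is missing.</message>
  <message sender="agent">I have restored it for you.</message>
</conversation>
"""
payments_csv = (
    "payment_id,payment_method,amount,payment_date\n"
    "1,card,300.00,2025-06-01\n"
)

# JSON: one review record becomes a dict.
review = json.loads(review_json)

# XML: collect (sender, text) pairs from every <message> element.
root = ET.fromstring(support_xml.strip())
messages = [(m.get("sender"), m.text) for m in root.iter("message")]

# CSV: each row becomes a dict keyed by the header line; values stay strings.
payments = list(csv.DictReader(io.StringIO(payments_csv)))
```

Note that `csv.DictReader` returns every value as a string, so amounts and dates need explicit conversion before any arithmetic.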
For more details, refer to the embedded notebooks.
**Get started**
Get started by checking out the example notebooks! For any sample data requests, please contact your Databricks account team.
Provider:
Databricks
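Collected summary
Dataset introduction

Background and challenges
Background overview
The Wanderbricks dataset is a synthetic dataset used in the Databricks Data + AI Summit 2025 demos. It suits a range of AI and data scenarios, including ETL, sentiment analysis, RAG development, and agent building. It contains structured data such as amenities and bookings, as well as unstructured data such as customer support logs and reviews, and helps users explore the capabilities of the Databricks platform.
The above content was collected and summarized by 遇见数据集.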



