
Ambient Conversational Thermal Comfort (ACTC) Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://doi.org/10.7910/DVN/XKTF2L
Description:
This dataset provides a benchmark corpus of 3,750 text messages for Natural Language Processing (NLP) tasks related to occupant thermal comfort in intelligent building environments. It was developed to address the critical gap in publicly available, domain-specific data for training and validating models that can understand ambient, unstructured human feedback about thermal sensations.

The corpus was created to support research in the emerging field of ambient intelligence for building control. Unlike datasets focused on direct commands, this corpus is designed to capture the nuances of passive, conversational feedback, enabling the development of perceptive, human-centric building control systems.

Key Features:

Balanced Three-Class Structure: The dataset is perfectly balanced, containing exactly 1,250 samples for each of the three target classes: Hot, Cold, and Neutral. This structure suits training and evaluating classification models without having to correct for class imbalance.

Rich Semantic Diversity: The messages go far beyond simple keywords. The corpus was generated from 30 distinct situational scenarios to elicit a wide spectrum of expressions, including indirect descriptions of physiological symptoms (e.g., "My skin feels so dry"), environmental observations (e.g., "It feels a bit stuffy in here"), and action-oriented requests (e.g., "Can we crack a window?").

Contextual Grounding: The scenario-based generation ensures that the messages are grounded in realistic office contexts, from crowded meeting rooms to drafts from a window, providing a robust testbed for contextual language understanding.

Hybrid Generation Methodology: The corpus was created using a two-stage, human-machine collaborative process. Human writers first authored an initial set of authentic messages for each scenario; these messages were then augmented using a state-of-the-art paraphrasing model to increase linguistic diversity and scale.

Dataset Structure:

The data is provided in a single CSV file (thermal_comfort_corpus.csv) with the following columns:

message: (string) The text of the occupant's feedback.
label: (integer) The ground-truth classification of the message's intent, mapped as follows:
    0: Hot
    1: Cold
    2: Neutral
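The column layout and label mapping described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes the CSV has a header row naming the `message` and `label` columns, and the function names `read_corpus`, `class_counts`, and `load_corpus` are hypothetical helpers introduced here for illustration.

```python
import csv
from collections import Counter

# Label mapping as stated in the dataset description.
LABEL_NAMES = {0: "Hot", 1: "Cold", 2: "Neutral"}


def read_corpus(lines):
    """Parse CSV text into (message, class_name) pairs.

    `lines` is any iterable of text lines, such as an open file.
    Assumption: the first row is a header with `message` and `label`.
    """
    rows = []
    for row in csv.DictReader(lines):
        label = int(row["label"])
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label {label!r}")
        rows.append((row["message"], LABEL_NAMES[label]))
    return rows


def class_counts(rows):
    """Count how many messages fall into each class name."""
    return Counter(name for _, name in rows)


def load_corpus(path="thermal_comfort_corpus.csv"):
    """Read the corpus from a local copy of the file (path is an assumption)."""
    with open(path, newline="", encoding="utf-8") as f:
        return read_corpus(f)
```

With a local copy of the file, `class_counts(load_corpus())` should report 1,250 messages per class if the file matches the description; a different result would suggest the header or encoding assumptions do not hold.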
Created:
2025-11-18