"SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing"

Name: "SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing"
Creator: IEEE DataPort
Published: 2025-06-30 03:39:58
License: No description available

DataCite Commons; updated 2025-06-30; indexed 2026-05-03

Download link:

https://ieee-dataport.org/documents/sensorbench-benchmarking-llms-coding-based-sensor-processing

Description:

"Effective sensor data processing is critical for cyber-physical and IoT systems but often requires specialized expertise. While Large Language Models (LLMs) and Large Reasoning Models (LRMs) show promise as autonomous copilots for sensor processing, their capabilities remain underexplored. We introduce SensorBench, the first comprehensive benchmark for evaluating LLMs across diverse real-world sensor datasets and tasks. SensorBench evaluates three paradigms for leveraging LLMs in sensing tasks: Coding with API access (CA), Coding without API access (CNA), and Direct Answering (DA). Our analysis reveals that: (1) CA significantly outperforms CNA and DA; (2) LLMs excel at simple tasks but consistently underperform domain experts on compositional tasks requiring parameter tuning and multi-step reasoning; and (3) the reasoning mechanism introduced in LRMs does not yield substantial performance gains. To improve performance, we explore four prompting strategies and fine-tuning approaches (using our newly released sensor-processing corpus). The results show that self-verification prompting is the most effective, outperforming all other methods in 48% of tasks, while fine-tuning yields only marginal gains. Our analysis suggests that more sophisticated interaction frameworks, such as signal-level self-verification, may bridge the gap to human expert-level performance. This benchmark provides a foundation for evaluating and improving LLMs in sensing applications."

Provider:

IEEE DataPort

Created:

2025-06-30
