
"SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing"

Source: DataCite Commons. Updated: 2025-06-30. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/sensorbench-benchmarking-llms-coding-based-sensor-processing
Description:
"Effective sensor data processing is critical for cyber-physical and IoT systems but often requires specialized expertise. While Large Language Models (LLMs) and Large Reasoning Models (LRMs) show promise as autonomous copilots for sensor processing, their capabilities remain underexplored. We introduce SensorBench, the first comprehensive benchmark for evaluating LLMs across diverse real-world sensor datasets and tasks. SensorBench evaluates three paradigms for leveraging LLMs in sensing tasks: Coding with API access (CA), Coding without API access (CNA), and Direct Answering (DA). Our analysis reveals that: (1) CA significantly outperforms CNA and DA; (2) LLMs excel at simple tasks but consistently underperform domain experts on compositional tasks requiring parameter tuning and multi-step reasoning. (3) The reasoning mechanism introduced in LRMs does not yield substantial performance gains.To improve the performance, we explore four prompting strategies and fine-tuning approaches (using our newly released sensor-processing corpus). The results show that self-verification prompting proves most effective, outperforming other methods simultaneously in 48\\% of tasks, while fine-tuning yields marginal gains. Our analysis suggests that more sophisticated interaction frameworks, such as signal-level self-verification, may bridge the gap to human expert-level performance. This benchmark provides a foundation for evaluating and improving LLMs in sensing applications"
Provider:
IEEE DataPort
Created:
2025-06-30