
Anonymized Schema Metadata and Experimental Logs for Privacy-Preserving RAG in Semiconductor Manufacturing

Source: IEEE (indexed 2026-04-17)
Download link:
https://ieee-dataport.org/documents/anonymized-schema-metadata-and-experimental-logs-privacy-preserving-rag-semiconductor
Description:
This dataset supports research on privacy-preserving schema retrieval for Retrieval-Augmented Generation (RAG) in air-gapped environments and under strict data sovereignty constraints. It contains only schema-level metadata and relational topology information, with no row-level instance data. Specifically, the dataset comprises 52 relational tables with more than 1,300 columns and a set of 167 domain-specific natural language queries. Answering these queries typically requires multi-hop reasoning involving joins over 3–6 tables. To protect sensitive industrial information, all identifiers are anonymized using salted MD5 hashing, while foreign-key relationships and structural connectivity are preserved. This design enables the study of topology-aware retrieval and uncertainty-guided optimization in highly restricted industrial settings, without exposing proprietary semantics or violating confidentiality requirements.
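
The description above says that identifiers are hashed while foreign-key connectivity is kept. The following is a minimal sketch of how that could work, not the dataset authors' actual pipeline. The salt value, the choice to prefix the salt, the toy table names, and the helper names `anonymize`, `anonymize_schema`, and `join_path` are all assumptions made for illustration. It hashes each table name with a salted MD5 digest and then checks that a multi-hop join path found by breadth-first search in the plain schema maps onto the same path in the anonymized schema.

```python
import hashlib
from collections import deque


def anonymize(identifier: str, salt: str) -> str:
    """Return the salted MD5 hex digest of an identifier.

    Assumption: the salt is prepended; the dataset does not document this.
    """
    return hashlib.md5((salt + identifier).encode("utf-8")).hexdigest()


def anonymize_schema(foreign_keys, salt):
    """Hash both endpoints of every foreign-key edge, keeping the edge itself."""
    return [(anonymize(a, salt), anonymize(b, salt)) for a, b in foreign_keys]


def join_path(foreign_keys, start, goal):
    """Shortest chain of tables linking start to goal (BFS over undirected FK edges)."""
    graph = {}
    for a, b in foreign_keys:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no join path connects the two tables


# Hypothetical toy schema: (child_table, parent_table) foreign-key edges.
FKS = [("lot", "wafer"), ("wafer", "process_step"), ("process_step", "equipment")]
SALT = "example-salt"  # hypothetical; the real salt is not published

plain_path = join_path(FKS, "lot", "equipment")
anon_path = join_path(
    anonymize_schema(FKS, SALT), anonymize("lot", SALT), anonymize("equipment", SALT)
)
```

Because the same salted hash is applied consistently to every occurrence of a name, the graph structure is unchanged, so a retriever can reason over join paths without seeing the original table names.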
Provided by:
Hung-Chang Hsiao; Kuan-Yu Chen