UQ Single Column Format Inconsistency Datasets
Source: DataCite Commons. Updated: 2026-01-21. Indexed: 2025-04-16.
Download link:
https://espace.library.uq.edu.au/view/UQ:0ab54e7
Description:
There are three datasets: address (dataset_address.csv), contact number (dataset_contact.csv), and date (dataset_date.csv). Our system, "Data-Scanner-4C", generates RegEx for each of the three datasets: address (RegEx_address.txt), contact number (RegEx_contact_number.txt), and date (RegEx_date.txt). The performance of the RegEx is presented in Tables 2, 3, and 4 of our paper. Please refer to the readme file for more information. The datasets and the analysis results relate to our paper: Shaochen Yu, Lei Han, Marta Indulska, Shazia Sadiq, and Gianluca Demartini. Human-in-the-loop Regular Expression Extraction for Single Column Format Inconsistency. In: 32nd ACM International World Wide Web Conference (TheWebConf 2023). Austin, Texas, USA, April 2023. https://doi.org/10.1145/3543507.3583515
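As a rough illustration of how a set of RegEx can expose format inconsistency in a single column, the sketch below assigns each value to the first pattern that fully matches it. It is a minimal example under stated assumptions, not the Data-Scanner-4C method: the function name `group_by_format`, the sample values, and the two patterns are all hypothetical, and the actual layout of the CSV and RegEx files is described only in the readme.

```python
import re
from collections import Counter


def group_by_format(values, patterns):
    """Return, for each value, the index of the first pattern that
    fully matches it, or None when no pattern matches."""
    compiled = [re.compile(p) for p in patterns]
    result = []
    for value in values:
        match_index = next(
            (i for i, rx in enumerate(compiled) if rx.fullmatch(value)),
            None,
        )
        result.append(match_index)
    return result


# Hypothetical column values and patterns (not taken from the datasets).
sample_dates = ["2023-02-08", "08/02/2023", "2023-04-16", "Feb 8, 2023"]
sample_patterns = [r"\d{4}-\d{2}-\d{2}", r"\d{2}/\d{2}/\d{4}"]

assignment = group_by_format(sample_dates, sample_patterns)

# Count how many values fall under each format; None marks unmatched values.
format_counts = Counter(assignment)
```

A column whose values spread across several pattern indices, or include unmatched values, would be a candidate for format inconsistency under this simplified view.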
Provider:
The University of Queensland
Created:
2023-02-08



