
UQ Single Column Format Inconsistency Datasets

Source: DataCite Commons. Updated: 2026-01-21. Indexed: 2025-04-16.
Download link:
https://espace.library.uq.edu.au/view/UQ:0ab54e7
Description:
There are three datasets: address (dataset_address.csv), contact number (dataset_contact.csv), and date (dataset_date.csv). Our system, "Data-Scanner-4C", generates RegEx for each of the three datasets: address (RegEx_address.txt), contact number (RegEx_contact_number.txt), and date (RegEx_date.txt). The performance of the RegEx is presented in Tables 2, 3, and 4 of our paper. Please refer to the readme file for more information. The datasets and the analysis results relate to our paper: Shaochen Yu, Lei Han, Marta Indulska, Shazia Sadiq, and Gianluca Demartini. Human-in-the-loop Regular Expression Extraction for Single Column Format Inconsistency. In: 32nd ACM International World Wide Web Conference (TheWebConf 2023). Austin, Texas, USA, April 2023. https://doi.org/10.1145/3543507.3583515
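As a rough illustration of how a regular expression can flag format inconsistencies within a single column, here is a minimal sketch. It is not the Data-Scanner-4C method and does not read the dataset files. The function name `flag_inconsistent`, the date pattern, and the sample values are all hypothetical, chosen only for illustration.

```python
import re


def flag_inconsistent(values, pattern):
    """Return the values that do not fully match the given pattern."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v) is None]


# Hypothetical pattern and sample values (assumptions, not taken from
# RegEx_date.txt or dataset_date.csv).
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"
sample = ["2023-02-08", "08/02/2023", "2023-04-16", "April 2023"]

# Values that deviate from the dominant YYYY-MM-DD format are flagged.
outliers = flag_inconsistent(sample, DATE_PATTERN)
print(outliers)
```

To try this against the actual files, the pattern would presumably be replaced with one from the RegEx text files and the sample list with a column read from the matching CSV. The readme describes the file layout.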
Provider:
The University of Queensland
Created:
2023-02-08