
FloraGuard Online Illegal Plant Trade Forum Data, 2006-2019

Source: DataCite Commons | Updated: 2022-01-10 | Indexed: 2025-04-16
Download link:
http://reshare.ukdataservice.ac.uk/id/eprint/854595
Resource description:
The DARPA MEMEX Undercrawler crawling software was used to create a dataset of HTML posts as part of the FloraGuard project, for the purpose of studying the online illegal plant trade. The HTML posts in the dataset contain personal data and may contain evidence of criminality relating to the online illegal trade in plants. In total, nine wildlife-trade-related forums and marketplaces were crawled, yielding 13,697 posts by 4,009 authors in 1,826 forum threads. The posts are dated from 2006 to 2019. The DARPA MEMEX Undercrawler software is available via Related Resources. The dataset also includes processed versions of the raw data: JSON files of extracted text and metadata, and JSON files of clausal text extracted using OpenIE algorithms. This dataset was used to produce the results in the following published work: Middleton, S.E., Lavorgna, A., Neumann, G., Whitehead, D. Information Extraction from the Long Tail: A Socio-Technical AI Approach for Criminology Investigations into the Online Illegal Plant Trade. In Proceedings of the ACM Web Science conference (WebSci 2020). ACM, July 6–10, 2020, Southampton, United Kingdom. 4 pages. https://doi.org/10.1145/3394332.3402838
Provider:
UK Data Service
Created:
2022-01-10