Dataset for Identifying Products in Online Cybercrime Markets

Source: arXiv. Updated: 2017-08-31. Indexed: 2024-06-21.
Download link:
https://evidencebasedsecurity.org/forums/
Overview:
This dataset was created by a research team at The University of Texas at Austin to support identifying the products bought and sold on cybercrime forums. It contains 1,938 posts drawn from four forums, each of which represents a fine-grained domain with its own market sector and characteristics. To build it, the team developed detailed annotation guidelines over six rounds of preliminary annotation and discussion, and the annotations were produced by researchers with backgrounds in natural language processing (NLP) or computer security. Applications include rapid analysis of forum content, identification of product trends, correlation of user activity, and combination with price information to better understand the market. The dataset targets the problem that machine learning models perform poorly on cross-domain data, particularly in the cybercrime domain.
Provider:
The University of Texas at Austin
Created:
2017-08-31