
Customs Risk Rule Knowledge Base

Indexed by the National Basic Science Data Center (国家基础学科公共科学数据中心) on 2025-08-30
Download link:
https://nbsdc.cn/general/dataDetail?id=68989a7c195d26317b036f13&type=1
Resource description:
The Customs Risk Rule Knowledge Base is a collection of customs risk knowledge and risk rules mined from internet-collected data sources and customs operational big data. It covers six scenarios: customs security access, tax collection and administration, epidemic prevention and control, bonded R&D supervision, the RCEP trade agreement, and cross-border e-commerce. The raw sources include bulk commodity transaction data, internet transaction data, cross-border e-commerce transaction data, and infectious disease event data. In 2024, in Beijing, web crawlers running on the NeoKylin 7.6 operating system collected fields such as commodity name, price, sales volume, date, disease name, time, location, and risk level. Natural language processing (NLP) was then applied for knowledge mining and rule mining to build the knowledge base, which is intended for research on intelligent risk assessment of import and export trade.
The time range, temporal precision, spatial range, and spatial precision of the collected data differ for each requirement. The data include one-time regulatory data, monthly-updated price data, daily-updated price data, and irregularly updated news data.
The database table contains only two fields, ID and CONTENT. The CONTENT field stores the collected raw data as a JSON string. Once parsed, the JSON structure differs between data sources, but it is strictly identical for all records from the same source.
The database was designed around the business workflows and actual network environment of the sub-project "Customs Risk Rule Knowledge Base Research and Construction", part of the project "Research and Application Demonstration of Customs Tax Collection, Administration and Risk Identification and Prevention Technologies". The data must be read with an ETL (Extract, Transform, Load) tool, configured with a separate parsing template for each distinct JSON structure found in the CONTENT field.
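The two-field layout and the per-structure parsing templates described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual ETL configuration: the JSON key names (`commodity_name`, `price`, `disease_name`, and so on) and the two templates are hypothetical, since the listing does not publish the real schemas. Because the table has no source column, the sketch assumes a template can be selected by checking which top-level keys the parsed CONTENT contains.

```python
import json

# Hypothetical parsing templates. The key names are assumptions made for
# illustration; the real CONTENT schemas are not given in the listing.
def parse_price(doc):
    return {
        "kind": "price",
        "name": doc["commodity_name"],
        "price": float(doc["price"]),
        "date": doc["date"],
    }

def parse_disease_event(doc):
    return {
        "kind": "disease_event",
        "disease": doc["disease_name"],
        "location": doc["location"],
        "risk_level": doc["risk_level"],
    }

# Each template is selected by the set of top-level keys it requires.
TEMPLATES = [
    (frozenset({"commodity_name", "price", "date"}), parse_price),
    (frozenset({"disease_name", "location", "risk_level"}), parse_disease_event),
]

def parse_row(row_id, content):
    """Parse one (ID, CONTENT) row using the first matching template."""
    doc = json.loads(content)
    if not isinstance(doc, dict):
        raise ValueError(f"row {row_id}: CONTENT is not a JSON object")
    keys = set(doc)
    for required, parser in TEMPLATES:
        if required <= keys:  # all required keys are present
            record = parser(doc)
            record["id"] = row_id
            return record
    raise ValueError(f"row {row_id}: no parsing template matches")

# Example rows with made-up values, shaped like the (ID, CONTENT) table.
rows = [
    (1, '{"commodity_name": "iron ore", "price": "102.5", "date": "2024-03-01"}'),
    (2, '{"disease_name": "dengue", "location": "Region A", "risk_level": "high"}'),
]
records = [parse_row(row_id, content) for row_id, content in rows]
```

Selecting the template by key set relies on the stated guarantee that records from the same source share one structure. If two sources happened to use overlapping keys, an explicit source identifier would be needed instead.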
Provider:
National Customs Information Center (全国海关信息中心)
Collected summary
Dataset introduction
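Background and challenges
Background overview
This dataset is a customs risk rule knowledge base covering six customs business scenarios. It was built by mining internet data and customs operational data, and is intended for intelligent risk assessment of import and export trade. The data are stored as JSON strings whose structure varies by source, so an ETL tool is needed to parse them.
The above content was collected and summarized by 遇见数据集.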