
Monitoring and Early-Warning Data on Violations in Pinduoduo Live-Stream Sales of Agricultural and Veterinary Drugs

Source: Zhejiang Province Data Intellectual Property Registration Platform. Updated 2024-08-22. Indexed 2024-08-23.
Download link:
https://www.zjip.org.cn/home/announce/trends/54181
Resource description:
This dataset is derived from live-stream videos of Pinduoduo influencers selling agricultural and veterinary drugs. The videos are transcribed, and the influencers' spoken content is processed and analyzed. Based on how many times and how often an influencer uses preset prohibited sensitive terms during a broadcast (for example: non-toxic, harmless, 100% safe, residue-free, guaranteed high yield, immediate effect, refund if ineffective, insured by an insurance company, takes effect within 2 hours, professionally certified by an authoritative body, strongly recommended by agricultural experts, money back if it does not work), a warning or enforcement action is proposed according to the trigger rules. The data supports the Zhuji Municipal Market Supervision Administration in regulating the Pinduoduo live-streaming conduct of enterprises within its jurisdiction.

The collected live-stream videos are preprocessed in five steps:
1. Slice each original video file into clips of at most 10 minutes.
2. Convert the video content of each clip to speech audio.
3. Convert the speech audio of each clip to text.
4. Use OCR to extract text from images captured from the original video.
5. Match the resulting text against the violation early-warning keyword database.

Finally, a multi-criteria decision analysis (MCDA) model analyzes the violating statements made by the streamer during the broadcast and computes a violation early-warning score, which determines whether an alert is raised.

Violation early-warning score = (hits on single early-warning keywords × 0.25) + (hits on combined early-warning keywords × 0.3) + (number of combined early-warning keywords detected by image recognition × 0.35) + (number of violation records for the live-stream room in the past month × 0.1)

If the score is ≤ 1, no alert is triggered; if the score is > 1, a violation alert is triggered.
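The scoring formula above can be illustrated with a minimal Python sketch. This is not the publisher's implementation: the names `HitCounts`, `count_single_keyword_hits`, `warning_score`, and `should_alert` are hypothetical, and plain substring counting is an assumption about how keyword hits are tallied. Only the four weights and the threshold of 1 come from the description. The weights are stored as integer hundredths so the comparison at the threshold is not affected by floating-point rounding.

```python
from dataclasses import dataclass

# Weights from the description, in hundredths (0.25, 0.3, 0.35, 0.1).
W_SINGLE, W_COMBINED, W_IMAGE, W_HISTORY = 25, 30, 35, 10
ALERT_THRESHOLD = 100  # a score of 1, in hundredths


@dataclass
class HitCounts:
    """Hypothetical container for the four inputs of the formula."""
    single_keyword_hits: int      # hits on single warning keywords (transcript)
    combined_keyword_hits: int    # hits on combined warning keywords (transcript)
    image_combined_keywords: int  # combined keywords found via image recognition
    history_violations: int       # violation records of the room, past month


def count_single_keyword_hits(transcript: str, keywords: list[str]) -> int:
    """Count non-overlapping substring occurrences of every keyword.

    Assumption: a 'hit' is a plain substring match; the source does not
    specify the matching rule.
    """
    return sum(transcript.count(kw) for kw in keywords)


def score_hundredths(c: HitCounts) -> int:
    """Weighted sum of the four counts, in integer hundredths."""
    return (
        c.single_keyword_hits * W_SINGLE
        + c.combined_keyword_hits * W_COMBINED
        + c.image_combined_keywords * W_IMAGE
        + c.history_violations * W_HISTORY
    )


def warning_score(c: HitCounts) -> float:
    """Violation early-warning score as defined in the description."""
    return score_hundredths(c) / 100


def should_alert(c: HitCounts) -> bool:
    """An alert is raised only when the score is strictly greater than 1."""
    return score_hundredths(c) > ALERT_THRESHOLD


if __name__ == "__main__":
    counts = HitCounts(2, 1, 1, 1)
    print(warning_score(counts), should_alert(counts))
```

For instance, two single-keyword hits, one combined-keyword hit, one combined keyword found in images, and one prior violation give 0.5 + 0.3 + 0.35 + 0.1 = 1.25, which exceeds 1 and triggers an alert. Four single-keyword hits alone give exactly 1, which does not.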
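Provided by:
Zhuji Municipal Market Supervision Administration (诸暨市市场监督管理局); Zhejiang Furun Shulian Technology Co., Ltd. (浙江富润数链科技有限公司)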
Created:
2024-07-22
Collected summary
Dataset introduction
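Features
This dataset is used to monitor violations in Pinduoduo live-stream sales of agricultural and veterinary drugs. It contains 1,464 records. Violation early-warning scores are computed through keyword matching and a scoring algorithm, providing data support for the market supervision administration.
The above content was collected and summarized by 遇见数据集.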