Limits of use of social media for monitoring biosecurity events

NIAID Data Ecosystem, indexed 2026-03-10

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.7jd15

Description:

Compared with applications that trigger massive information streams, such as earthquakes and human disease epidemics, the data input for agricultural and environmental biosecurity events (i.e., the introduction of unwanted exotic pests and pathogens) is expected to be sparse and infrequent. To investigate whether Twitter data can be useful for detecting and monitoring biosecurity events, we adopted a three-step process. First, we confirmed that sightings of two migratory species, the Bogong moth (Agrotis infusa) and the Common Koel (Eudynamys scolopaceus), are reported on Twitter. Second, we developed search queries to extract the relevant tweets for these species. The queries were based on the taxonomic name, the common name, or keywords that are frequently used to describe the species (symptomatic or syndromic queries). Third, we validated the results using ground truth data. Our results indicate that the common name queries returned a reasonable number of tweets that were related to the ground truth data. The taxonomic queries produced datasets that were too small, while the symptomatic queries produced large datasets with highly variable signal-to-noise ratios. No clear relationship was observed between the tweets from the symptomatic queries and the ground truth data. Comparing the results for the two species showed that the level of familiarity with the species plays a major role: the more familiar the species, the more stable and reliable the Twitter data. This clearly presents a problem for using social media to detect the arrival of an exotic organism of biosecurity concern with which the public is unfamiliar.
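
The contrast between the three query types can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the keyword lists, the sample tweets, and the helper `summarise` are hypothetical and are not taken from the dataset, and the description does not state how the signal-to-noise ratio was computed, so the sketch only counts matched tweets and how many of them were hand-labelled as relevant.

```python
# Hypothetical keyword sets for the Bogong moth; the study's real queries
# are not reproduced in this description.
QUERIES = {
    "taxonomic": ["agrotis infusa"],
    "common": ["bogong moth"],
    "symptomatic": ["moth", "swarm"],
}

# Invented sample tweets as (text, is_relevant) pairs; the relevance labels
# stand in for manual annotation.
TWEETS = [
    ("Bogong moth invasion at Parliament House again", True),
    ("Agrotis infusa spotted in Canberra tonight", True),
    ("A moth flew into my lamp", False),
    ("Moths ate my jumper", False),
    ("Huge swarm of bogong moths near the lake", True),
]


def summarise(tweets, terms):
    """Return (matched, relevant) counts for tweets containing any term."""
    matched = 0
    relevant = 0
    for text, is_relevant in tweets:
        lowered = text.lower()
        if any(term in lowered for term in terms):
            matched += 1
            if is_relevant:
                relevant += 1
    return matched, relevant


if __name__ == "__main__":
    for name, terms in QUERIES.items():
        matched, relevant = summarise(TWEETS, terms)
        noise = matched - relevant
        print(f"{name}: matched={matched}, relevant={relevant}, noise={noise}")
```

On this toy data the taxonomic query matches a single tweet, the common name query matches only relevant tweets, and the broad keyword query matches the most tweets but half of them are noise, which mirrors the pattern the description reports. Validation against ground truth data would be a separate step, for example comparing weekly relevant-tweet counts with recorded sightings.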

Created:

2018-02-10