Replication Data for: Measuring and Answering the Challenge of Spurious Correlations in Big Search Data
Source: DataONE | Updated: 2023-02-17 | Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/sha256:d3447e25e7dff9ba7b9806aca94ccce8c11083f21c61e1cb0831f01a701ca7c6
Resource description:
Big search data offers the opportunity to identify new and potentially real-time measures and predictors of important political, geographic, social, cultural, economic, and epidemiological phenomena, measures that might serve an important role as leading indicators in forecasts and nowcasts. However, it also presents vast new risks that scientists or the public will identify meaningless and totally spurious ‘relationships’ between variables. This study is the first to quantify that risk in the context of search data. We find that spurious correlations arise at exceptionally high frequencies for variables following gamma and spatially autocorrelated distributions, and random walks. Quantifying these spurious correlations and their likely magnitude for various distributions has value for several reasons. First, analysts can make progress towards accurate inference. Second, they can avoid unwarranted credulity. Third, they can demand appropriate disclosure from study authors.
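The random-walk case mentioned in the description can be illustrated with a small simulation. The sketch below is not taken from the replication files and is not the authors' procedure; it is a minimal illustration that assumes Gaussian steps, a series length of 100, 500 simulated pairs, and a cutoff of |r| >= 0.5, all chosen here for demonstration. The function names (`pearson`, `random_walk`, `white_noise`, `share_large`) are hypothetical helpers. It compares how often two independent random walks show a large Pearson correlation with how often two independent white-noise series do.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def random_walk(rng, n):
    """Cumulative sum of n independent standard-normal steps (assumed step law)."""
    pos, walk = 0.0, []
    for _ in range(n):
        pos += rng.gauss(0.0, 1.0)
        walk.append(pos)
    return walk


def white_noise(rng, n):
    """n independent standard-normal draws, used as a baseline."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def share_large(make_series, rng, n_pairs=500, length=100, threshold=0.5):
    """Fraction of independently generated pairs with |r| >= threshold."""
    hits = 0
    for _ in range(n_pairs):
        r = pearson(make_series(rng, length), make_series(rng, length))
        if abs(r) >= threshold:
            hits += 1
    return hits / n_pairs


rng = random.Random(0)  # fixed seed so the run is repeatable
walk_share = share_large(random_walk, rng)
noise_share = share_large(white_noise, rng)
print(f"share with |r| >= 0.5, independent random walks: {walk_share:.3f}")
print(f"share with |r| >= 0.5, independent white noise:  {noise_share:.3f}")
```

Because every pair is generated independently, any large correlation the script reports is spurious by construction. The white-noise baseline is included only to show that the effect comes from the random-walk structure rather than from the sample size.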
Created:
2023-11-08



