
Toward Automated Worldwide Monitoring of Network-level Censorship

Source: DataCite Commons. Updated 2020-08-27; indexed 2025-04-16.
Download link:
https://kilthub.cmu.edu/articles/Toward_Automated_Worldwide_Monitoring_of_Network-level_Censorship/7571876
Resource description:
Although Internet censorship is a well-studied topic, to date most published studies have focused on a single aspect of the phenomenon, using methods and sources specific to each researcher. Results are difficult to compare, and global, historical perspectives are rare. Because each group maintains their own software, erroneous methods may continue to be used long after the error has been discovered. Because censors continually update their equipment and blacklists, it may be impossible to reproduce historical results even with the same vantage points and testing software. Because “probe lists” of potentially censored material are labor-intensive to compile, requiring an understanding of the politics and culture of each country studied, researchers discover only the most obvious and long-lasting cases of censorship.

In this dissertation I will show that it is possible to make progress toward addressing all of these problems at once. I will present a proof-of-concept monitoring system designed to operate continuously, in as many different countries as possible, using the best known techniques for detection and analysis. I will also demonstrate improved techniques for verifying the geographic location of a monitoring vantage point; for distinguishing innocuous network problems from censorship and other malicious network interference; and for discovering new web pages that are closely related to known-censored pages. These techniques improve the accuracy of a continuous monitoring system and reduce the manual labor required to operate it.

This research has, in addition, already led to new discoveries. For example, I have confirmed reports that a commonly-used heuristic is too sensitive and will mischaracterize a wide variety of unrelated problems as censorship. I have been able to identify a few cases of political censorship within a much longer list of cases of moralizing censorship. I have expanded small seed groups of politically sensitive documents into larger groups of documents to test for censorship. Finally, I can also detect other forms of network interference with a totalitarian motive, such as injection of surveillance scripts.

In summary, this work demonstrates that mostly-automated measurements of Internet censorship on a worldwide scale are feasible, and that the elusive global and historical perspective is within reach.

Provider:
Carnegie Mellon University
Created:
2019-01-18