Top News: Story URLs and Text from News Feeds of Major National News Sites (2022 to 03/2025)

Source: DataCite Commons. Updated: 2025-05-12. Indexed: 2025-04-15.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/ZNAKK6
Description:
Scripts are available at https://github.com/notnews/top_news. We check the RSS feeds of the major news sites (ABC, CBS, CNN, LA Times, NBC, NPR, NYT, Politico, ProPublica, USA Today, and WaPo), collect the story URLs, and then parse the articles using newspaper3k and some custom scripts.

To combine the usat_html parts into a single archive:

    cat usat_split_* > usat_html_articles_03_25.tar.gz

Related Data
- CNN Transcripts: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ISDPJU
- MSNBC Transcripts: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/UPJDE1
- Fox News Transcripts: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/Q2KIES
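
The collection step in the description (read an RSS feed, gather story URLs, then parse each article) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code from the notnews/top_news repository: the sample feed, the example.com URLs, and the function names story_urls and article_text are all hypothetical. The newspaper3k calls (Article, download, parse, text) are that library's documented basics, and the import is kept inside the function so the RSS part runs without the package installed.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for illustration (not a real news feed).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>https://example.com/</link>
    <item><title>Story A</title><link>https://example.com/a</link></item>
    <item><title>Story B</title><link>https://example.com/b</link></item>
    <item><title>Story A again</title><link>https://example.com/a</link></item>
  </channel>
</rss>"""


def story_urls(feed_xml: str) -> list[str]:
    """Return the unique <item><link> URLs of an RSS 2.0 feed, in feed order."""
    root = ET.fromstring(feed_xml)
    seen = set()
    urls = []
    # Only item links are collected; the channel-level <link> is skipped.
    for link in root.iterfind("./channel/item/link"):
        url = (link.text or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def article_text(url: str) -> str:
    """Download and parse one story with newspaper3k (needs network access).

    Assumption: the dataset's custom scripts do more than this; only the
    basic newspaper3k usage is shown here.
    """
    from newspaper import Article  # pip install newspaper3k

    article = Article(url)
    article.download()
    article.parse()
    return article.text


if __name__ == "__main__":
    print(story_urls(SAMPLE_FEED))
```

In a real run, the feed XML would be fetched from each site's RSS endpoint, and article_text would be called on each URL that story_urls returns.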
Provider:
Harvard Dataverse
Created:
2020-09-02