Top News: Story URLs and Text from News Feeds of Major National News Sites (2022 to 03/2025)
Source: DataCite Commons. Updated 2025-05-12. Indexed 2025-04-15.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/ZNAKK6
Description:
Scripts at: <a href = "https://github.com/notnews/top_news">https://github.com/notnews/top_news</a>. We monitor the RSS feeds of major national news sites (ABC, CBS, CNN, LA Times, NBC, NPR, NYT, Politico, ProPublica, USA Today, and WaPo), collect the story URLs they list, and then parse the article data using newspaper3k and some custom scripts.
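The collection step described above can be sketched roughly as follows. This is an illustrative sketch, not the code from the linked repository: the function names `extract_story_urls` and `fetch_article_text`, the sample feed, and the assumption that feeds are plain RSS 2.0 (with `<item>` and `<link>` elements) are all hypothetical. The second function uses the newspaper3k `Article` class and needs that package installed plus network access, so it is defined but not called here.

```python
import xml.etree.ElementTree as ET


def extract_story_urls(rss_xml: str) -> list[str]:
    """Return the <link> text of every <item> in an RSS 2.0 document.

    Assumes plain RSS 2.0; Atom feeds use different element names.
    """
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            urls.append(link.strip())
    return urls


def fetch_article_text(url: str) -> dict:
    """Download and parse one story with newspaper3k (hypothetical helper)."""
    # Imported lazily so the RSS helper above works without newspaper3k.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return {"url": url, "title": article.title, "text": article.text}


# A made-up feed, used only to demonstrate the URL extraction.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<item><title>A</title><link>https://example.com/a</link></item>
<item><title>B</title><link> https://example.com/b </link></item>
</channel></rss>"""

print(extract_story_urls(SAMPLE_FEED))
# → ['https://example.com/a', 'https://example.com/b']
```

In a real run, each extracted URL would be passed to something like `fetch_article_text`, and the resulting records written to disk.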
To combine the split usat_html files into a single archive, run:
<br>
<code>
cat usat_split_* > usat_html_articles_03_25.tar.gz</code>
<br>
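On systems without `cat`, the same concatenation can be done in Python. The sketch below is an assumed equivalent, not part of the dataset's scripts; the function name `combine_parts` and its parameters are hypothetical. It joins the parts in sorted filename order, which should match the order in which the shell expands `usat_split_*`.

```python
import shutil
from pathlib import Path


def combine_parts(directory, pattern="usat_split_*",
                  output="usat_html_articles_03_25.tar.gz"):
    """Concatenate all files matching `pattern` into `output`, byte for byte."""
    parts = sorted(Path(directory).glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    out_path = Path(directory) / output
    with open(out_path, "wb") as out:
        for part in parts:
            # Stream each part so large archives are not loaded into memory.
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_path
```

Calling `combine_parts(".")` in the directory holding the downloaded parts would write `usat_html_articles_03_25.tar.gz` next to them.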
<b>Related Data</b>
<ul>
<li><a href = "https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ISDPJU">CNN Transcripts</a>
<li><a href = "https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/UPJDE1">MSNBC Transcripts</a>
<li><a href = "https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/Q2KIES">Fox News Transcripts</a>
</ul>
Provider:
Harvard Dataverse
Created:
2020-09-02



