
A Parser for News Downloads

Source: Figshare. Updated 2018-03-01. Indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/A_Parser_for_News_Downloads/6388691
Description:
ABSTRACT This paper presents the Download Parser, a tool for handling text downloads from large online databases. Many universities have access to full-text databases that let users search their holdings, view relevant articles and, ideally, download the full text. In practice, however, managing such downloads is problematic because of duplication, uneven formatting standards and a lack of documentation. The tool was devised to parse downloads, clean and standardise them, identify headlines, and insert suitably marked-up headers for corpus analysis.
Created:
2018-03-01
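
Illustrative sketch:
The abstract names four steps: parse the download, clean and standardise it, remove duplicates, and insert a marked-up header built from the headline. The Python below is a minimal sketch of that kind of pipeline, not the Download Parser itself. Everything specific in it is an assumption made for illustration: the dashed-line separator between articles, the rule that the first non-empty line is the headline, the `<article headline="...">` markup, and the names `ARTICLE_SEPARATOR`, `normalise` and `parse_download`.

```python
import html
import re
from hashlib import sha1
from typing import List

# Assumption: articles in the raw download are separated by a line of
# five or more dashes. Real database exports use their own delimiters.
ARTICLE_SEPARATOR = re.compile(r"^-{5,}[ \t]*$", re.MULTILINE)


def normalise(text: str) -> str:
    """Collapse runs of whitespace, trim each line, and drop blank lines."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def parse_download(raw: str) -> List[str]:
    """Split a download into articles, clean them, skip duplicates,
    and wrap each one in a hypothetical marked-up header."""
    seen = set()
    articles = []
    for chunk in ARTICLE_SEPARATOR.split(raw):
        body = normalise(chunk)
        if not body:
            continue
        # Duplicates are detected on the case-folded, normalised text.
        digest = sha1(body.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        # Assumption: the first non-empty line is the headline.
        headline, _, rest = body.partition("\n")
        safe_headline = html.escape(headline, quote=True)
        articles.append(
            f'<article headline="{safe_headline}">\n{rest}\n</article>'
        )
    return articles
```

Calling `parse_download` on a file's contents returns one marked-up string per unique article. Hashing the normalised text catches copies that differ only in spacing or letter case; near-duplicates with edited wording would need a fuzzier comparison.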