Tokenized OFAI Million Post Corpus
Source: DataCite Commons | Updated: 2024-08-06 | Indexed: 2025-04-15
Download link:
https://fdat.uni-tuebingen.de/records/jy474-fv283
Description:
This corpus is derived from the Million Post Corpus created by OFAI. It contains the tokenized comments and articles as plain text; comments are not linked to the articles they belong to. The text was tokenized with the SoMaJo tokenizer.
Provider:
University of Tübingen
Created:
2024-08-06
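
Example:
Because the description states that the text is already tokenized, downstream code may not need to run a tokenizer again. The following is a minimal sketch under one assumption that this record does not confirm: that tokens in the plain-text files are separated by whitespace. The function names and the sample lines are hypothetical and only illustrate the idea.

```python
from collections import Counter
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield tokens from pre-tokenized text, one token at a time."""
    for line in lines:
        # Assumption: tokens are whitespace-separated; blank lines yield nothing.
        yield from line.split()


def token_frequencies(lines: Iterable[str]) -> Counter:
    """Count how often each token occurs in the given lines."""
    return Counter(iter_tokens(lines))


# Hypothetical sample standing in for a few lines of a corpus file.
sample = [
    "Das ist ein Test .",
    "",
    "Ist das ein Test ?",
]
freqs = token_frequencies(sample)
print(freqs.most_common(3))
```

To process an actual file, an open text file handle (opened with `encoding="utf-8"`) can be passed in place of `sample`, since the functions accept any iterable of lines.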



