
Twitter-LDA

Source: DataCite Commons. Updated: 2021-03-12. Indexed: 2024-07-13.
Download link:
https://researchdata.smu.edu.sg/articles/dataset/Twitter-LDA/12062730
Description:
Latent Dirichlet Allocation (LDA) has been widely used in textual analysis. The original LDA finds hidden "topics" in documents, where a topic is a subject such as "arts" or "education" that the documents discuss. The original LDA setting, in which each word has its own topic label, may not work well with Twitter, because tweets are short and a single tweet is more likely to discuss one topic. Twitter-LDA (T-LDA) was proposed to address this issue. T-LDA also addresses the noisy nature of tweets by capturing background words. Experiments in [7] have shown that T-LDA can capture more meaningful topics than LDA in microblogs.
Related Publication: Zhao, W. X., Jiang, J., Weng, J., He, J., Lim, E. P., Yan, H., & Li, X. (2011). Comparing twitter and traditional media using topic models. In Advances in Information Retrieval (pp. 338-349). http://doi.org/10.1007/978-3-642-20161-5_34
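The two ideas in the description, one topic per tweet and a separate pool of background words, can be illustrated with a toy generative sketch. This is not the released Twitter-LDA code and does not perform inference. The function name `generate_tweet`, the parameter `topic_word_prob`, and the small vocabularies are hypothetical, chosen only to show how a tweet-level topic differs from the word-level topic labels of standard LDA.

```python
import random


def generate_tweet(topic_probs, topic_words, background_words,
                   topic_word_prob=0.7, length=8, rng=None):
    """Sample one toy tweet: a single topic for the whole tweet, then a
    per-word choice between a topic word and a background word."""
    rng = rng or random.Random()
    # One topic label per tweet (standard LDA would draw one per word).
    topic = rng.choices(range(len(topic_probs)), weights=topic_probs, k=1)[0]
    words = []
    for _ in range(length):
        # Per-word switch: a topic word with probability topic_word_prob,
        # otherwise a background (noise) word.
        if rng.random() < topic_word_prob:
            words.append(rng.choice(topic_words[topic]))
        else:
            words.append(rng.choice(background_words))
    return topic, words


# Hypothetical toy vocabularies, invented for illustration only.
TOPIC_WORDS = [["museum", "painting", "gallery"],
               ["school", "teacher", "exam"]]
BACKGROUND_WORDS = ["lol", "today", "rt", "just"]

topic, words = generate_tweet([0.5, 0.5], TOPIC_WORDS, BACKGROUND_WORDS,
                              rng=random.Random(0))
print(topic, " ".join(words))
```

Every non-background word in a generated tweet comes from the same topic, which is the property the description attributes to T-LDA. Fitting such a model to real tweets requires an inference procedure that this sketch does not attempt.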
Provider:
SMU Research Data Repository (RDR)
Created:
2020-04-02