Replication Data for: Hindi-English code-mixed Twitter dataset
Source: DataONE | Updated: 2023-04-19 | Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/sha256:d5e03de36e3611b4cbd2b2213675e3f7103f034ea1c0657956e0aeb80a72e0d2
Resource description:
This directory contains a large-scale Hindi-English code-mixed corpus collected from Twitter between 2010 and 2022. To anonymize the dataset, we have removed identifiers and anonymized the tweet author IDs. Additionally, we have calculated the code-mixing index (CMI) and identified the language of each text (Hindi, English, or Hindi-English code-mixed).
Created:
2024-02-07
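
The description mentions a code-mixing index (CMI) and a per-text language label but does not say how either was computed. As a rough illustration only, the sketch below uses the commonly cited utterance-level CMI definition, 100 * (1 - max_lang_tokens / (n - u)), where n is the token count and u is the number of language-neutral tokens. The tag names `"hi"`, `"en"`, and `"other"`, and both function names, are assumptions made for this example and are not taken from the dataset.

```python
from collections import Counter

def code_mixing_index(tags, neutral="other"):
    """Utterance-level CMI from per-token language tags (hypothetical tag names).

    Returns 0.0 for monolingual text and for text with no language-tagged tokens.
    """
    n = len(tags)
    # Count only tokens that carry a language tag.
    lang_counts = Counter(t for t in tags if t != neutral)
    u = n - sum(lang_counts.values())  # language-neutral tokens
    if n == u:
        return 0.0
    return 100.0 * (1 - max(lang_counts.values()) / (n - u))

def label_language(tags, neutral="other"):
    """Coarse label in the three categories named in the description (assumed rule)."""
    langs = {t for t in tags if t != neutral}
    if langs == {"hi"}:
        return "Hindi"
    if langs == {"en"}:
        return "English"
    if langs == {"hi", "en"}:
        return "Hindi-English code-mixed"
    return None  # no language-tagged tokens, or tags outside this sketch

if __name__ == "__main__":
    example = ["hi", "hi", "en", "en", "other"]
    print(code_mixing_index(example), label_language(example))
```

Per-token language tags would have to come from a separate language-identification step, which is out of scope here; the values stored in the dataset may follow a different CMI variant.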



