MCodeScript
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/zcgbnn8zbm
Description:
MCodeScript is a suite of unsupervised and supervised Marathi-English code-mixed and script-mixed datasets, with comments collected from various sources such as social sites, community groups, and news websites. McodeScript-un is an unsupervised gold dataset of 1.3 lakh (130,000) code-mixed and script-mixed comments. McodeScript-un-annot is a version of McodeScript-un transformed by an automated procedure. MCodeScript-MeSent is the supervised Romanized Marathi dataset MeSent, converted into a script-mixed format through our automated process.
Created:
2026-01-14



