oscar-mini
OpenXLab, indexed 2026-04-18
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/oscar-mini
Description:
OSCAR, or Open Super-large Crawled ALMAnaCH coRpus, is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the goclassy architecture. The data is distributed by language in both original and deduplicated form.
Provider:
OpenDataLab
Created:
2024-01-10