The Pile: An 800GB dataset of diverse text for language modeling
Source: DataCite Commons. Updated 2025-01-02; indexed 2025-04-16.
Download link:
https://service.tib.eu/ldmservice/dataset/5789c506-667d-4102-b310-63bea7ac94dc
Description:
The Pile is an 800GB dataset of diverse text for language modeling.
Provider:
TIB
Created:
2025-01-02



