theguywhosucks/mocha
Source: Hugging Face. Updated 2025-08-25. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/theguywhosucks/mocha
Description:
Instruct Mocha is a tiny GPT model trained entirely from scratch on a hybrid dataset of general text and optional Q&A pairs. The model generates random but stylized text and is intended for experimentation. It is small enough to train on limited hardware, such as a Colab GPU or a CPU-only MacBook. The dataset contains training text, validation text, and optional test text; the original, unprocessed text files are kept in the raw folder.
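The split layout described above could be read from a local copy of the dataset roughly as follows. This is a minimal sketch, not the author's loader: the file names train.txt, val.txt, and test.txt and the helper load_splits are assumptions made for illustration, since the listing does not name the actual files.

```python
from pathlib import Path
from typing import Dict, Optional


def load_splits(root: str) -> Dict[str, Optional[str]]:
    """Read the text splits from a local copy of the dataset.

    The file names used here (train.txt, val.txt, test.txt) are
    hypothetical; adjust them to match the files in the repository.
    """
    base = Path(root)
    splits: Dict[str, Optional[str]] = {}
    # Training and validation text are described as always present,
    # so a missing file raises FileNotFoundError.
    for name in ("train", "val"):
        splits[name] = (base / f"{name}.txt").read_text(encoding="utf-8")
    # The test split is described as optional, so tolerate its absence.
    test_path = base / "test.txt"
    splits["test"] = (
        test_path.read_text(encoding="utf-8") if test_path.exists() else None
    )
    return splits
```

Calling load_splits on the downloaded folder returns a dictionary keyed by split name, with the test entry set to None when no test file exists.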
Provider:
theguywhosucks



