KomeijiForce/Cuckoo_MetaIE_MultiNTE_Fineweb
Source: Hugging Face | Updated: 2025-03-13 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/KomeijiForce/Cuckoo_MetaIE_MultiNTE_Fineweb
Description:
The Cuckoo dataset is a text corpus for training information extraction (IE) models. Its examples have been converted into next-token extraction (NTE) instances. The data is used to train the Cuckoo model, which mimics the next-token prediction paradigm of large language models: it predicts the next tokens by tagging them in the given input context.
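To make the idea of "predicting by tagging" concrete, the following is a minimal sketch of how a single NTE-style instance could be labeled. It assumes a BIO tagging scheme and whitespace tokenization; the function name `nte_tags` and the example sentence are hypothetical and are not taken from the dataset, whose actual schema is not described here.

```python
from typing import List


def nte_tags(context: List[str], next_tokens: List[str]) -> List[str]:
    """Label each context token with B/I/O (assumed scheme).

    Every occurrence of `next_tokens` inside `context` is marked with
    "B" on its first token and "I" on the rest; all other tokens get "O".
    """
    tags = ["O"] * len(context)
    n = len(next_tokens)
    if n == 0:
        return tags
    for i in range(len(context) - n + 1):
        if context[i:i + n] == next_tokens:
            tags[i] = "B"
            for j in range(i + 1, i + n):
                tags[j] = "I"
    return tags


# Hypothetical example: the tokens to be "predicted" already appear in the context.
context = "Alice moved to New York in 2019 .".split()
next_tokens = ["New", "York"]
tags = nte_tags(context, next_tokens)

for token, tag in zip(context, tags):
    print(f"{token}\t{tag}")
```

In this sketch the model's target is a tag sequence over the input rather than a generated string, which is the sense in which tagging stands in for next-token prediction.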
Provided by:
KomeijiForce



