ZEROGEN

Name: ZEROGEN
Creator: Shanghai AI Laboratory
Published: 2022-10-22 09:32:03
License: No description available

Source: arXiv. Updated: 2022-10-22. Indexed: 2024-06-21.

Download link:

https://github.com/HKUNLP/ZeroGen

Resource overview:

ZEROGEN is a dataset developed by the Shanghai AI Laboratory for zero-shot learning research. It is generated entirely by large-scale pre-trained language models (PLMs) and contains approximately 200,000 samples, none of them manually annotated. To create the data, task-specific prompts guide a PLM to generate training examples; a small task-specific model, such as an LSTM, is then trained on the generated examples. ZEROGEN is used mainly in natural language processing and targets zero-shot learning, where a model is evaluated on tasks it has not encountered during training.
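The two-step process described above (prompt a PLM to generate labeled text, then train a small model on it) can be sketched roughly as follows. This is only an illustration under stated assumptions: the prompts are hypothetical and not taken from the ZeroGen repository, `fake_plm_generate` is a stub standing in for a real PLM, and a tiny naive Bayes classifier stands in for the small task model (the overview mentions LSTMs) so that the example runs without external libraries.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical label-conditioned prompts (assumption, not from the ZeroGen repo).
PROMPTS = {
    "positive": 'The movie review in positive sentiment is: "',
    "negative": 'The movie review in negative sentiment is: "',
}

# Canned vocabulary used by the stub generator below (assumption).
_FAKE_VOCAB = {
    "positive": ["great", "wonderful", "moving", "brilliant", "fun"],
    "negative": ["dull", "boring", "awful", "clumsy", "tedious"],
}


def fake_plm_generate(prompt, label, rng):
    """Stub for a PLM. A real pipeline would sample a continuation of
    `prompt` from a pre-trained language model; here the prompt is ignored
    and a short text is assembled from canned words for the label."""
    first, second = rng.sample(_FAKE_VOCAB[label], 2)
    return f"a {first} and {second} film"


def synthesize_dataset(n_per_label, seed=0):
    """Step 1: build a labeled training set with no human annotation."""
    rng = random.Random(seed)
    data = []
    for label, prompt in PROMPTS.items():
        for _ in range(n_per_label):
            data.append((fake_plm_generate(prompt, label, rng), label))
    rng.shuffle(data)
    return data


def train_naive_bayes(data):
    """Step 2: fit a small task model on the synthetic data.
    Naive Bayes is a stand-in here, not the model used by ZeroGen."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in data:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest add-one-smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


dataset = synthesize_dataset(n_per_label=50)
model = train_naive_bayes(dataset)
```

Calling `predict(model, "a brilliant film")` then classifies text the small model never saw during generation. In a realistic setup the stub would be replaced by sampling from an actual PLM, and the classifier by a neural model trained on the generated samples.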

Provider:

Shanghai AI Laboratory

Created:

2022-02-16
