Multi Token Completion
Source: AWS (Amazon) open datasets. Indexed 2024-03-07.
Download link:
https://registry.opendata.aws/multi-token-completion
Description:
This dataset provides masked sentences and the multi-token phrases that were masked out of them. Three datasets are offered: a general-purpose dataset extracted from the Wikipedia and Books corpora, and two additional datasets extracted from PubMed abstracts. Note that the PubMed data does not reflect the most current or accurate data available from NLM, since it is not being updated. For these datasets, the columns provided for each data point are as follows:
text - the original sentence
span - the span (phrase) that is masked out
span_lower - the lowercase version of span
r...
Provider:
Amazon
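
The column layout in the description can be illustrated with a short sketch. It assumes that each record is available as a dictionary keyed by the column names text, span, and span_lower, and that span occurs verbatim in text. The record contents, the mask_span helper, and the "[MASK]" placeholder are hypothetical, chosen only for illustration; the dataset's actual file format and mask marker are not stated in this listing.

```python
from typing import Dict

# Assumption: a placeholder mask marker, not necessarily the one the dataset uses.
MASK_TOKEN = "[MASK]"


def mask_span(record: Dict[str, str], mask_token: str = MASK_TOKEN) -> str:
    """Replace the first occurrence of record['span'] in record['text'] with mask_token."""
    text, span = record["text"], record["span"]
    start = text.find(span)
    if start < 0:
        raise ValueError("span does not occur in text")
    return text[:start] + mask_token + text[start + len(span):]


# Hypothetical record, invented for illustration (not taken from the dataset).
example = {
    "text": "The patient was treated with low dose aspirin for two weeks.",
    "span": "low dose aspirin",
    "span_lower": "low dose aspirin",
}

masked = mask_span(example)
print(masked)
```

Only the first occurrence of the span is replaced here. If a phrase appears more than once in a sentence, a character offset would be needed to pick the right one, and the truncated description does not say whether the dataset provides it.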



