agentlans/sentence-paraphrases
Source: Hugging Face | Updated: 2025-01-19 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/agentlans/sentence-paraphrases
Description:
This dataset is a curated collection of sentence-length paraphrases derived from two primary sources: humarin's ChatGPT-paraphrases and xwjzds's paraphrase_collections. It provides sentence pairs, each consisting of an original text and its paraphrase(s). Each entry includes a text field containing the least readable paraphrase and a paraphrase field containing the most readable paraphrase, with readability assessed by the agentlans/deberta-v3-xsmall-zyda-2-readability model. During processing, duplicate rows and incorrect outputs were removed, and paraphrases were selected based on their readability scores.
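The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: `toy_readability` and `select_pair` are hypothetical names, the scorer is a simple word-length heuristic standing in for the agentlans/deberta-v3-xsmall-zyda-2-readability model, and the convention that a higher score means more readable is an assumption made for this sketch.

```python
from typing import Callable, Dict, List, Optional


def toy_readability(sentence: str) -> float:
    """Stand-in scorer (assumption): shorter average word length = more readable.

    The dataset itself is scored with a DeBERTa-based readability model;
    this heuristic only keeps the sketch self-contained.
    """
    words = sentence.split()
    if not words:
        return 0.0
    return -sum(len(w) for w in words) / len(words)


def select_pair(
    candidates: List[str],
    score: Callable[[str], float] = toy_readability,
) -> Optional[Dict[str, str]]:
    """Return {"text": least readable, "paraphrase": most readable}, or None."""
    # Drop empty strings and exact duplicates while preserving order.
    unique = list(dict.fromkeys(s.strip() for s in candidates if s.strip()))
    if len(unique) < 2:
        return None  # a pair needs at least two distinct sentences
    # Sort ascending by score: lowest (least readable) first.
    ranked = sorted(unique, key=score)
    return {"text": ranked[0], "paraphrase": ranked[-1]}


# One group of paraphrases of the same sentence, including a duplicate.
group = [
    "The cat sat on the mat.",
    "The feline positioned itself upon the rug.",
    "The cat sat on the mat.",
]
pair = select_pair(group)
print(pair)
```

Passing the scorer in as a parameter means the heuristic can be swapped for a real model-backed function without changing the selection logic.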
Provided by:
agentlans



