infinite-dataset-hub/TextTransformerDataset
Source: Hugging Face. Updated 2024-08-27; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/infinite-dataset-hub/TextTransformerDataset
Resource description:
---
license: mit
tags:
- infinite-dataset-hub
- synthetic
---
# TextTransformerDataset
tags: Natural Language Processing, Text Transformation, Machine Learning
_Note: This is an AI-generated dataset, so its content may be inaccurate or false_
**Dataset Description:** The 'TextTransformerDataset' CSV file contains examples of text transformation tasks often used in Natural Language Processing (NLP) and Machine Learning (ML) applications. The dataset includes raw texts and their transformed counterparts, where each transformation serves as a labeled example for training a model to perform text-to-text tasks such as paraphrasing, summarization, or translation. The 'label' column indicates the type of transformation that has been applied.
**CSV Content Preview:**
```
text,label,transformed_text
"The quick brown fox jumps over the lazy dog","paraphrase","A swift auburn fox leaps above a sluggish canine"
"Artificial intelligence is revolutionizing industries.","summary","AI is transforming various sectors."
"Machine learning algorithms can learn from data.","translation_en_to_fr","Les algorithmes d'apprentissage automatique peuvent apprendre à partir de données."
"Solar energy is a clean power source.","synonym","Sunlight is a renewable energy form."
"Understanding context is key in language models.","explanation","Comprehending the context is crucial for language models."
```
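As a minimal sketch, the preview rows can be parsed with Python's standard `csv` module. The `PREVIEW` string only copies the rows shown in the preview, not the full dataset, and the names `PREVIEW` and `load_rows` are illustrative rather than part of the dataset.

```python
import csv
import io

# Rows copied from the CSV preview on the dataset card; not the full dataset.
PREVIEW = '''text,label,transformed_text
"The quick brown fox jumps over the lazy dog","paraphrase","A swift auburn fox leaps above a sluggish canine"
"Artificial intelligence is revolutionizing industries.","summary","AI is transforming various sectors."
"Machine learning algorithms can learn from data.","translation_en_to_fr","Les algorithmes d'apprentissage automatique peuvent apprendre à partir de données."
"Solar energy is a clean power source.","synonym","Sunlight is a renewable energy form."
"Understanding context is key in language models.","explanation","Comprehending the context is crucial for language models."
'''


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = load_rows(PREVIEW)
labels = [row["label"] for row in rows]
print(labels)
```

Reading the rows through `csv.DictReader` keeps the quoted fields intact, so commas or apostrophes inside a text value do not split the column.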
This CSV provides a varied set of text samples that can be used to train and evaluate NLP models on different text transformation tasks. Each entry carries a label describing the transformation it has undergone, so a text-to-text model can be trained or evaluated per task type.
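One way to use the `label` column for text-to-text training is to prepend it to the source text as a task prefix. The sketch below assumes that convention. The dataset card does not prescribe any prompt format, and `to_text_pair` is a hypothetical helper name.

```python
def to_text_pair(row):
    """Turn one dataset row into an (input, target) pair.

    Assumption: the label is prepended as a task prefix. The dataset card
    does not define a prompt format, so this is only one possible choice.
    """
    source = f"{row['label']}: {row['text']}"
    return source, row["transformed_text"]


# Two rows copied from the CSV preview on the dataset card.
rows = [
    {
        "text": "Artificial intelligence is revolutionizing industries.",
        "label": "summary",
        "transformed_text": "AI is transforming various sectors.",
    },
    {
        "text": "Solar energy is a clean power source.",
        "label": "synonym",
        "transformed_text": "Sunlight is a renewable energy form.",
    },
]

pairs = [to_text_pair(row) for row in rows]
for source, target in pairs:
    print(source, "->", target)
```

Keeping the label in the input lets a single model see which transformation is requested for each example, instead of training one model per label.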
**Source of the data:**
The dataset was generated with the [Infinite Dataset Hub](https://huggingface.co/spaces/infinite-dataset-hub/infinite-dataset-hub) and microsoft/Phi-3-mini-4k-instruct from the query 'text to text':
- **Dataset Generation Page**: https://huggingface.co/spaces/infinite-dataset-hub/infinite-dataset-hub?q=text+to+text&dataset=TextTransformerDataset&tags=Natural+Language+Processing,+Text+Transformation,+Machine+Learning
- **Model**: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
- **More Datasets**: https://huggingface.co/datasets?other=infinite-dataset-hub
Provider:
infinite-dataset-hub



