infinite-dataset-hub/ContextualEmbeddings

Name: infinite-dataset-hub/ContextualEmbeddings
Creator: infinite-dataset-hub
Published: 2024-08-23 08:36:48
License: No description available

Source: Hugging Face. Updated 2024-08-23; indexed 2025-04-12.

Download link:

https://hf-mirror.com/datasets/infinite-dataset-hub/ContextualEmbeddings

Resource description:

---
license: mit
tags:
- infinite-dataset-hub
- synthetic
---

# ContextualEmbeddings

tags: natural_language_processing, embeddings, contextualization

_Note: This is an AI-generated dataset so its content may be inaccurate or false_

**Dataset Description:**

The 'ContextualEmbeddings' dataset comprises paragraphs of text extracted from diverse sources, each with a label that categorizes the kind of contextual information required. The texts are processed using advanced NLP techniques to create contextualized embeddings, which aid in understanding the semantic relationships within the text. Labels are assigned by type of contextualization ('Historical', 'Cultural', 'Scientific', 'Technological', or 'Literary') to classify the paragraphs effectively.

**CSV Content Preview:**

```csv
id,text,labels
1,"The Enlightenment period was a significant era in human history, characterized by intellectual advancements and a shift towards scientific thought.",Historical
2,"India's rich tapestry of culture is evident in its diverse traditions, languages, and cuisines that have evolved over millennia.",Cultural
3,"The discovery of the DNA double helix by James Watson and Francis Crick in 1953 revolutionized the field of biology and genetics.",Scientific
4,"Advancements in AI and machine learning have transformed industries, creating new technologies and opportunities for innovation.",Technological
5,"F. Scott Fitzgerald's novel 'The Great Gatsby' provides a critical examination of the American Dream during the Roaring Twenties.",Literary
```

Please note that this dataset and CSV content are fictional and created for illustrative purposes based on the keywords provided. The labels are invented and may not represent real categories for contextualization extraction.

**Source of the data:**

The dataset was generated using the [Infinite Dataset Hub](https://huggingface.co/spaces/infinite-dataset-hub/infinite-dataset-hub) and microsoft/Phi-3-mini-4k-instruct using the query 'contextualization extraction from a paragraph':

- **Dataset Generation Page**: https://huggingface.co/spaces/infinite-dataset-hub/infinite-dataset-hub?q=contextualization+extraction+from+a+paragraph&dataset=ContextualEmbeddings&tags=natural_language_processing,+embeddings,+contextualization
- **Model**: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
- **More Datasets**: https://huggingface.co/datasets?other=infinite-dataset-hub
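
As a minimal sketch of how the `id,text,labels` layout in the CSV preview could be read, the Python snippet below parses the five preview rows with the standard library and counts rows per label. The names `CSV_PREVIEW`, `load_rows`, and `label_counts` are hypothetical helpers introduced here for illustration; they are not part of the dataset or of any library. The snippet works on the preview text only and assumes the full dataset follows the same three-column layout.

```python
import csv
import io
from collections import Counter

# The five preview rows from the dataset card, copied verbatim.
# Assumption: the full dataset uses the same three columns.
CSV_PREVIEW = """id,text,labels
1,"The Enlightenment period was a significant era in human history, characterized by intellectual advancements and a shift towards scientific thought.",Historical
2,"India's rich tapestry of culture is evident in its diverse traditions, languages, and cuisines that have evolved over millennia.",Cultural
3,"The discovery of the DNA double helix by James Watson and Francis Crick in 1953 revolutionized the field of biology and genetics.",Scientific
4,"Advancements in AI and machine learning have transformed industries, creating new technologies and opportunities for innovation.",Technological
5,"F. Scott Fitzgerald's novel 'The Great Gatsby' provides a critical examination of the American Dream during the Roaring Twenties.",Literary
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    # csv.DictReader handles the quoted text fields that contain commas.
    return list(csv.DictReader(io.StringIO(csv_text)))


def label_counts(rows):
    """Count how many rows carry each value of the 'labels' column."""
    return Counter(row["labels"] for row in rows)


rows = load_rows(CSV_PREVIEW)
counts = label_counts(rows)

print(f"{len(rows)} rows")
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

Because the `text` column is quoted and contains commas, a CSV-aware parser is used here rather than splitting each line on `,`.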

Provider:

infinite-dataset-hub
