MongoDB/mongodb-docs-embedded
Source: Hugging Face | Updated: 2025-01-15 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/MongoDB/mongodb-docs-embedded
Description:
This dataset is a small subset of MongoDB's technical documentation that has been split into chunks and embedded. Each record contains the source of the document, its URL, the action taken on the article, the article content in Markdown, the content format, document metadata (such as tags and content type), the document title, the last-updated date, and an embedding of the chunk's content created with the open-source thenlper/gte-small model from Hugging Face. The dataset is intended for prototyping retrieval-augmented generation (RAG) applications and is licensed under cc-by-3.0. It is listed under the question-answering task category, is in English, and is tagged with vector search and retrieval-augmented generation. It contains fewer than 1,000 records.
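As a rough illustration of how the precomputed embeddings could be used in a RAG prototype, the sketch below ranks records by cosine similarity to a query vector. It is not taken from the dataset card: the field names `title`, `body`, and `embedding`, the `train` split, and the helpers `load_records`, `cosine_similarity`, and `top_k` are all assumptions made for illustration, and the three records are toy placeholders with 3-dimensional vectors (gte-small embeddings have 384 dimensions). In practice the query vector would need to come from the same embedding model as the stored chunks.

```python
import math


def load_records():
    """Hypothetical loader; not called below because it needs network access.

    Assumes the Hugging Face `datasets` package is installed and that the
    dataset exposes a "train" split (an assumption, not confirmed here).
    """
    from datasets import load_dataset

    return list(load_dataset("MongoDB/mongodb-docs-embedded", split="train"))


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(records, query_embedding, k=2):
    """Return the k records whose 'embedding' is most similar to the query."""
    scored = [(cosine_similarity(r["embedding"], query_embedding), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:k]]


# Toy placeholder records; field names are assumed, values are made up.
records = [
    {"title": "Toy chunk A", "body": "placeholder text A", "embedding": [1.0, 0.0, 0.0]},
    {"title": "Toy chunk B", "body": "placeholder text B", "embedding": [0.0, 1.0, 0.0]},
    {"title": "Toy chunk C", "body": "placeholder text C", "embedding": [0.7, 0.7, 0.0]},
]

if __name__ == "__main__":
    for record in top_k(records, [0.9, 0.1, 0.0], k=2):
        print(record["title"])
```

With fewer than 1,000 records, a brute-force scan like this is usually adequate for prototyping; a vector index becomes worthwhile only at larger scale.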
Provider:
MongoDB



