Intelligent-Internet/wikipedia_en
Source: Hugging Face. Updated 2025-05-19. Indexed 2025-05-31.
Download link:
https://hf-mirror.com/datasets/Intelligent-Internet/wikipedia_en
Description:
This is a curated dataset of English Wikipedia pages, sourced directly from the official English Wikipedia database dump. The pages are split into smaller chunks and embedded with Snowflake's snowflake-arctic-embed-m-v2.0 model. All embeddings are stored as 16-bit half-precision vectors, optimized for cosine indexing with vectorchord.
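
To illustrate what 16-bit half-precision storage means for cosine comparisons, here is a minimal sketch using only the Python standard library. The toy 4-dimensional vectors and the helper names `to_half` and `cosine_similarity` are assumptions made for illustration; they are not part of the dataset or of vectorchord, and real embeddings have far more dimensions.

```python
import math
import struct


def to_half(values):
    """Round each float to the nearest IEEE 754 half-precision value."""
    fmt = f"<{len(values)}e"  # 'e' is the struct code for a 16-bit float
    return list(struct.unpack(fmt, struct.pack(fmt, *values)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for a query and a text chunk.
query = [0.12, -0.48, 0.33, 0.80]
chunk = [0.10, -0.50, 0.30, 0.82]

full = cosine_similarity(query, chunk)
half = cosine_similarity(to_half(query), to_half(chunk))

# Storage cost per vector: 2 bytes per dimension versus 4 for float32.
half_bytes = len(struct.pack("<4e", *query))
single_bytes = len(struct.pack("<4f", *query))

print(f"full precision: {full:.6f}")
print(f"half precision: {half:.6f}")
print(f"bytes per vector: {half_bytes} (half) vs {single_bytes} (single)")
```

In this sketch the half-precision vectors take half the space of single-precision ones, while the cosine similarity changes only slightly, which is the usual motivation for storing embeddings this way.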
Provided by:
Intelligent-Internet



