CohereLabs/fineweb-edu-10B-index
Hugging Face | Updated: 2026-03-25 | Indexed: 2026-05-10
Download link:
https://hf-mirror.com/datasets/CohereLabs/fineweb-edu-10B-index
Description:
---
license: cc-by-nc-sa-4.0
---
This dataset contains the embeddings for the 10B variant of [fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu), embedded with the Cohere Embed V3 model.
You can search this dataset with just 500 MB of memory using [DiskVectorIndex](https://github.com/cohere-ai/DiskVectorIndex).
# Installation & Usage
Get your free **Cohere API key** from [cohere.com](https://cohere.com). You must set this API key as an environment variable:
```
export COHERE_API_KEY=your_api_key
```
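Because the key must be present in the environment, it can help to check for it before creating the index. The following is a minimal sketch of such a check; the helper name `require_cohere_api_key` and its `env` parameter are hypothetical and not part of DiskVectorIndex or the Cohere SDK.

```python
import os


def require_cohere_api_key(env=None):
    """Return the Cohere API key from the environment, or raise a clear error.

    `env` is a hypothetical parameter (any dict-like mapping) that defaults to
    os.environ; it exists only so the check can be exercised without touching
    the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("COHERE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "COHERE_API_KEY is not set; export it before creating the index."
        )
    return key
```

Calling `require_cohere_api_key()` at the top of a script fails early with a readable message rather than at the first search request.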
Install the package:
```
pip install DiskVectorIndex
```
You can then search via:
```python
from DiskVectorIndex import DiskVectorIndex

index = DiskVectorIndex("CohereLabs/fineweb-edu-10B-index")

# Read a question, retrieve the top 3 matches, and print each one.
# The loop runs until interrupted (Ctrl+C).
while True:
    query = input("\n\nEnter a question: ")
    docs = index.search(query, top_k=3)
    for doc in docs:
        print(doc)
        print("=========")
```
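The loop above is interactive. For scripted use, the same `search(query, top_k=...)` call can be wrapped in a small batch helper. This is only a sketch: `search_many` is a hypothetical name, and it assumes nothing about the index beyond the `search` method shown above, so it accepts any object that provides it.

```python
def search_many(index, queries, top_k=3):
    """Run each query against `index` and collect the results per query.

    `index` is assumed to expose search(query, top_k=...), as in the
    interactive example. Blank queries are skipped.
    """
    results = {}
    for query in queries:
        query = query.strip()
        if not query:
            continue  # nothing to search for
        results[query] = list(index.search(query, top_k=top_k))
    return results


# Example usage (requires DiskVectorIndex and COHERE_API_KEY):
#   from DiskVectorIndex import DiskVectorIndex
#   index = DiskVectorIndex("CohereLabs/fineweb-edu-10B-index")
#   hits = search_many(index, ["What causes tides?"], top_k=3)
```

Returning a dictionary keyed by query keeps each result list tied to the question that produced it, which is convenient when the output is written to a file or compared across runs.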
# License
Please observe the license of [fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu). The license displayed here applies only to the embeddings.
Provider:
CohereLabs



