
reglab/glove-v

Source: Hugging Face. Updated: 2025-12-21. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/reglab/glove-v
Description:
The GloVe-V dataset contains pre-computed GloVe embeddings and GloVe-V variances for several corpora, intended for natural language tasks and computational social science analysis. The corpora include the Toy Corpus (300-dim) and the Corpus of Historical American English (COHA, 1900-1999, 300-dim). Variances are offered in two versions: complete variances, which are the full GloVe-V variances, and approximated variances, which reduce storage requirements through either a diagonal approximation or a low-rank singular value decomposition (SVD) approximation of the full variance. The dataset was created by Andrea Vallebueno, Cassandra Handan-Nader, Christopher D. Manning, and Daniel E. Ho. The language is English, and the license varies by corpus.
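
To make the storage trade-off between the complete and approximated variances concrete, the following is a minimal sketch of a diagonal approximation and a low-rank SVD approximation applied to a single 300 x 300 variance matrix. It is an illustration under stated assumptions, not the dataset's file layout or the authors' procedure: the matrix is synthetic, the rank of 10 is arbitrary, and the function names `diagonal_approx` and `low_rank_approx` are hypothetical.

```python
import numpy as np


def diagonal_approx(cov):
    """Keep only the per-dimension variances: d numbers instead of d * d."""
    return np.diag(cov).copy()


def low_rank_approx(cov, rank):
    """Keep the top `rank` singular triplets of the matrix."""
    u, s, vt = np.linalg.svd(cov)
    return u[:, :rank], s[:rank], vt[:rank, :]


def reconstruct(u, s, vt):
    """Rebuild a dense matrix from truncated SVD factors."""
    return (u * s) @ vt


rng = np.random.default_rng(0)
d, k = 300, 10  # 300-dim as in the listing; rank 10 is an arbitrary choice

# Synthetic symmetric positive semi-definite matrix standing in for the
# variance of one word vector (assumption: real data would be loaded instead).
a = rng.standard_normal((d, k))
cov = a @ a.T + 1e-3 * np.eye(d)

diag = diagonal_approx(cov)
u, s, vt = low_rank_approx(cov, rank=k)
approx = reconstruct(u, s, vt)

# Number of floats each representation has to store for one word.
full_floats = d * d
diag_floats = diag.size
low_rank_floats = u.size + s.size + vt.size

# Relative Frobenius error of the low-rank reconstruction.
rel_err = np.linalg.norm(cov - approx) / np.linalg.norm(cov)

print(full_floats, diag_floats, low_rank_floats)
print(f"relative error of rank-{k} reconstruction: {rel_err:.2e}")
```

The diagonal form discards all covariance between dimensions, while the low-rank form retains the dominant directions of variation at a cost that grows linearly in the rank. The small reconstruction error here follows from the synthetic matrix being nearly rank 10 by construction; it says nothing about how well either approximation fits the actual GloVe-V variances.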
Provided by:
reglab