Vano04/laions-got-talent-enhanced-precomputed-en
Source: Hugging Face. Updated 2025-12-09; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Vano04/laions-got-talent-enhanced-precomputed-en
Resource description:
---
license: apache-2.0
language:
- en
pretty_name: LAION's Got Talent Enhanced Precomputed English
---
# LAION's Got Talent Enhanced Precomputed English
This dataset contains precomputed embeddings for the English split of the [LAION's Got Talent Enhanced dataset](https://huggingface.co/datasets/laion/laions_got_talent_enhanced_flash_annotations_and_long_captions), at 16 kHz.
The audio was preprocessed with [TuKoResearch/AuriStream100M_RoPE_librilight](https://huggingface.co/TuKoResearch/AuriStream100M_RoPE_librilight) and the text transcriptions were preprocessed with [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m).
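Because this card does not list the column names or embedding shapes, it can help to inspect one row before building a pipeline. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` library is installed and that the split is named `train` (check the dataset page). The helpers `nested_shape`, `describe_row`, and `peek_first_row` are hypothetical names introduced here for illustration.

```python
from typing import Any, Dict, Tuple

REPO_ID = "Vano04/laions-got-talent-enhanced-precomputed-en"


def nested_shape(value: Any) -> Tuple[int, ...]:
    """Return the shape of a (possibly nested) list, following first elements."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def describe_row(row: Dict[str, Any]) -> Dict[str, str]:
    """Map each column name to a short type/shape summary."""
    summary = {}
    for name, value in row.items():
        if isinstance(value, (list, tuple)):
            summary[name] = f"list{nested_shape(value)}"
        else:
            summary[name] = type(value).__name__
    return summary


def peek_first_row(split: str = "train") -> Dict[str, str]:
    """Stream one row from the Hub and summarize it (needs network access)."""
    from datasets import load_dataset  # pip install datasets

    # Assumption: the split is named "train"; this card does not confirm it.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return describe_row(next(iter(stream)))
```

Calling `print(peek_first_row())` streams a single example instead of downloading the full dataset, and prints each column with its type or nested list shape.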
```
@inproceedings{tuckute2025cochleartokens,
title = {Representing Speech Through Autoregressive Prediction of Cochlear Tokens},
author = {Greta Tuckute and Klemen Kotar and Evelina Fedorenko and Daniel Yamins},
booktitle = {Interspeech 2025},
year = {2025},
pages = {2180--2184},
doi = {10.21437/Interspeech.2025-2044},
issn = {2958-1796}
}
@article{embedding_gemma_2025,
title={EmbeddingGemma: Powerful and Lightweight Text Representations},
author={Schechter Vera, Henrique* and Dua, Sahil* and Zhang, Biao and Salz, Daniel and Mullins, Ryan and Raghuram Panyam, Sindhu and Smoot, Sara and Naim, Iftekhar and Zou, Joe and Chen, Feiyang and Cer, Daniel and Lisak, Alice and Choi, Min and Gonzalez, Lucas and Sanseviero, Omar and Cameron, Glenn and Ballantyne, Ian and Black, Kat and Chen, Kaifeng and Wang, Weiyi and Li, Zhe and Martins, Gus and Lee, Jinhyuk and Sherwood, Mark and Ji, Juyeong and Wu, Renjie and Zheng, Jingxiao and Singh, Jyotinder and Sharma, Abheesht and Sreepat, Divya and Jain, Aashi and Elarabawy, Adham and Co, AJ and Doumanoglou, Andreas and Samari, Babak and Hora, Ben and Potetz, Brian and Kim, Dahun and Alfonseca, Enrique and Moiseev, Fedor and Han, Feng and Palma Gomez, Frank and Hernández Ábrego, Gustavo and Zhang, Hesen and Hui, Hui and Han, Jay and Gill, Karan and Chen, Ke and Chen, Koert and Shanbhogue, Madhuri and Boratko, Michael and Suganthan, Paul and Duddu, Sai Meher Karthik and Mariserla, Sandeep and Ariafar, Setareh and Zhang, Shanfeng and Zhang, Shijie and Baumgartner, Simon and Goenka, Sonam and Qiu, Steve and Dabral, Tanmaya and Walker, Trevor and Rao, Vikram and Khawaja, Waleed and Zhou, Wenlei and Ren, Xiaoqi and Xia, Ye and Chen, Yichang and Chen, Yi-Ting and Dong, Zhe and Ding, Zhongli and Visin, Francesco and Liu, Gaël and Zhang, Jiageng and Kenealy, Kathleen and Casbon, Michelle and Kumar, Ravin and Mesnard, Thomas and Gleicher, Zach and Brick, Cormac and Lacombe, Olivier and Roberts, Adam and Sung, Yunhsuan and Hoffmann, Raphael and Warkentin, Tris and Joulin, Armand and Duerig, Tom and Seyedhosseini, Mojtaba},
publisher={Google DeepMind},
year={2025},
url={https://arxiv.org/abs/2509.20354}
}
```
Provider:
Vano04



