
Vano04/laions-got-talent-enhanced-precomputed-en

Source: Hugging Face · Updated 2025-12-09 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/Vano04/laions-got-talent-enhanced-precomputed-en
Description:
---
license: apache-2.0
language:
- en
pretty_name: LAION's Got Talent Enhanced Precomputed English
---

# LAION's Got Talent Enhanced Precomputed English

This dataset contains the precomputed embeddings of the [LAION's got talent enhanced dataset](https://huggingface.co/datasets/laion/laions_got_talent_enhanced_flash_annotations_and_long_captions) English split at 16kHz. The audio was preprocessed with [TuKoResearch/AuriStream100M_RoPE_librilight](https://huggingface.co/TuKoResearch/AuriStream100M_RoPE_librilight) and the text transcriptions were preprocessed with [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m).

```
@inproceedings{tuckute2025cochleartokens,
  title     = {Representing Speech Through Autoregressive Prediction of Cochlear Tokens},
  author    = {Greta Tuckute and Klemen Kotar and Evelina Fedorenko and Daniel Yamins},
  booktitle = {Interspeech 2025},
  year      = {2025},
  pages     = {2180--2184},
  doi       = {10.21437/Interspeech.2025-2044},
  issn      = {2958-1796}
}

@article{embedding_gemma_2025,
  title={EmbeddingGemma: Powerful and Lightweight Text Representations},
  author={Schechter Vera, Henrique* and Dua, Sahil* and Zhang, Biao and Salz, Daniel and Mullins, Ryan and Raghuram Panyam, Sindhu and Smoot, Sara and Naim, Iftekhar and Zou, Joe and Chen, Feiyang and Cer, Daniel and Lisak, Alice and Choi, Min and Gonzalez, Lucas and Sanseviero, Omar and Cameron, Glenn and Ballantyne, Ian and Black, Kat and Chen, Kaifeng and Wang, Weiyi and Li, Zhe and Martins, Gus and Lee, Jinhyuk and Sherwood, Mark and Ji, Juyeong and Wu, Renjie and Zheng, Jingxiao and Singh, Jyotinder and Sharma, Abheesht and Sreepat, Divya and Jain, Aashi and Elarabawy, Adham and Co, AJ and Doumanoglou, Andreas and Samari, Babak and Hora, Ben and Potetz, Brian and Kim, Dahun and Alfonseca, Enrique and Moiseev, Fedor and Han, Feng and Palma Gomez, Frank and Hernández Ábrego, Gustavo and Zhang, Hesen and Hui, Hui and Han, Jay and Gill, Karan and Chen, Ke and Chen, Koert and Shanbhogue, Madhuri and Boratko, Michael and Suganthan, Paul and Duddu, Sai Meher Karthik and Mariserla, Sandeep and Ariafar, Setareh and Zhang, Shanfeng and Zhang, Shijie and Baumgartner, Simon and Goenka, Sonam and Qiu, Steve and Dabral, Tanmaya and Walker, Trevor and Rao, Vikram and Khawaja, Waleed and Zhou, Wenlei and Ren, Xiaoqi and Xia, Ye and Chen, Yichang and Chen, Yi-Ting and Dong, Zhe and Ding, Zhongli and Visin, Francesco and Liu, Gaël and Zhang, Jiageng and Kenealy, Kathleen and Casbon, Michelle and Kumar, Ravin and Mesnard, Thomas and Gleicher, Zach and Brick, Cormac and Lacombe, Olivier and Roberts, Adam and Sung, Yunhsuan and Hoffmann, Raphael and Warkentin, Tris and Joulin, Armand and Duerig, Tom and Seyedhosseini, Mojtaba},
  publisher={Google DeepMind},
  year={2025},
  url={https://arxiv.org/abs/2509.20354}
}
```
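The card does not list the dataset's column names or embedding shapes, so a reasonable first step is to stream one record and inspect it. The following is a minimal sketch under stated assumptions: the `train` split name is a guess, the helpers `nested_shape` and `describe_record` are hypothetical names introduced here, and the loading step assumes the Hugging Face `datasets` library is installed.

```python
from typing import Any, Dict, Tuple

# Repository id taken from the listing above.
REPO_ID = "Vano04/laions-got-talent-enhanced-precomputed-en"


def nested_shape(value: Any) -> Tuple[int, ...]:
    """Return the shape of a nested list, assuming it is rectangular
    (only the first element at each depth is inspected)."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if len(value) == 0:
            break
        value = value[0]
    return tuple(shape)


def describe_record(record: Dict[str, Any]) -> Dict[str, str]:
    """Summarize one record: nested-list shape for sequences,
    Python type name for everything else."""
    summary = {}
    for name, value in record.items():
        if isinstance(value, (list, tuple)):
            summary[name] = f"sequence shape={nested_shape(value)}"
        else:
            summary[name] = type(value).__name__
    return summary


if __name__ == "__main__":
    # Imported here so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    # ASSUMPTION: a "train" split exists; adjust if the repo uses another name.
    ds = load_dataset(REPO_ID, split="train", streaming=True)
    first = next(iter(ds))
    for field, info in describe_record(first).items():
        print(f"{field}: {info}")
```

Streaming avoids downloading the full dataset just to discover its schema; once the field names are known, the audio and text embedding columns can be selected explicitly.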
Provided by:
Vano04