
Babbl Labs YouTube Transcription Database | YouTube Video Transcript Firehose | EN/US | 25K+ ...

Source: Databricks (indexed 2026-04-09)
Download link:
https://marketplace.databricks.com/details/2bdce24d-9dad-4814-8808-a0961c058be1/Babbl_Babbl-Labs-YouTube-Transcription-Database-YouTube-Video-Transcript-Firehose-EN/US-25K+-
Description:
TranscriptDB is the full YouTube transcript firehose, built for systematic hedge funds with in-house NLP capacity and AI labs training foundation models. Smaller buyers should look at Tripwire instead.

What you get:
- 25,000+ pre-filtered market-relevant YouTube channels
- 1M+ videos processed monthly
- Intraday delivery (2-3 hour clip latency)
- Named entities, speaker diarization, sentiment scoring on every transcript
- 5+ years of historical archive available as a separate backfile

Use cases:
- Quantitative alpha generation and signal research
- Foundation model training and fine-tuning corpora
- Long-form NLP backtesting and event studies
- Cross-source entity disambiguation at scale
Provider:
Babbl