Boinko/YouTubeVideoMetadata
Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Boinko/YouTubeVideoMetadata
Description:
---
license: pddl
---
Metadata from the [Scrape Exchange](https://scrape.exchange/) for:
- 40k YouTube channels, including subscriber, view, and video counts, as well as merch, courses, posts, and playlists. The JSON Schema for the YouTube channels is on [GitHub](https://github.com/ScrapeExchange/scrape-python/blob/main/tests/collateral/boinko-youtube-channel-schema.json)
- 9 million YouTube videos, including views, likes, formats, thumbnail URLs, and captions. The JSON Schema for these YouTube videos is on [GitHub](https://github.com/ScrapeExchange/scrape-python/blob/main/tests/collateral/boinko-youtube-video-schema.json)
This is metadata only; no actual videos, images, etc. are included. The dataset consists of a single tar file that contains Brotli-compressed JSON files. The directory structure is:
- \<version\>: The version of the directory layout, currently always v1.
- \<schema\>: The schema that the data under this directory complies with
- \<uploader\>: The username of the person that uploaded the data
- \<creator_id\>: For YouTube, this is the handle of the YouTube channel
- \<content_id\>.json.br: The compressed JSON file containing the data for the video or channel.
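As a rough illustration of the layout above, the following sketch splits a tar member name into its components. It assumes the five levels are nested in the order listed (version/schema/uploader/creator_id/content_id.json.br), which the list does not state explicitly; `EntryPath` and `parse_entry_path` are hypothetical names introduced here, not part of the dataset or the scrape-python repo.

```python
from pathlib import PurePosixPath
from typing import NamedTuple

SUFFIX = ".json.br"


class EntryPath(NamedTuple):
    """Components of one file path (hypothetical helper type)."""
    version: str
    schema: str
    uploader: str
    creator_id: str
    content_id: str


def parse_entry_path(member_name: str) -> EntryPath:
    """Split a tar member name into its directory components.

    Assumption: the levels nest in the order given in the list above.
    Any extra leading directories (or a leading "./") are ignored.
    """
    parts = PurePosixPath(member_name).parts
    if len(parts) < 5 or not parts[-1].endswith(SUFFIX):
        raise ValueError(f"unexpected path layout: {member_name!r}")
    version, schema, uploader, creator_id, filename = parts[-5:]
    # Strip the ".json.br" suffix to recover the content ID.
    content_id = filename[: -len(SUFFIX)]
    return EntryPath(version, schema, uploader, creator_id, content_id)
```

The example path in the call below is made up; real schema and uploader names will differ: `parse_entry_path("v1/some-schema/alice/@somechannel/abc123.json.br")`.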
Each JSON file consists of one object containing envelope data and an item 'data' that stores the actual data about the channel or video as described in the JSONSchema.
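A minimal sketch of reading the files, assuming the tar has been downloaded locally and the third-party `brotli` package (`pip install Brotli`) is installed. `iter_records` is a hypothetical helper name, and the archive path passed to it is whatever you saved the download as; only the `.json.br` suffix and the 'data' key come from the description above.

```python
import json
import tarfile
from typing import Any, Callable, Iterator, Optional, Tuple


def iter_records(
    tar_path: str,
    decompress: Optional[Callable[[bytes], bytes]] = None,
) -> Iterator[Tuple[str, Any]]:
    """Yield (member_name, parsed_json) for every .json.br file in the tar.

    `decompress` defaults to brotli.decompress; it is a parameter only so
    the function can be exercised without the brotli package installed.
    """
    if decompress is None:
        import brotli  # third-party dependency, imported lazily

        decompress = brotli.decompress

    with tarfile.open(tar_path, "r") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".json.br"):
                continue
            fileobj = archive.extractfile(member)
            if fileobj is None:
                continue
            # Each file holds one JSON object: envelope fields plus 'data'.
            yield member.name, json.loads(decompress(fileobj.read()))
```

Each yielded object is the envelope, so the channel or video fields described by the JSON Schema would be reached through `record["data"]`. Iterating the tar as a stream avoids unpacking millions of small files to disk.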
The latest updates to channels and videos are available on [Scrape.Exchange](https://scrape.exchange/browse).
If you want to contribute to this dataset, sign up for your forever-free [Scrape.Exchange](https://scrape.exchange/) account to get your API keys, clone our [GitHub repo with YouTube scrapers](https://github.com/ScrapeExchange/scrape-python), configure the tools through the .env file, and run them as described in the repo's [README.md](https://github.com/ScrapeExchange/scrape-python/blob/main/README.md).
Provider:
Boinko



