JosJenn/nasa-artemis-ii-live-chat-comments
Source: Hugging Face | Updated: 2026-04-07 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/JosJenn/nasa-artemis-ii-live-chat-comments
Resource description:
---
license: cc-by-4.0
language:
- en
annotations_creators:
- no-annotation
language_creators:
- found
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- original
task_categories:
- sequence-modeling
task_ids:
- language-modeling
---
# NASA Artemis II Live Chat Dataset
Public chat messages from the NASA Artemis II lunar flyby broadcast on April 6, 2026, collected simultaneously from YouTube and Twitch during the first crewed lunar mission since Apollo 17.
## Dataset Composition
| Platform | Authors | Messages | Broadcast Duration |
|----------|---------|----------|-------------------|
| YouTube | 118,872 | 148,712 | ~4.5 hours |
| Twitch | 39,364 | 93,782 | ~4.5 hours |
| **Total** | **158,236** | **242,494** | — |
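
As a quick sanity check on the table above, the totals and the average number of messages per author can be recomputed from the per-platform figures. This is only a sketch: the variable names are illustrative, and summing the author counts assumes the two platforms' author sets do not overlap, as the table's own total does.

```python
# Figures copied from the composition table above.
composition = {
    "youtube": {"authors": 118_872, "messages": 148_712},
    "twitch": {"authors": 39_364, "messages": 93_782},
}

# Recompute the "Total" row.
total_authors = sum(p["authors"] for p in composition.values())
total_messages = sum(p["messages"] for p in composition.values())

# Average messages per author: a rough measure of how active each chat was.
messages_per_author = {
    name: round(p["messages"] / p["authors"], 2)
    for name, p in composition.items()
}

print(total_authors, total_messages)
print(messages_per_author)
```

On these figures, Twitch authors posted roughly twice as many messages each as YouTube authors did.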
## Dataset Structure
This is a **unified** dataset containing all comments from both platforms. Use the `platform` column to filter by source.
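
A minimal sketch of filtering on the `platform` column, using plain Python dictionaries so it runs without any download. The sample rows and the `filter_by_platform` helper are hypothetical; only the column name and its two values (`youtube`, `twitch`) come from this card.

```python
from collections import Counter

# Hypothetical rows shaped like the dataset's schema (only the fields used here).
sample_rows = [
    {"id": 1, "platform": "youtube", "message": "Go Artemis!"},
    {"id": 2, "platform": "twitch", "message": "what a view"},
    {"id": 3, "platform": "youtube", "message": "Hello from Earth"},
]

def filter_by_platform(rows, platform):
    """Return only the rows whose `platform` column matches."""
    if platform not in ("youtube", "twitch"):
        raise ValueError(f"unknown platform: {platform!r}")
    return [row for row in rows if row["platform"] == platform]

youtube_rows = filter_by_platform(sample_rows, "youtube")

# Per-platform message counts for the same rows.
counts = Counter(row["platform"] for row in sample_rows)
```

The same predicate (`row["platform"] == "youtube"`) should carry over to whichever dataframe or dataset library is used to load the real file.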
## Data Schema
| Field | Type | Description |
|-------|------|-------------|
| `id` | integer | Unique message identifier |
| `platform_msg_id` | string | Platform-specific message ID |
| `channel_id` | string | Anonymized channel identifier |
| `author_id` | string | Anonymized author identifier |
| `published_at` | string | UTC timestamp (ISO 8601) |
| `msg_type` | string | Message type classification |
| `message` | string | Raw message text (multilingual, emojis) |
| `inserted_at` | string | Timestamp at which the message was recorded |
| `platform` | string | Source platform (`youtube` or `twitch`) |
| `category` | string | Message category (YouTube only, null for Twitch) |
| `tags` | string | Additional classification tags (YouTube only, null for Twitch) |
| `sentiment` | string | Sentiment label (YouTube only, null for Twitch) |
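
Because timestamps are stored as strings and several fields are null for Twitch, it can help to parse rows into a typed record before analysis. The sketch below assumes `published_at` looks like `2026-04-06T20:15:30Z` or carries an explicit offset; the card only says "ISO 8601", so the exact format is an assumption. `ChatMessage` and `parse_row` are hypothetical names, and only a subset of the schema is modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ChatMessage:
    id: int
    platform: str
    author_id: str
    published_at: datetime
    message: str
    sentiment: Optional[str]  # None for Twitch rows (YouTube-only field)

def parse_row(row: dict) -> ChatMessage:
    # Assumption: ISO 8601 with a trailing "Z" or an explicit offset.
    # Replacing "Z" keeps datetime.fromisoformat working before Python 3.11.
    ts = datetime.fromisoformat(row["published_at"].replace("Z", "+00:00"))
    if ts.tzinfo is None:
        # The schema describes the timestamp as UTC, so attach that zone.
        ts = ts.replace(tzinfo=timezone.utc)
    return ChatMessage(
        id=int(row["id"]),
        platform=row["platform"],
        author_id=row["author_id"],
        published_at=ts,
        message=row["message"],
        sentiment=row.get("sentiment"),
    )

# Hypothetical Twitch row: the YouTube-only fields are null.
example = parse_row({
    "id": 42,
    "platform": "twitch",
    "author_id": "a3f1",
    "published_at": "2026-04-06T20:15:30Z",
    "message": "hello",
    "sentiment": None,
})
```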
## Anonymization
All usernames and channel identifiers have been anonymized using SHA-256 hashing with a dataset-specific salt:
- **One-way transformation**: Original identifiers cannot be reconstructed
- **Consistent mapping**: Same user gets the same anonymous ID across all messages
- **Content preserved**: All message text and metadata remain intact
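
The properties listed above can be illustrated with a small sketch of salted SHA-256 hashing. The card does not state how the salt is combined with the identifier, how the input is encoded, or whether the digest is truncated, so the construction below (salt prepended, UTF-8 encoding, full hex digest) is an assumption. The salt value and the `anonymize` helper are hypothetical.

```python
import hashlib

def anonymize(identifier: str, salt: bytes) -> str:
    """Return the salted SHA-256 digest of an identifier as a hex string."""
    return hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()

# Hypothetical salt; the real dataset-specific salt is not published.
SALT = b"example-dataset-salt"

first = anonymize("viewer_one", SALT)
again = anonymize("viewer_one", SALT)   # same user, same anonymous ID
other = anonymize("viewer_two", SALT)   # different user, different ID
```

Keeping the salt private is what prevents someone from hashing a list of known usernames and matching them against `author_id` values.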
## Research Applications
- **Language model training**: Real-world, diverse chat interactions
- **Sentiment analysis research**: Cross-platform comparison during live events (the provided `sentiment` labels cover YouTube only, so Twitch messages need to be labeled separately)
- **Social dynamics study**: Audience response to space mission milestones
- **Community behavior research**: Online engagement patterns during historic events
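
For the milestone and engagement studies listed above, a common first step is to bucket messages by minute and look for spikes. The following is a minimal sketch using only the standard library; `messages_per_minute` is a hypothetical helper and the timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime

def messages_per_minute(timestamps):
    """Count messages in each one-minute bucket, keyed by the bucket start."""
    buckets = Counter()
    for ts in timestamps:
        # Truncate to the start of the minute so all messages in it share a key.
        buckets[ts.replace(second=0, microsecond=0)] += 1
    return dict(sorted(buckets.items()))

# Hypothetical timestamps standing in for parsed `published_at` values.
stamps = [
    datetime(2026, 4, 6, 20, 0, 5),
    datetime(2026, 4, 6, 20, 0, 40),
    datetime(2026, 4, 6, 20, 1, 10),
]
per_minute = messages_per_minute(stamps)
```

Running the same function separately on YouTube and Twitch rows gives two series that can be compared around a given event.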
## Dataset Source
- **Source**: NASA Artemis II official broadcast
- **Date**: April 6, 2026
- **Method**: Real-time scraping from public YouTube and Twitch livestreams
- **Coverage**: Full broadcast duration (~20:00 UTC April 6 – 00:30 UTC April 7)
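
The coverage window above can be turned into a simple check for whether a message timestamp falls inside the broadcast. The boundaries are taken from the coverage line, which marks them as approximate, so treat the constants as a sketch; `in_coverage` is a hypothetical helper.

```python
from datetime import datetime, timezone

# Approximate window from the coverage line above (the source marks it with "~").
COVERAGE_START = datetime(2026, 4, 6, 20, 0, tzinfo=timezone.utc)
COVERAGE_END = datetime(2026, 4, 7, 0, 30, tzinfo=timezone.utc)

# Length of the window in hours; matches the ~4.5 hours in the composition table.
duration_hours = (COVERAGE_END - COVERAGE_START).total_seconds() / 3600

def in_coverage(ts: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside the window."""
    return COVERAGE_START <= ts <= COVERAGE_END
```

Note that the window crosses midnight UTC, so filtering on the date alone (April 6) would drop the final half hour.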
## License
CC BY 4.0 — Attribution required when using this data.
## Citation
```
NASA Artemis II Live Chat Dataset
Collected from YouTube and Twitch live broadcasts
April 6, 2026
Anonymized for public release
```
Provider:
JosJenn



