suriyagunasekar/midjourney-text-prompts
Source: Hugging Face · Updated: 2026-03-31 · Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/suriyagunasekar/midjourney-text-prompts
Resource description:
# Midjourney Text Prompts Dataset
This dataset contains extracted and cleaned Midjourney text prompts from Discord message JSON files.
## Dataset Description
- **Source**: Midjourney Discord server message data
- **Task**: Text-to-image prompt extraction
- **Message Types**: Filtered for INITIAL_OR_VARIATION (type 0) and UPSCALE (type 19) messages
## Data Processing
1. Extracted prompts from messages that wrap the prompt in double asterisks (`**prompt**`)
2. Removed embedded image URLs (e.g., `https://s.mj.run/xxx`)
3. Removed duplicate and invalid prompts
4. Split into train/validation/test sets
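The four steps above can be sketched in Python as follows. This is a minimal illustration, not the author's actual pipeline: the message field names (`type`, `content`), the regular expressions, the optional angle brackets around image URLs, the shuffle seed, and the rule that an empty result counts as "invalid" are all assumptions, since the card does not publish its processing code.

```python
import random
import re

# Assumed patterns; the card does not publish its exact cleaning rules.
PROMPT_RE = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)          # text between ** and **
IMAGE_URL_RE = re.compile(r"<?https?://s\.mj\.run/\S+")      # embedded image links
KEPT_TYPES = {0, 19}                                         # message types named in the card


def extract_prompt(message):
    """Return the cleaned prompt from one message dict, or None if unusable."""
    if message.get("type") not in KEPT_TYPES:
        return None
    match = PROMPT_RE.search(message.get("content", ""))
    if match is None:
        return None
    text = IMAGE_URL_RE.sub(" ", match.group(1))
    text = " ".join(text.split())  # collapse whitespace left behind by removed URLs
    return text or None            # assumption: an empty prompt is treated as invalid


def build_splits(messages, seed=0):
    """Deduplicate prompts, shuffle them, and split roughly 80/10/10."""
    seen = set()
    prompts = []
    for message in messages:
        prompt = extract_prompt(message)
        if prompt is not None and prompt not in seen:
            seen.add(prompt)
            prompts.append(prompt)
    random.Random(seed).shuffle(prompts)
    n_train = int(len(prompts) * 0.8)
    n_val = int(len(prompts) * 0.1)
    return {
        "train": prompts[:n_train],
        "validation": prompts[n_train:n_train + n_val],
        "test": prompts[n_train + n_val:],
    }
```

With 67,013 unique prompts, this rounding rule yields 53,610 / 6,701 / 6,702, which is consistent with the split sizes the card reports, although the card does not state how its split was actually computed.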
## Dataset Structure
- `train.jsonl`: 53,610 prompts (80%)
- `validation.jsonl`: 6,701 prompts (10%)
- `test.jsonl`: 6,702 prompts (10%)
## Data Format
Each line is a JSON object with a `prompt` field:
```json
{"prompt": "a beautiful sunset over the ocean, photorealistic, dramatic lighting --ar 16:9"}
```
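Because each line is an independent JSON object, the files can be read with the standard library alone. The sketch below assumes a locally downloaded copy of one of the split files; the helper name `parse_prompts` is illustrative and not part of the dataset.

```python
import json


def parse_prompts(lines):
    """Yield the `prompt` field from each non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)["prompt"]


# Usage with a local copy of train.jsonl (path is an assumption):
# with open("train.jsonl", encoding="utf-8") as handle:
#     prompts = list(parse_prompts(handle))
```

Taking any iterable of strings, rather than a file path, keeps the helper usable on open file handles and in-memory lists alike.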
## Use Cases
This dataset is suitable for:
- Language model fine-tuning on text-to-image prompts
- Prompt generation training
- AI art prompt analysis
## Statistics
- Total unique prompts: 67,013
- Average prompt length: ~119 characters
- Median prompt length: 86 characters
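A mean well above the median suggests a right-skewed length distribution, with a minority of very long prompts pulling the average up. The figures above could be recomputed from the loaded prompts with a sketch like the one below; it assumes lengths are counted in Python string characters, which may differ slightly from how the card's numbers were produced.

```python
from statistics import mean, median


def length_stats(prompts):
    """Summarize prompt counts and character lengths for a list of prompts."""
    lengths = [len(p) for p in prompts]
    return {
        "count": len(lengths),
        "unique": len(set(prompts)),
        "mean_chars": mean(lengths),
        "median_chars": median(lengths),
    }
```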
Provider:
suriyagunasekar



