high-quality-midjouney-srefs

Name: high-quality-midjouney-srefs
Creator: maas
Published: 2026-01-06 16:39:36
License: No description provided

ModelScope community, updated 2026-01-06, indexed 2025-07-26

Download link:

https://modelscope.cn/datasets/AI-ModelScope/high-quality-midjouney-srefs

Resource description:

# Midjourney Image Scraper & Dataset Creator

A complete toolkit for scraping Midjourney images, generating captions, and creating HuggingFace datasets with optional automatic upload to HuggingFace Hub.

## 🌟 Features

- **🔍 Web Scraping**: Download images from midjourneysref.com with comprehensive error handling
- **🤖 AI Captioning**: Automatic image captioning using Moondream API with **auto-resume capability**
- **✂️ Smart Cropping**: AI-powered image cropping using OpenAI to optimize aspect ratios
- **🔄 Interrupt-Safe**: Stop and restart any process - it'll pick up where you left off
- **📦 HuggingFace Datasets**: Create professional HF-compatible datasets with metadata
- **☁️ Hub Upload**: Direct upload to HuggingFace Hub with one command
- **📊 Detailed Logging**: Comprehensive feedback and statistics for all operations

## 🚀 Quick Start

### 1. Installation

```bash
# Clone or download this repository
git clone <your-repo-url>
cd style_scraper

# Install all dependencies
pip install -r requirements.txt
```

### 2. Environment Setup

Create a `.env` file in the project root:

```bash
# Required for image captioning
MOONDREAM_API_KEY=your_moondream_api_key_here

# Required for AI-powered cropping
OPENAI_API_KEY=your_openai_api_key_here

# Required for HuggingFace Hub upload
HF_TOKEN=your_huggingface_token_here
```

**Get your tokens:**

- **Moondream API**: https://moondream.ai/api
- **OpenAI API**: https://platform.openai.com/api-keys
- **HuggingFace Hub**: https://huggingface.co/settings/tokens (select "Write" permissions)

### 3. Basic Usage

```bash
# 1. Scrape images
python scraper.py

# 2. Generate captions (optional but recommended)
python caption_images.py

# 3. Crop images with AI analysis (optional)
python crop_dataset.py

# 4. Create HuggingFace dataset
python create_hf_dataset.py --prompts prompts.csv

# 5. Or create and upload in one step
python create_hf_dataset.py --prompts prompts.csv --upload --repo-id your-username/my-dataset
```

## 📖 Detailed Usage

### Step 1: Image Scraping (`scraper.py`)

Downloads images from midjourneysref.com with comprehensive error handling.

```bash
python scraper.py
```

**Features:**

- Downloads images from pages 6-20 by default (configurable in script)
- Pre-flight connectivity testing
- Duplicate detection and skipping
- Detailed error categorization and statistics
- Comprehensive logging to `scraper.log`

**Output:** Images saved to `midjourney_images/` folder

### Step 2: Image Captioning (`caption_images.py`)

Generates captions for uncaptioned images using the Moondream API. **Automatically resumes** from where you left off!

```bash
# Basic usage - automatically resumes from existing captions
python caption_images.py

# First run captions 50 images, then gets interrupted...
# Second run automatically continues with remaining images
python caption_images.py  # No extra flags needed!

# Force recaption all images (ignoring existing captions)
python caption_images.py --recaption

# Custom options
python caption_images.py --images my_images/ --output captions.csv --delay 0.5

# Caption length control
python caption_images.py --prompt-length short   # Concise captions
python caption_images.py --prompt-length normal  # Detailed captions
python caption_images.py --prompt-length mix     # Random mix (default)
```

**Options:**

- `--images, -i`: Input folder (default: `midjourney_images`)
- `--output, -o`: Output CSV file (default: `prompts.csv`)
- `--existing, -e`: Custom existing prompts file (optional, by default uses output file)
- `--recaption, -r`: Force recaption all images, ignoring existing captions
- `--prompt-length, -l`: Caption length - `short`, `normal`, or `mix` (default: `mix`)
- `--delay, -d`: API rate limiting delay (default: 1.0 seconds)
- `--verbose, -v`: Enable debug logging

**Features:**

- **🔄 Auto-resume**: Automatically skips already-captioned images by default
- **🚀 Interrupt-safe**: Can safely stop and restart the process anytime
- Rate limiting to respect API limits
- API connection testing before processing
- Comprehensive error handling and statistics
- Compatible CSV output for dataset creation

**Output:** CSV file with `filename,prompt` columns

### Step 3: AI-Powered Image Cropping (`crop_dataset.py`)

Analyzes images using OpenAI's vision API to determine optimal crop ratios and creates cropped versions.

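To make the idea of cropping to a ratio concrete, here is a minimal sketch of how a crop box for a ratio such as `16:9` could be computed once the ratio has been chosen. It is not the code from `crop_dataset.py`: the function name `center_crop_box` and the choice to center the crop are assumptions made for illustration only.

```python
def center_crop_box(width: int, height: int, ratio: str) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop with the given ratio.

    `ratio` is a string like "16:9". Centering is an assumption of this sketch.
    """
    rw, rh = (int(part) for part in ratio.split(":"))
    # Compare aspect ratios with integer cross-multiplication to avoid float error.
    if width * rh > height * rw:
        # Image is wider than the target: keep full height, trim the sides.
        new_w, new_h = height * rw // rh, height
    else:
        # Image is taller than (or equal to) the target: keep full width, trim top/bottom.
        new_w, new_h = width, width * rh // rw
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


# Hypothetical example: a 1024x1024 image cropped to 16:9.
box = center_crop_box(1024, 1024, "16:9")
```

The resulting 4-tuple uses the same coordinate order that Pillow's `Image.crop()` expects, so it could be passed to it directly.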
```bash
# Basic usage - processes all images in prompts.csv
python crop_dataset.py

# Test with a single image first
python test_crop.py
```

**Features:**

- **🤖 AI Analysis**: Uses OpenAI GPT-4 Vision to analyze each image for optimal cropping
- **📐 Smart Ratios**: Supports 16:9, 9:16, 4:3, 3:4, 1:1, or no cropping
- **🎯 Content-Aware**: Considers image content, composition, and aesthetics
- **📊 Metadata Tracking**: Saves crop ratios and dimensions to updated CSV
- **🔒 Fallback Safe**: Defaults to 1:1 (square) for invalid responses or errors

**Crop Ratios:**

- `16:9` - Wide scenes, landscapes, group shots
- `9:16` - Tall subjects, full-body portraits, buildings
- `4:3` - Balanced framing, single subjects, general scenes
- `3:4` - Portraits, vertical objects, close-ups
- `1:1` - Square format, where entire image is important
- `no` - Keep original aspect ratio (treated as 1:1)

**Output:**

- `cropped_images/` - Folder with all cropped images
- `prompts_with_crops.csv` - Updated CSV with crop metadata
- `crop_dataset.log` - Detailed processing log

**CSV Columns Added:**

- `crop_ratio` - AI-recommended aspect ratio
- `cropped_width` - Width of cropped image in pixels
- `cropped_height` - Height of cropped image in pixels
- `cropped_filename` - Filename of the cropped version

### Step 4: Dataset Creation (`create_hf_dataset.py`)

Creates a professional HuggingFace-compatible dataset with metadata extraction.

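As a rough illustration of the kind of metadata extraction this step performs, the sketch below derives a style reference and an orientation label for one image. It is not taken from `create_hf_dataset.py`: the helper names are hypothetical, and the rule that the `sref` is the part of the filename before the first hyphen is an assumption based on example filenames such as `4160600070-1-d9409ee5.png`.

```python
from pathlib import Path


def extract_sref(filename: str) -> str:
    """Return the assumed style reference: the filename stem up to the first hyphen."""
    return Path(filename).stem.split("-", 1)[0]


def orientation(width: int, height: int) -> str:
    """Classify an image as landscape, portrait, or square from its pixel size."""
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "square"


# Hypothetical record for a single image; the dimensions are made up for the example.
record = {
    "filename": "4160600070-1-d9409ee5.png",
    "sref": extract_sref("4160600070-1-d9409ee5.png"),
    "orientation": orientation(1456, 816),
}
```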
```bash
# Local dataset only
python create_hf_dataset.py --prompts prompts.csv

# Using cropped images and metadata
python create_hf_dataset.py --prompts prompts_with_crops.csv --input cropped_images

# With HuggingFace Hub upload
python create_hf_dataset.py --prompts prompts.csv --upload --repo-id username/dataset-name
```

**Options:**

- `--input, -i`: Input images folder (default: `midjourney_images`)
- `--output, -o`: Output dataset folder (default: `midjourney_hf_dataset`)
- `--name, -n`: Dataset name (default: `midjourney-images`)
- `--prompts, -p`: Path to prompts CSV file
- `--upload, -u`: Upload to HuggingFace Hub
- `--repo-id, -r`: HuggingFace repository ID (e.g., `username/dataset-name`)
- `--verbose, -v`: Enable debug logging

**Features:**

- Extracts comprehensive image metadata (resolution, file size, orientation, etc.)
- Style reference (`sref`) extraction from filenames
- HuggingFace-compatible structure and configuration
- Automatic README generation with usage examples
- Dataset loading scripts for seamless integration
- Optional direct upload to HuggingFace Hub

## 📊 Output Structure

### Dataset Directory Structure

```
midjourney_hf_dataset/
├── images/                      # All image files
├── metadata/
│   ├── metadata.csv             # Main HuggingFace metadata
│   ├── metadata_detailed.json   # Detailed metadata
│   └── dataset_summary.json     # Dataset statistics
├── dataset_config/              # HuggingFace configuration
│   ├── dataset_infos.json
│   └── midjourney-images.py     # Dataset loading script
└── README.md                    # Generated documentation
```

### Metadata Fields

Each image includes comprehensive metadata:

- `filename`: Original image filename
- `file_path`: Relative path within dataset
- `sref`: Style reference ID (extracted from filename)
- `prompt`: AI-generated or provided caption
- `width`: Image width in pixels
- `height`: Image height in pixels
- `file_size_mb`: File size in megabytes
- `size_category`: Resolution category (high/medium/low)
- `orientation`: Image orientation (landscape/portrait/square)

## 🔧 Configuration

### Scraper Configuration

Edit variables in `scraper.py`:

```python
# Folder where you want to save the images
DOWNLOAD_FOLDER = "midjourney_images"

# Page range to scrape
start_page = 6
end_page = 20
```

### Caption Configuration

The captioning script supports rate limiting, automatic resuming, and caption length control:

```bash
# Caption length options
python caption_images.py --prompt-length short   # Short, concise captions
python caption_images.py --prompt-length normal  # Detailed descriptions
python caption_images.py --prompt-length mix     # Random mix (default - adds variety)

# Rate limiting for API stability
python caption_images.py --delay 1.0

# Auto-resume from previous run (default behavior)
python caption_images.py

# Force recaption everything from scratch
python caption_images.py --recaption

# Use a different existing file for comparison
python caption_images.py --existing other_captions.csv
```

### Dataset Configuration

Customize dataset creation:

```bash
# Custom dataset name and structure
python create_hf_dataset.py \
  --input my_images/ \
  --output my_custom_dataset/ \
  --name "my-custom-midjourney-dataset" \
  --prompts captions.csv
```

## 🤗 Using Your Dataset

### Loading from HuggingFace Hub

```python
from datasets import load_dataset

# Load your uploaded dataset
dataset = load_dataset("username/your-dataset-name")

# Access data
for example in dataset["train"]:
    image = example["image"]    # PIL Image
    prompt = example["prompt"]  # Caption text
    sref = example["sref"]      # Style reference
    width = example["width"]    # Image width
    # ... other metadata
```

### Loading Locally

```python
import pandas as pd
from PIL import Image

# Load metadata; read sref as text so that all-digit IDs are not parsed as integers
metadata = pd.read_csv(
    "midjourney_hf_dataset/metadata/metadata.csv",
    dtype={"sref": str},
)

# Load specific image
def load_image(filename):
    return Image.open(f"midjourney_hf_dataset/images/{filename}")

# Filter by criteria
high_res = metadata[metadata['size_category'] == 'high_resolution']
# Empty CSV cells are read as NaN, so check for missing values as well as ""
has_prompts = metadata[metadata['prompt'].notna() & (metadata['prompt'] != "")]
same_style = metadata[metadata['sref'] == '4160600070']
```

## 📈 Advanced Usage

### Batch Processing

```bash
# Process multiple scraping sessions - auto-resumes captioning!
for i in {1..5}; do
  python scraper.py
  python caption_images.py  # Automatically skips existing captions
done

# Create final dataset
python create_hf_dataset.py --prompts prompts.csv --upload --repo-id username/large-midjourney-dataset

# Alternative: Force fresh captions for each batch
for i in {1..5}; do
  python scraper.py
  python caption_images.py --recaption
done
```

### Custom Prompts

You can provide your own prompts instead of using AI captioning:

```csv
filename,prompt
4160600070-1-d9409ee5.png,"A majestic dragon soaring over snow-capped mountains"
4160600070-2-a8b7c9d2.png,"Cyberpunk cityscape with neon reflections in rain"
```

### Style-Based Datasets

```bash
# Filter by style reference before creating dataset
python -c "
import pandas as pd
df = pd.read_csv('prompts.csv')
style_df = df[df['filename'].str.startswith('4160600070')]
style_df.to_csv('style_specific_prompts.csv', index=False)
"

python create_hf_dataset.py --prompts style_specific_prompts.csv --name "style-4160600070"
```

## 🐛 Troubleshooting

### Common Issues

**1. Missing API Keys**

```
Error: MOONDREAM_API_KEY not found
```

- Ensure `.env` file exists with valid API key
- Check API key has sufficient credits

**2. HuggingFace Upload Fails**

```
Error: HF_TOKEN not found
```

- Create token at https://huggingface.co/settings/tokens
- Ensure "Write" permissions are selected
- Check repository name is available

**3. No Images Found**

```
Warning: No images found with primary selector
```

- Website structure may have changed
- Check internet connection
- Verify target pages exist

**4. Caption Generation Fails**

```
Failed to caption image: API error
```

- Check Moondream API status
- Verify API key and credits
- Slow down requests by increasing `--delay`
- The script auto-resumes, so you can safely restart after fixing the issue

**5. Want to Recaption Existing Images**

```
Images already have captions but I want to regenerate them
```

- Use `--recaption` flag to ignore existing captions
- Or delete the existing CSV file to start fresh

### Log Files

Check these log files for detailed debugging:

- `scraper.log`: Web scraping logs
- `caption_images.log`: Captioning process logs
- `hf_dataset_creation.log`: Dataset creation logs

## 📄 License

This project is for educational and research purposes. Please respect the terms of service of the source website and API providers.

## 🤝 Contributing

Feel free to submit issues and enhancement requests!

## 🙏 Acknowledgments

- Images sourced from [midjourneysref.com](https://midjourneysref.com)
- Captioning powered by [Moondream API](https://moondream.ai)
- Dataset hosting by [🤗 Hugging Face](https://huggingface.co)
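The auto-resume behavior described for `caption_images.py` can be pictured as a filter over the existing `filename,prompt` CSV. The sketch below is only an illustration under that assumption, not the script's actual logic; the function names `load_captioned` and `pending_images` are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def load_captioned(csv_path: Path) -> set[str]:
    """Return the filenames that already have a non-empty prompt in the CSV."""
    if not csv_path.exists():
        return set()
    with csv_path.open(newline="", encoding="utf-8") as f:
        return {
            row["filename"]
            for row in csv.DictReader(f)
            if (row.get("prompt") or "").strip()
        }


def pending_images(all_filenames: list[str], csv_path: Path) -> list[str]:
    """Keep only the images that still need a caption, preserving input order."""
    done = load_captioned(csv_path)
    return [name for name in all_filenames if name not in done]


# Hypothetical demo: a.png is captioned, b.png has an empty prompt, c.png is new.
with tempfile.TemporaryDirectory() as tmp:
    prompts = Path(tmp) / "prompts.csv"
    prompts.write_text('filename,prompt\na.png,"A dragon"\nb.png,\n', encoding="utf-8")
    todo = pending_images(["a.png", "b.png", "c.png"], prompts)
```

Treating an empty prompt as "not yet captioned" is a design choice of this sketch; it means a run interrupted mid-write would retry that image on the next pass.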


Provider:

maas

Created:

2025-07-22
