EliGenTrainSet

ModelScope community, updated 2026-05-16, indexed 2025-02-08
Download link:
https://modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet
## Train Dataset of EliGen

* Paper: [EliGen: Entity-Level Controlled Image Generation with Regional Attention](https://arxiv.org/abs/2501.01097)
* Github: [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)
* Model: [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)
* Online Demo: [ModelScope EliGen Studio](https://www.modelscope.cn/studios/DiffSynth-Studio/EliGen)

## Dataset Description

All image data is stored in Parquet format. Each sample contains the following fields:

* image_id: unique ID for each image, e.g. `000009`
* caption: global prompt, e.g. `A cartoon chicken dressed in a suit and tie.`
* entities: entity descriptions and their bounding boxes, e.g. `[{"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}]`
* image: base64-encoded image data

For convenience, we provide a Python script that reads the Parquet files and extracts the source images and annotations. It is recommended to decode and restore the images first. The annotations are also stored in JSON format in `caption-bboxbyqwen-dataset.jsonl`, which you can use for lightweight loading of the annotations.

## Example Usage

Download the dataset using

```bash
git lfs install
git clone https://www.modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet.git
```

or

```bash
modelscope download --dataset DiffSynth-Studio/EliGenTrainSet
```

Then use the following Python code to read the Parquet files and extract the source images and annotations.

```python
import base64
import io
import os

import pandas as pd
from PIL import Image

parquet_file_template = 'output_parquet_files/part-{}.parquet'
out_image_folder = 'output_images'
os.makedirs(out_image_folder, exist_ok=True)

for file_num in range(100):
    file_name = f"{file_num:05d}"
    parquet_file = parquet_file_template.format(file_name)
    if not os.path.exists(parquet_file):
        print(f"File {parquet_file} does not exist, skipping.")
        continue
    df = pd.read_parquet(parquet_file)
    for i, row in df.iterrows():
        try:
            # global prompt
            caption = row['caption']
            # entity descriptions and their bounding boxes: [x1, y1, x2, y2]
            all_entities = row.get("entities", [])
            # image id
            image_id = row['image_id']
            # image data: decode base64 and save as PNG
            image_data = base64.b64decode(row['image'])
            image = Image.open(io.BytesIO(image_data))
            image.save(os.path.join(out_image_folder, f"{image_id}.png"))
        except Exception as exc:
            # skip rows that fail to decode, but report them instead of failing silently
            print(f"Skipping row {i} in {parquet_file}: {exc}")
            continue
    print(f"File {parquet_file} processed.")
```

## Citation

If you find our work helpful, please consider citing it.

```
@article{zhang2025eligen,
  title={Eligen: Entity-level controlled image generation with regional attention},
  author={Zhang, Hong and Duan, Zhongjie and Wang, Xingjun and Chen, Yingda and Zhang, Yu},
  journal={arXiv preprint arXiv:2501.01097},
  year={2025}
}
```
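As a supplement to the dataset description, here is a minimal sketch of how one annotation record might be parsed and its bounding boxes mapped to pixel coordinates. It rests on two assumptions that the dataset card does not state explicitly: that each line of `caption-bboxbyqwen-dataset.jsonl` is a JSON object with the same `image_id`, `caption`, and `entities` fields as the Parquet samples, and that `bbox` values are fractions of the image width and height in `[x1, y1, x2, y2]` order. The helper names `parse_annotation_line` and `bbox_to_pixels` are hypothetical, not part of any released script.

```python
import json


def parse_annotation_line(line):
    """Parse one JSON line into (image_id, caption, entities).

    Assumption: the line carries the same fields as a Parquet sample.
    """
    record = json.loads(line)
    return record["image_id"], record["caption"], record.get("entities", [])


def bbox_to_pixels(bbox, width, height):
    """Scale a normalized [x1, y1, x2, y2] box to integer pixel coordinates.

    Assumption: coordinates are fractions of the image width and height.
    """
    x1, y1, x2, y2 = bbox
    return (round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height))


# Sample record built from the example values on the dataset card.
sample_line = json.dumps({
    "image_id": "000009",
    "caption": "A cartoon chicken dressed in a suit and tie.",
    "entities": [
        {"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}
    ],
})

image_id, caption, entities = parse_annotation_line(sample_line)
for ent in entities:
    # 1000 x 1000 is an illustrative image size, not a property of the dataset.
    print(image_id, ent["entity"], bbox_to_pixels(ent["bbox"], 1000, 1000))
```

In practice the width and height would come from the decoded image (for example `image.size` in Pillow) rather than a fixed value.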

Provider:
maas
Created:
2025-01-24