EliGenTrainSet
Source: ModelScope community (魔搭社区) · Updated 2026-05-16 · Indexed 2025-02-08
Download link:
https://modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet
Description:
## Training Dataset of EliGen
* Paper: [EliGen: Entity-Level Controlled Image Generation with Regional Attention](https://arxiv.org/abs/2501.01097)
* Github: [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)
* Model: [ModelScope](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)
* Online Demo: [ModelScope EliGen Studio](https://www.modelscope.cn/studios/DiffSynth-Studio/EliGen)
## Dataset Description
All image data is stored in parquet format. Each sample contains the following fields:
* image_id: unique id for each image, e.g. `000009`
* caption: global prompt, e.g. `A cartoon chicken dressed in a suit and tie.`
* entities: entity descriptions and their bounding boxes, e.g. `[{"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}]`
* image: base64-encoded image data.
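The example `bbox` values all lie between 0 and 1, which suggests that the coordinates are normalized to the image width and height in `[x1, y1, x2, y2]` order. Under that assumption (the dataset card does not state it explicitly), a minimal sketch for converting a box to pixel coordinates could look like this. The helper name `bbox_to_pixels` and the 1024x1024 image size are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple


def _clamp01(value: float) -> float:
    """Clamp a value to the [0, 1] range."""
    return min(max(float(value), 0.0), 1.0)


def bbox_to_pixels(bbox: Sequence[float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized [x1, y1, x2, y2] box to integer pixel coordinates.

    Assumes the box is normalized to the image size; this is inferred from
    the example values in the dataset card, not documented there.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(_clamp01(x1) * width),
        round(_clamp01(y1) * height),
        round(_clamp01(x2) * width),
        round(_clamp01(y2) * height),
    )


# Example entity copied from the field description above.
entity = {"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}
print(entity["entity"], bbox_to_pixels(entity["bbox"], width=1024, height=1024))
```

Clamping before scaling keeps slightly out-of-range annotations inside the image bounds; drop it if you prefer to detect such boxes instead.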
For convenience, we provide a Python script that reads the parquet files and extracts the source images and annotations. We recommend decoding and restoring the images first.
The annotations are also stored in JSON Lines format in `caption-bboxbyqwen-dataset.jsonl`, which you can use for lightweight loading of the annotations.
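As a rough sketch of lightweight loading, the snippet below reads a JSON Lines file one record at a time. It assumes that each line of `caption-bboxbyqwen-dataset.jsonl` is a JSON object carrying the same `image_id`, `caption`, and `entities` keys as the parquet samples; the card does not spell out the JSONL schema, so check a few lines before relying on it. The function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_annotations(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one annotation dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def load_annotations(path: str = "caption-bboxbyqwen-dataset.jsonl") -> Dict[str, Dict]:
    """Index annotations by image_id (key name assumed to mirror the parquet field)."""
    with open(path, "r", encoding="utf-8") as f:
        return {str(rec["image_id"]): rec for rec in iter_annotations(f)}


# In-memory demo using the sample values from the field description.
sample_lines = [
    '{"image_id": "000009", "caption": "A cartoon chicken dressed in a suit and tie.", '
    '"entities": [{"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}]}\n',
]
for record in iter_annotations(sample_lines):
    print(record["image_id"], record["caption"], len(record["entities"]))
```

Iterating line by line avoids holding the raw text of the whole file in memory; `load_annotations` trades that for fast lookup by `image_id`.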
## Example Usage
Download the dataset using:
```bash
git lfs install
git clone https://www.modelscope.cn/datasets/DiffSynth-Studio/EliGenTrainSet.git
```
or
```bash
modelscope download --dataset DiffSynth-Studio/EliGenTrainSet
```
Then use the following Python code to read the parquet files and extract the source images and annotations.
```python
import pandas as pd
import base64
from PIL import Image
import io
import os

parquet_file_template = 'output_parquet_files/part-{}.parquet'
out_image_folder = 'output_images'
os.makedirs(out_image_folder, exist_ok=True)

for file_num in range(100):
    file_name = f"{file_num:05d}"
    parquet_file = parquet_file_template.format(file_name)
    if not os.path.exists(parquet_file):
        print(f"File {parquet_file} does not exist, skipping.")
        continue
    df = pd.read_parquet(parquet_file)
    for i, row in df.iterrows():
        try:
            # global prompt
            caption = row['caption']
            # entity descriptions and their bounding boxes: [x1, y1, x2, y2]
            all_entities = row.get("entities", [])
            # image id
            image_id = row['image_id']
            # image data: decode the base64 string and save it as a PNG
            image_data = base64.b64decode(row['image'])
            image = Image.open(io.BytesIO(image_data))
            image.save(os.path.join(out_image_folder, f"{image_id}.png"))
        except Exception as exc:
            # skip rows that cannot be decoded or saved, but report them
            print(f"Skipping row {i} in {parquet_file}: {exc}")
            continue
    print(f"File {parquet_file} processed.")
```
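The script above reads `entities` from each row, but the card does not say whether the parquet column holds a JSON string, a Python list, or an array-like object. The following sketch normalizes those cases to a plain list of dicts. The helper name `parse_entities` is hypothetical, and the handled input types are assumptions rather than documented behavior of the dataset.

```python
import json
from typing import Any, Dict, List


def parse_entities(raw: Any) -> List[Dict[str, Any]]:
    """Normalize an `entities` cell to [{"entity": str, "bbox": [x1, y1, x2, y2]}, ...]."""
    if raw is None:
        return []
    if isinstance(raw, (str, bytes)):
        if not raw.strip():
            return []
        raw = json.loads(raw)   # assumed case: JSON-encoded string
    elif hasattr(raw, "tolist"):
        raw = raw.tolist()      # assumed case: array-like object (e.g. a numpy array)
    entities = []
    for item in raw:
        bbox = item["bbox"]
        if hasattr(bbox, "tolist"):
            bbox = bbox.tolist()
        entities.append({
            "entity": str(item["entity"]),
            "bbox": [float(v) for v in bbox],
        })
    return entities


# The same sample value, once as a JSON string and once as a Python list.
as_text = '[{"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}]'
as_list = [{"entity": "cartoon chicken", "bbox": [0.145, 0.06, 0.854, 0.94]}]
print(parse_entities(as_text) == parse_entities(as_list))
```

Inside the extraction loop, this could be applied as `parse_entities(row.get("entities"))` so that downstream code always sees the same structure.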
## Citation
If you find our work helpful, please consider citing it.
```
@article{zhang2025eligen,
title={EliGen: Entity-Level Controlled Image Generation with Regional Attention},
author={Zhang, Hong and Duan, Zhongjie and Wang, Xingjun and Chen, Yingda and Zhang, Yu},
journal={arXiv preprint arXiv:2501.01097},
year={2025}
}
```
Provider:
maas
Created:
2025-01-24



