TimeTravel
Source: ModelScope community. Updated: 2026-01-09. Indexed: 2025-03-22.
Download link:
https://modelscope.cn/datasets/MBZUAI/TimeTravel
Description:
<div align="center" style="margin-top:10px;">
<img src='asset/logo.png' align="left" width="7%" />
</div>
<div style="margin-top:50px;">
<h1 style="font-size: 30px; margin: 0;"> TimeTravel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts</h1>
</div>
<div align="center" style="margin-top:10px;">
[Sara Ghaboura](https://huggingface.co/SLMLAH) <sup> * </sup>
[Ketan More](https://github.com/ketanmore2002) <sup> * </sup>
[Ritesh Thawkar](https://huggingface.co/SLMLAH)
[Wafa Alghallabi](https://huggingface.co/SLMLAH)
[Omkar Thawakar](https://omkarthawakar.github.io)
<br>
[Fahad Shahbaz Khan](https://scholar.google.com/citations?hl=en&user=zvaeYnUAAAAJ)
[Hisham Cholakkal](https://scholar.google.com/citations?hl=en&user=bZ3YBRcAAAAJ)
[Salman Khan](https://scholar.google.com/citations?hl=en&user=M59O9lkAAAAJ)
[Rao M. Anwer](https://scholar.google.com/citations?hl=en&user=_KlvMVoAAAAJ)<br>
<em> <sup> *Equal Contribution </sup> </em>
<br>
</div>
<div align="center" style="margin-top:10px;">
[arXiv paper](https://arxiv.org/abs/2502.14865)
[Project page](https://mbzuai-oryx.github.io/TimeTravel/)
## 🏛 TimeTravel Taxonomy and Diversity
<p align="left">
The TimeTravel taxonomy organizes 10k+ verified artifact samples from 266 cultures across 10 civilizations for AI-driven historical analysis.
</p>
<p align="center">
<img src="asset/Intro.png" width="750px" height="400px" alt="tax" style="margin-right: 2px;" />
</p>
</div>
<br>
## 🌟 Key Features
TimeTravel is the first large-scale, open-source benchmark designed to evaluate Large Multimodal Models (LMMs) on historical and cultural artifacts. It covers:
- **266** Cultural Groups across **10** Historical Regions
- **10,000+** Expert-Verified Artifact Samples
- **Multimodal Image-Text Dataset** for AI-driven historical research
- A **publicly available dataset** and evaluation framework to advance AI applications in **history and archaeology**.
<br>
## 🔄 TimeTravel Creation Pipeline
The TimeTravel dataset follows a structured pipeline to ensure the accuracy, completeness, and contextual richness of historical artifacts.<br>
<p align="center">
<img src="asset/pipe_last.png" width="750px" height="150px" alt="pipeline" style="margin-right: 2px;" />
</p>
Our approach consists of four key phases:
- **Data Selection:** Curated 10,250 artifacts from museum collections, spanning 266 cultural groups, with expert validation to ensure historical accuracy and diversity.<br>
- **Data Cleaning:** Addressed missing or incomplete metadata (titles, dates, iconography) by cross-referencing museum archives and academic sources, ensuring data consistency.<br>
- **Generation & Verification:** Used GPT-4o to generate context-aware descriptions, which were refined and validated by historians and archaeologists for authenticity.<br>
- **Data Aggregation:** Standardized and structured the dataset into image-text pairs, making it a valuable resource for AI-driven historical analysis and cultural heritage research.<br>
<br>
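The cleaning and aggregation phases above can be illustrated with a minimal sketch. It is not the authors' pipeline code: the field names in `REQUIRED_FIELDS`, the record layout, and both helper functions are hypothetical, chosen only to mirror the metadata listed under Data Cleaning (titles, dates, iconography) and the image-text pairing described under Data Aggregation.

```python
# Hypothetical metadata fields to check during cleaning (illustrative names).
REQUIRED_FIELDS = ("title", "date", "iconography")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent, None, or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def to_image_text_pair(record: dict, description: str) -> dict:
    """Pair an image reference with its verified description."""
    return {"image": record["image"], "text": description.strip()}


# Made-up record for illustration; not taken from the dataset.
record = {"image": "img_0001.jpg", "title": "Bronze coin", "date": ""}
print(missing_fields(record))
print(to_image_text_pair(record, "  A bronze coin with a worn portrait. "))
```

Records flagged by `missing_fields` would be the ones sent for cross-referencing against museum archives before a description is generated.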
## 🏆 TimeTravel Evaluation
The tables below compare the performance of closed-source and open-source models on the TimeTravel benchmark.
<div align="center">
<h5>
<table>
<thead>
<tr style="background-color: #EBD9B3; color: white;">
<th>Model</th>
<th>BLEU</th>
<th>METEOR</th>
<th>ROUGE-L</th>
<th>SPICE</th>
<th>BERTScore</th>
<th>LLM-Judge</th>
</tr>
</thead>
<tbody>
<tr>
<td>GPT-4o-0806</td>
<td><b>0.1758🏅</b></td>
<td>0.2439</td>
<td><b>0.1230🏅</b></td>
<td><b>0.1035🏅</b></td>
<td><b>0.8349🏅</b></td>
<td><b>0.3013🏅</b></td>
</tr>
<tr>
<td>Gemini-2.0-Flash</td>
<td>0.1072</td>
<td>0.2456</td>
<td>0.0884</td>
<td>0.0919</td>
<td>0.8127</td>
<td>0.2630</td>
</tr>
<tr>
<td>Gemini-1.5-Pro</td>
<td>0.1067</td>
<td>0.2406</td>
<td>0.0848</td>
<td>0.0901</td>
<td>0.8172</td>
<td>0.2276</td>
</tr>
<tr>
<td>GPT-4o-mini-0718</td>
<td>0.1369</td>
<td><b>0.2658🏅</b></td>
<td>0.1027</td>
<td>0.1001</td>
<td>0.8283</td>
<td>0.2492</td>
</tr>
<tr>
<td>Llama-3.2-Vision-Inst</td>
<td>0.1161</td>
<td>0.2072</td>
<td>0.1027</td>
<td>0.0648</td>
<td>0.8111</td>
<td>0.1255</td>
</tr>
<tr>
<td>Qwen-2.5-VL</td>
<td>0.1155</td>
<td>0.2648</td>
<td>0.0887</td>
<td>0.1002</td>
<td>0.8198</td>
<td>0.1792</td>
</tr>
<tr>
<td>Llava-Next</td>
<td>0.1118</td>
<td>0.2340</td>
<td>0.0961</td>
<td>0.0799</td>
<td>0.8246</td>
<td>0.1161</td>
</tr>
</tbody>
</table>
</h5>
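
To make the ROUGE-L column concrete, here is a minimal sketch of a sentence-level ROUGE-L F1 score based on the longest common subsequence (LCS). It assumes lowercased whitespace tokenization and the balanced F1 variant; the benchmark's actual tokenization, metric settings, and evaluation scripts are not stated here, so scores from this sketch will not necessarily match the table. The function names are illustrative.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over whitespace tokens (assumed tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration; not taken from the dataset.
print(rouge_l_f1("bronze coin from rome", "a bronze coin"))
```
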
<p>
<div align="center">
<h5>
<table>
<thead>
<tr style="background-color: #EBD9B3; color: white;">
<th>Model</th>
<th>India</th>
<th>Roman Emp.</th>
<th>China</th>
<th>British Isles</th>
<th>Iran</th>
<th>Iraq</th>
<th>Japan</th>
<th>Cent. America</th>
<th>Greece</th>
<th>Egypt</th>
</tr>
</thead>
<tbody>
<tr>
<td>GPT-4o-0806</td>
<td><b>0.2491🏅</b></td>
<td><b>0.4463🏅</b></td>
<td><b>0.2491🏅</b></td>
<td><b>0.1899🏅</b></td>
<td><b>0.3522🏅</b></td>
<td><b>0.3545🏅</b></td>
<td><b>0.2228🏅</b></td>
<td><b>0.3144🏅</b></td>
<td><b>0.2757🏅</b></td>
<td><b>0.3649🏅</b></td>
</tr>
<tr>
<td>Gemini-2.0-Flash</td>
<td>0.1859</td>
<td>0.3358</td>
<td>0.2059</td>
<td>0.1556</td>
<td>0.3376</td>
<td>0.3071</td>
<td>0.2000</td>
<td>0.2677</td>
<td>0.2582</td>
<td>0.3602</td>
</tr>
<tr>
<td>Gemini-1.5-Pro</td>
<td>0.1118</td>
<td>0.2632</td>
<td>0.2139</td>
<td>0.1545</td>
<td>0.3320</td>
<td>0.2587</td>
<td>0.1871</td>
<td>0.2708</td>
<td>0.2088</td>
<td>0.2908</td>
</tr>
<tr>
<td>GPT-4o-mini-0718</td>
<td>0.2311</td>
<td>0.3612</td>
<td>0.2207</td>
<td>0.1866</td>
<td>0.2991</td>
<td>0.2632</td>
<td>0.2087</td>
<td>0.3195</td>
<td>0.2101</td>
<td>0.2501</td>
</tr>
<tr>
<td>Llama-3.2-Vision-Inst</td>
<td>0.0744</td>
<td>0.1450</td>
<td>0.1227</td>
<td>0.0777</td>
<td>0.2000</td>
<td>0.1155</td>
<td>0.1075</td>
<td>0.1553</td>
<td>0.1351</td>
<td>0.1201</td>
</tr>
<tr>
<td>Qwen-2.5-VL</td>
<td>0.0888</td>
<td>0.1578</td>
<td>0.1192</td>
<td>0.1713</td>
<td>0.2515</td>
<td>0.1576</td>
<td>0.1771</td>
<td>0.1442</td>
<td>0.1442</td>
<td>0.2660</td>
</tr>
<tr>
<td>Llava-Next</td>
<td>0.0788</td>
<td>0.0961</td>
<td>0.1455</td>
<td>0.1091</td>
<td>0.1464</td>
<td>0.1194</td>
<td>0.1353</td>
<td>0.1917</td>
<td>0.1111</td>
<td>0.0709</td>
</tr>
</tbody>
</table>
</h5>
<p>
<div align="left"></div>
<br>
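
The per-region table can be summarized with a short sketch that computes each model's unweighted mean across the ten regions and its strongest region. The scores are copied from the table above; the table does not name the metric, so none is assumed here, and the unweighted mean is an illustrative choice rather than an aggregate reported by the authors.

```python
from statistics import mean

REGIONS = ("India", "Roman Emp.", "China", "British Isles", "Iran",
           "Iraq", "Japan", "Cent. America", "Greece", "Egypt")

# Scores copied from the per-region table, in REGIONS order.
SCORES = {
    "GPT-4o-0806": (0.2491, 0.4463, 0.2491, 0.1899, 0.3522, 0.3545, 0.2228, 0.3144, 0.2757, 0.3649),
    "Gemini-2.0-Flash": (0.1859, 0.3358, 0.2059, 0.1556, 0.3376, 0.3071, 0.2000, 0.2677, 0.2582, 0.3602),
    "Gemini-1.5-Pro": (0.1118, 0.2632, 0.2139, 0.1545, 0.3320, 0.2587, 0.1871, 0.2708, 0.2088, 0.2908),
    "GPT-4o-mini-0718": (0.2311, 0.3612, 0.2207, 0.1866, 0.2991, 0.2632, 0.2087, 0.3195, 0.2101, 0.2501),
    "Llama-3.2-Vision-Inst": (0.0744, 0.1450, 0.1227, 0.0777, 0.2000, 0.1155, 0.1075, 0.1553, 0.1351, 0.1201),
    "Qwen-2.5-VL": (0.0888, 0.1578, 0.1192, 0.1713, 0.2515, 0.1576, 0.1771, 0.1442, 0.1442, 0.2660),
    "Llava-Next": (0.0788, 0.0961, 0.1455, 0.1091, 0.1464, 0.1194, 0.1353, 0.1917, 0.1111, 0.0709),
}


def macro_average(model: str) -> float:
    """Unweighted mean of a model's scores over all regions."""
    return mean(SCORES[model])


def best_region(model: str) -> str:
    """Region where the model scores highest."""
    scores = SCORES[model]
    return REGIONS[scores.index(max(scores))]


for model in sorted(SCORES, key=macro_average, reverse=True):
    print(f"{model:24s} mean={macro_average(model):.4f} best={best_region(model)}")
```
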
## 🖼 TimeTravel Examples
<p align="left">
The figure illustrates the cultural and material diversity of the TimeTravel dataset.
</p>
<p align="center">
<img src="asset/fig0.png" width="1000px" height="250px" alt="tax" style="margin-right: 2px;" />
</p>
<div align="left">
<br>
<div class="tree-container">
<h2>📂 TimeTravel Dataset Schema</h2>
<div class="tree">
<ul>
<li><span class="leaf">📷 Image</span> (image)</li>
<li><span class="leaf">🔹 id</span> (string)</li>
<li><span class="leaf">📅 Production date</span> (string)</li>
<li><span class="leaf">📍 Find spot</span> (string)</li>
<li><span class="leaf">🔸 Materials</span> (string)</li>
<li><span class="leaf">🛠 Technique</span> (string)</li>
<li><span class="leaf">📝 Inscription</span> (string)</li>
<li><span class="leaf">🎭 Subjects</span> (string)</li>
<li><span class="leaf">📛 Assoc name</span> (string)</li>
<li><span class="leaf">🏛 Culture</span> (string)</li>
<li><span class="leaf">📂 Section</span> (string)</li>
<li><span class="leaf">🌍 Place</span> (string)</li>
<li><span class="leaf">📝 description</span> (string)</li>
</ul>
</div>
</div>
</div>
<br>
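
As a minimal sketch of working with the schema above, the snippet below checks a row against the listed columns and groups artifact ids by culture. The column names are copied from the schema list; that the released files use exactly this spelling and casing is an assumption, and both helper functions are hypothetical. Rows are plain dictionaries, so the sketch does not depend on any particular loading library.

```python
from collections import defaultdict

# Column names as listed in the schema; exact casing in the files is assumed.
SCHEMA_COLUMNS = (
    "Image", "id", "Production date", "Find spot", "Materials",
    "Technique", "Inscription", "Subjects", "Assoc name",
    "Culture", "Section", "Place", "description",
)


def schema_diff(row: dict) -> tuple:
    """Return (missing columns, unexpected columns) for one row."""
    missing = [c for c in SCHEMA_COLUMNS if c not in row]
    extra = [k for k in row if k not in SCHEMA_COLUMNS]
    return missing, extra


def ids_by_culture(rows: list) -> dict:
    """Group artifact ids by the value of the Culture column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Culture"]].append(row["id"])
    return dict(groups)


# Made-up rows for illustration; not taken from the dataset.
rows = [
    {**{c: "" for c in SCHEMA_COLUMNS}, "id": "a1", "Culture": "Roman"},
    {**{c: "" for c in SCHEMA_COLUMNS}, "id": "b2", "Culture": "Roman"},
]
print(schema_diff(rows[0]))
print(ids_by_culture(rows))
```
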
## 📚 Citation
<p align="left">
If you use the TimeTravel dataset in your research, please consider citing:
</p>
<div align="left">
```bibtex
@misc{ghaboura2025timetravelcomprehensivebenchmark,
title={Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts},
author={Sara Ghaboura and Ketan More and Ritesh Thawkar and Wafa Alghallabi and Omkar Thawakar and Fahad Shahbaz Khan and Hisham Cholakkal and Salman Khan and Rao Muhammad Anwer},
year={2025},
eprint={2502.14865},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2502.14865},
}
```
</div>
</div>
Provider:
maas
Created:
2025-03-17



