Dora-Bench | 3D Shape Reconstruction Dataset | Geometric Detail Evaluation Dataset

arXiv | Updated 2024-12-24 | Indexed 2024-12-26
3D shape reconstruction
Geometric detail evaluation
Download link:
https://aruichen.github.io/Dora
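Resource summary:
Dora-Bench is a benchmark dataset for evaluating the reconstruction quality of 3D shape variational auto-encoders (VAEs), developed jointly by The Hong Kong University of Science and Technology, ByteDance Seed, LightIllusions, and Tsinghua University. It contains 3D shapes drawn from several public datasets and groups them into four levels of geometric complexity: Less Detail, Moderate Detail, Rich Detail, and Very Rich Detail. By introducing the Sharp Normal Error (SNE) metric, Dora-Bench focuses on how accurately geometric details are reconstructed, giving a stricter evaluation framework than approaches based on conventional random sampling. The dataset targets the loss of geometric detail in 3D shape reconstruction and aims to advance research on 3D content generation.
Provided by:
The Hong Kong University of Science and Technology, ByteDance Seed, LightIllusions, Tsinghua University
Created:
2024-12-24
Summary of original information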

Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

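Dataset overview

Dora is a dataset for sampling and benchmarking 3D shape variational auto-encoders (VAEs). It aims to provide a standardized evaluation framework for 3D shape generation and reconstruction tasks.

Authors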

  • Rui Chen<sup>1,2</sup>
  • Jianfeng Zhang<sup>2*</sup>
  • Yixun Liang<sup>1,3</sup>
  • Guan Luo<sup>2,4</sup>
  • Weiyu Li<sup>1,3</sup>
  • Jiarui Liu<sup>1,3</sup>
  • Xiu Li<sup>2</sup>
  • Xiaoxiao Long<sup>1,3</sup>
  • Jiashi Feng<sup>2</sup>
  • Ping Tan<sup>1,3*</sup>

Corresponding authors

  • Jianfeng Zhang and Ping Tan are the corresponding authors.

Affiliations

  1. The Hong Kong University of Science and Technology
  2. ByteDance Seed
  3. LightIllusions
  4. Tsinghua University


AI-collected summary
Dataset introduction
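Construction
Dora-Bench was built to systematically evaluate the reconstruction quality of 3D VAEs across different levels of geometric complexity. It selects 3D models from several public datasets, including ABO, GSO, Meta, and Objaverse, and sorts them into four levels by geometric complexity. Each level contains roughly 800 samples, which keeps the dataset diverse and representative. To quantify geometric complexity, Dora-Bench uses the number of sharp edges as the classification criterion, and a sharp edge sampling (SES) strategy ensures that models prioritize geometric detail during training. The dataset also introduces a new metric, Sharp Normal Error (SNE), designed specifically to evaluate reconstruction accuracy in geometrically salient regions.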
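As a rough illustration of classifying a mesh by its number of sharp edges, the sketch below counts edges whose two adjacent triangles meet at a large dihedral angle and maps that count to one of the four levels. This is not the Dora-Bench implementation: the 30-degree threshold, the cut-off values in `bounds`, and all function names are assumptions made for the example, and the code assumes a triangle mesh without degenerate faces.

```python
import bisect
import math
from collections import defaultdict

LEVELS = ("Less Detail", "Moderate Detail", "Rich Detail", "Very Rich Detail")


def face_normal(vertices, face):
    """Unit normal of a triangle given by three vertex indices."""
    a, b, c = (vertices[i] for i in face)
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    return [x / length for x in n]


def count_sharp_edges(vertices, faces, angle_threshold_deg=30.0):
    """Count edges whose two adjacent face normals differ by more than the threshold."""
    edge_faces = defaultdict(list)
    for fi, face in enumerate(faces):
        for k in range(3):
            edge = tuple(sorted((face[k], face[(k + 1) % 3])))
            edge_faces[edge].append(fi)
    normals = [face_normal(vertices, f) for f in faces]
    sharp = 0
    for adjacent in edge_faces.values():
        if len(adjacent) != 2:
            continue  # skip boundary and non-manifold edges
        n1, n2 = normals[adjacent[0]], normals[adjacent[1]]
        cos = max(-1.0, min(1.0, sum(p * q for p, q in zip(n1, n2))))
        if math.degrees(math.acos(cos)) > angle_threshold_deg:
            sharp += 1
    return sharp


def complexity_level(num_sharp_edges, bounds=(5000, 20000, 50000)):
    """Map a sharp-edge count to a level; `bounds` are hypothetical cut-offs."""
    return LEVELS[bisect.bisect_right(bounds, num_sharp_edges)]


# Unit cube, two triangles per face, all oriented with outward normals.
CUBE_VERTICES = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
CUBE_FACES = [(0, 2, 1), (0, 3, 2), (4, 5, 6), (4, 6, 7),
              (0, 1, 5), (0, 5, 4), (3, 7, 6), (3, 6, 2),
              (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5)]

n_sharp = count_sharp_edges(CUBE_VERTICES, CUBE_FACES)
print(n_sharp)                    # 12: the cube's edges, not the face diagonals
print(complexity_level(n_sharp))  # Less Detail, under the hypothetical bounds
```

The face diagonals introduced by triangulation are shared by coplanar triangles, so they do not count as sharp; only the twelve 90-degree cube edges do.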
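Features
The main features of Dora-Bench are its systematic classification by geometric complexity and its evaluation metric aimed at geometric detail. The 3D models are divided into four complexity levels, from low to high detail, covering a wide range of geometric features. Sharp edge sampling captures the geometrically salient regions of a model, which improves reconstruction quality. The Sharp Normal Error (SNE) metric makes evaluation finer-grained by measuring how well a model reproduces geometric detail. Beyond providing a large set of test samples, Dora-Bench relies on techniques such as multi-view rendering and Canny edge detection to keep the evaluation rigorous and comprehensive.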
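To make the idea of an error restricted to geometrically salient regions more concrete, here is a minimal sketch that compares two rendered normal maps only at pixels where the ground-truth normal changes abruptly. It is an assumption-laden stand-in, not the SNE definition used by Dora-Bench: a simple neighbour-difference test replaces Canny edge detection, a single view replaces multi-view rendering, and `salient_mask`, `sharp_normal_error`, and `threshold` are hypothetical names. Normal maps are nested lists of unit 3-vectors.

```python
def salient_mask(normal_map, threshold=0.1):
    """Mark pixels whose normal differs sharply from a right or lower neighbour.

    A crude stand-in for running Canny edge detection on a rendered normal map.
    """
    h, w = len(normal_map), len(normal_map[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    dot = sum(a * b for a, b in
                              zip(normal_map[y][x], normal_map[ny][nx]))
                    if 1.0 - dot > threshold:
                        mask[y][x] = True
                        mask[ny][nx] = True
    return mask


def sharp_normal_error(gt_normals, pred_normals, threshold=0.1):
    """Mean squared normal difference over salient pixels of the ground truth."""
    mask = salient_mask(gt_normals, threshold)
    total, count = 0.0, 0
    for y, row in enumerate(mask):
        for x, salient in enumerate(row):
            if salient:
                total += sum((g - p) ** 2 for g, p in
                             zip(gt_normals[y][x], pred_normals[y][x]))
                count += 1
    return total / count if count else 0.0


# Ground truth: a crisp crease between two flat regions.
UP, SIDE = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
gt = [[UP, UP, SIDE, SIDE] for _ in range(2)]

# Prediction: the crease is smoothed into a 45-degree bevel.
s = 0.5 ** 0.5
BEVEL = (s, 0.0, s)
pred = [[UP, BEVEL, BEVEL, SIDE] for _ in range(2)]

print(sharp_normal_error(gt, pred))
```

Because the flat regions are excluded from the average, smoothing a crease is penalized directly instead of being diluted by the many pixels that were reconstructed correctly.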
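Usage
Using Dora-Bench involves two stages: model training and evaluation. For training, researchers can use the 3D models in the dataset together with sharp edge sampling (SES) and dual cross-attention (DCA) to train a 3D VAE. For evaluation, the dataset provides several metrics, including F-score, Chamfer distance, and Sharp Normal Error (SNE), to measure reconstruction quality comprehensively. Comparing model performance across complexity levels shows how well a model captures geometric detail. Dora-Bench can also support downstream tasks such as single-image-to-3D generation, where it is combined with a diffusion model to further improve generation quality.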
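For readers unfamiliar with the two point-cloud metrics named above, the following is a minimal brute-force sketch of Chamfer distance and F-score. Conventions differ between papers (squared or unsquared distances, sum or mean of the two directions), and the source does not say which one Dora-Bench uses, so the choices here, the function names, and the default `tau` are assumptions.

```python
import math


def nearest_distances(source, target):
    """For each point in source, the distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance: sum of the two mean nearest-neighbour distances."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    return sum(d_pred) / len(d_pred) + sum(d_gt) / len(d_gt)


def f_score(pred, gt, tau=0.01):
    """Harmonic mean of precision and recall at distance threshold tau."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    precision = sum(d <= tau for d in d_pred) / len(d_pred)  # predicted points near the GT
    recall = sum(d <= tau for d in d_gt) / len(d_gt)          # GT points covered by the prediction
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]

print(chamfer_distance(pred_points, gt_points))
print(f_score(pred_points, gt_points, tau=0.1))
```

The nearest-neighbour search is quadratic in the number of points; for realistically sized point clouds a spatial index such as a k-d tree would be the usual replacement.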
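Background and challenges
Background overview
Dora-Bench was proposed in 2024 by a research team from The Hong Kong University of Science and Technology, ByteDance Seed, LightIllusions, and Tsinghua University to address the loss of geometric detail when 3D shape VAEs reconstruct shapes. Its core research question is how to improve VAE reconstruction accuracy on complex geometric features by introducing sharp edge sampling and dual cross-attention. Dora-Bench gives the 3D shape modeling field a systematic evaluation benchmark and supports the development of 3D content generation, with particular relevance to games, film, and AR/VR.
Current challenges
Dora-Bench faces two main challenges. First, at the domain level, conventional uniform sampling struggles to capture highly complex geometric detail in 3D shapes, which degrades reconstruction quality, especially around the edges and fine details of complex shapes. Second, in building the dataset, it is technically difficult to design an algorithm that identifies and prioritizes geometrically salient regions while preserving the integrity of the global structure. In addition, developing a new metric that quantifies shape complexity and evaluates reconstruction accuracy, such as Sharp Normal Error (SNE), is itself challenging, because the metric must remain general and accurate across shapes of different complexity.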
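The tension described above, prioritizing salient regions without losing the global structure, can be pictured as mixing two point sets: one sampled uniformly over the surface and one concentrated on sharp edges. The sketch below shows that mixing only; it is not the authors' SES algorithm, and `mixed_sampling`, its parameters, and the assumption that sharp edges are already known as vertex-index pairs are all hypothetical. It assumes a non-degenerate triangle mesh.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of triangle abc from the cross product of two edge vectors."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in n))


def sample_on_triangle(a, b, c, rng):
    """Uniform random point inside triangle abc (square-root barycentric trick)."""
    r1, r2 = math.sqrt(rng.random()), rng.random()
    w = (1 - r1, r1 * (1 - r2), r1 * r2)
    return tuple(w[0] * a[k] + w[1] * b[k] + w[2] * c[k] for k in range(3))


def mixed_sampling(vertices, faces, sharp_edges, n_uniform, n_sharp, seed=0):
    """Return (uniform surface points, points lying on the given sharp edges)."""
    rng = random.Random(seed)

    # Global structure: area-weighted uniform sampling over all triangles.
    tris = [tuple(vertices[i] for i in f) for f in faces]
    areas = [triangle_area(*t) for t in tris]
    uniform_pts = [sample_on_triangle(*t, rng)
                   for t in rng.choices(tris, weights=areas, k=n_uniform)]

    # Salient detail: length-weighted sampling along the sharp edges.
    segments = [(vertices[i], vertices[j]) for i, j in sharp_edges]
    sharp_pts = []
    if segments:
        lengths = [math.dist(p, q) for p, q in segments]
        for p, q in rng.choices(segments, weights=lengths, k=n_sharp):
            t = rng.random()
            sharp_pts.append(tuple(p[k] + t * (q[k] - p[k]) for k in range(3)))
    return uniform_pts, sharp_pts


verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri_faces = [(0, 1, 2)]
uniform_pts, sharp_pts = mixed_sampling(verts, tri_faces, [(0, 1)], n_uniform=5, n_sharp=3)
print(len(uniform_pts), len(sharp_pts))
```

The ratio between `n_uniform` and `n_sharp` is the knob that trades global coverage against emphasis on detail; how the actual method sets it is not stated in the source.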
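Common scenarios
Typical use cases
Dora-Bench is broadly applicable to 3D shape generation and reconstruction, particularly VAE-based 3D shape modeling. With sharp edge sampling and dual cross-attention, it markedly improves 3D shape reconstruction accuracy, especially for complex geometric detail. Typical use cases include 3D shape compression, reconstruction, and generation, particularly in fields that demand high-fidelity geometric detail such as games, film, and augmented/virtual reality (AR/VR).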
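Practical applications
Dora-Bench has broad practical value, especially in 3D content generation. With its efficient sharp edge sampling and dual cross-attention, it can produce high-quality 3D shapes suited to 3D modeling tasks in games, film, AR/VR, and similar fields. It also provides strong support for single-image-to-3D generation: its compact latent representation markedly improves the efficiency and output quality of generative models. These applications make Dora-Bench an important tool for 3D content generation.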
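Derived and related work
The release of Dora-Bench has prompted a series of related studies, particularly in 3D shape generation and reconstruction. Building on its sharp edge sampling and dual cross-attention, many research teams have further refined 3D VAE designs and improved reconstruction and generation quality. Dora-Bench has also advanced evaluation methods for 3D shape generation models, notably through its Sharp Normal Error (SNE) metric, which makes the evaluation of geometric detail more precise. This derived work extends the reach of Dora-Bench and advances 3D content generation.
The content above was collected and summarized by AI.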