james-burton/OrientalMuseum_min5-white-name
Source: Hugging Face · Updated 2024-02-28 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/james-burton/OrientalMuseum_min5-white-name
Resource description:
---
dataset_info:
features:
- name: obj_num
dtype: string
- name: file
dtype: string
- name: image
dtype: image
- name: root
dtype: string
- name: description
dtype: string
- name: label
dtype:
class_label:
names:
'0': Aegis
'1': Ajaeng Holder
'2': Album Painting
'3': Amulet Mould
'4': Animal Figurine
'5': Animal Mummy
'6': Animal bone
'7': Arm Guard
'8': Axe Head
'9': Axle-caps
'10': Ball
'11': Ballista Bolt
'12': Band
'13': Basin
'14': Baton
'15': Belt Hook
'16': Betel Nut Cutter
'17': Blouse
'18': Blu-ray disc
'19': Bolt
'20': Book Cover
'21': Box
'22': Brush Pot
'23': Brush Rest
'24': Brush Tray
'25': Bulb Bowl
'26': Bullet Mould
'27': Burnisher
'28': Cabinet
'29': Cannon
'30': Cap
'31': Carved stone
'32': Case
'33': Cash Box
'34': Chest
'35': Cigar Holder
'36': Clapper
'37': Clay pipe (smoking)
'38': Comb
'39': Cosmetic and Medical Equipment and Implements
'40': Cricket pot
'41': Cross-bow Lock
'42': Cup And Saucer
'43': Cup, Saucer
'44': Cushion Cover
'45': DVDs
'46': Dagger
'47': Dice Box
'48': Dice Shaker
'49': Disc
'50': Domestic Equipment and Utensils
'51': Double Dagger
'52': Ear Protector
'53': Ear Stud
'54': Earring
'55': Elephant Goad
'56': Erotic Figurine
'57': Eye Protector
'58': Figurine Mould
'59': Finger Ring
'60': Funerary Cone
'61': Funerary goods
'62': Funerary money
'63': Furosode
'64': Greek crosses
'65': Hand Jade
'66': Hand Protector
'67': Handwarmer
'68': Hanging
'69': Headband
'70': Heart Scarab
'71': Human Figurine
'72': Incense Holder
'73': Inkstick
'74': Kite
'75': Knee Protector
'76': Kohl Pot
'77': Kundika
'78': Leaflet
'79': Letter
'80': Lock
'81': Mah Jong Rack
'82': Majiang set
'83': Manuscript Page
'84': Mat
'85': Mica Painting
'86': Miniature Painting
'87': Miniature Portrait
'88': Mortar
'89': Mould
'90': Mouth Jade
'91': Mouth Protector
'92': Mouth-piece
'93': Mummy Label
'94': Nail Protector
'95': Nose Protector
'96': Opium Pipe
'97': Opium Weight
'98': Oracle Bone
'99': Ostraka
'100': Palette
'101': Panel
'102': Part
'103': Pelmet
'104': Pencase
'105': Pendant
'106': Perfumer
'107': Phylactery
'108': Pigstick
'109': Pipe
'110': Pipe Case
'111': Pipe Holder
'112': Pith Painting
'113': Plaque
'114': Plate
'115': Poh Kam
'116': Pounder
'117': Prayer Wheel
'118': Rank Square
'119': Rubber
'120': Sake Cup
'121': Scabbard Chape
'122': Scabbard Slide
'123': Scarab Seal
'124': Scarf
'125': Score Board
'126': Screen
'127': Seal
'128': Seal Paste Pot
'129': Shaft Terminal
'130': Shield
'131': Shroud Weight
'132': Sleeve Band
'133': Sleeve Weight
'134': Slide
'135': Soles
'136': Spillikins
'137': Staff Head
'138': Stamp
'139': Stand
'140': Stand of Incense Burner
'141': Stem Bowl
'142': Stem Cup
'143': Story Cloth
'144': Strainer
'145': Sword Guard
'146': Table
'147': Table Runner
'148': Thangka
'149': Tomb Figure
'150': Tomb Model
'151': Washer
'152': Water Dropper
'153': Water Pot
'154': Wine Pot
'155': Woodblock Print
'156': Writing Desk
'157': accessories
'158': adzes
'159': alabastra
'160': albums
'161': altar components
'162': amphorae
'163': amulets
'164': anchors
'165': animation cels
'166': animation drawings
'167': anklets
'168': armbands
'169': armor
'170': armrests
'171': arrowheads
'172': arrows
'173': autograph albums
'174': axes
'175': 'axes: woodworking tools'
'176': back scratchers
'177': badges
'178': bags
'179': bandages
'180': bangles
'181': banners
'182': baskets
'183': beads
'184': beakers
'185': bedspreads
'186': bells
'187': belts
'188': bezels
'189': blades
'190': board games
'191': boats
'192': boilers
'193': booklets
'194': books
'195': bottles
'196': bowls
'197': boxes
'198': bracelets
'199': bread
'200': brick
'201': brooches
'202': brush washers
'203': brushes
'204': buckets
'205': buckles
'206': business cards
'207': buttons
'208': caddies
'209': calligraphy
'210': candelabras
'211': candleholders
'212': candlesticks
'213': canopic jars
'214': card cases
'215': card tables
'216': cards
'217': carvings
'218': cases
'219': celestial globes
'220': censers
'221': chains
'222': chairs
'223': charms
'224': charts
'225': chess sets
'226': chessmen
'227': chisels
'228': chopsticks
'229': cigarette cases
'230': cigarette holders
'231': cippi
'232': claypipe
'233': cloth
'234': clothing
'235': coats
'236': coffins
'237': coins
'238': collar
'239': compact discs
'240': containers
'241': coverings
'242': covers
'243': cuffs
'244': cups
'245': cushions
'246': cylinder seals
'247': deels
'248': deity figurine
'249': diagrams
'250': dice
'251': dishes
'252': document containers
'253': documents
'254': dolls
'255': doors
'256': drawings
'257': dresses
'258': drums
'259': dung-chen
'260': earrings
'261': embroidery
'262': ensembles
'263': envelopes
'264': 'equipment for personal use: grooming, hygiene and health care'
'265': ewers
'266': fans
'267': 'feet: furniture components'
'268': female figurine
'269': fiddles
'270': figures
'271': figurines
'272': finials
'273': flagons
'274': flags
'275': flasks
'276': fragments
'277': furniture components
'278': gameboards
'279': gaming counters
'280': ge
'281': glassware
'282': goblets
'283': gongs
'284': gowns
'285': greeting cards
'286': hair ornaments
'287': hairpins
'288': hammerstones
'289': handles
'290': handscrolls
'291': harnesses
'292': hats
'293': headdresses
'294': headrests
'295': heads
'296': headscarves
'297': helmets
'298': hobs
'299': hoods
'300': houses
'301': identity cards
'302': illuminated manuscripts
'303': incense burners
'304': incense sticks
'305': ink bottles
'306': inkstands
'307': inkstones
'308': inkwells
'309': inlays
'310': iron
'311': jackets
'312': jar seal
'313': jars
'314': jewelry
'315': juglets
'316': jugs
'317': keys
'318': kimonos
'319': knives
'320': ladles
'321': lamps
'322': lanterns
'323': lanyards
'324': leatherwork
'325': lids
'326': loom weights
'327': maces
'328': manuscripts
'329': maps
'330': masks
'331': medals
'332': miniatures
'333': mirrors
'334': models
'335': money
'336': mounts
'337': mugs
'338': mummies
'339': musical instruments
'340': nails
'341': necklaces
'342': needles
'343': netsukes
'344': nozzles
'345': obelisks
'346': obis
'347': oboes
'348': oil lamps
'349': ornaments
'350': pages
'351': paintings
'352': paper money
'353': paperweights
'354': papyrus
'355': passports
'356': pectorals
'357': pendants
'358': pestles
'359': petticoats
'360': photograph albums
'361': photographs
'362': pictures
'363': pins
'364': pipes
'365': pitchers
'366': playing card boxes
'367': playing cards
'368': plinths
'369': plumb bobs
'370': plume holders
'371': poker
'372': pommels
'373': postage stamps
'374': postcards
'375': posters
'376': pots
'377': pottery
'378': prayers
'379': printing blocks
'380': printing plates
'381': prints
'382': punch bowls
'383': puppets
'384': purses
'385': puzzles
'386': pyxides
'387': quilts
'388': razors
'389': reliefs
'390': rifles
'391': rings
'392': robes
'393': roofing tile
'394': rose bowls
'395': rubbings
'396': rugs
'397': rulers
'398': sandals
'399': saris
'400': sarongs
'401': sashes
'402': sauceboats
'403': saucers
'404': saws
'405': scabbards
'406': scaraboids
'407': scarabs
'408': scepters
'409': scissors
'410': scrolls
'411': sculpture
'412': seed
'413': seppa
'414': shadow puppets
'415': shawls
'416': shears
'417': shell
'418': shelves
'419': sherds
'420': shields
'421': shoes
'422': shrines
'423': sistra
'424': situlae
'425': sketches
'426': skewers
'427': skirts
'428': snuff bottles
'429': socks
'430': spatulas
'431': spearheads
'432': spears
'433': spittoons
'434': spoons
'435': statues
'436': statuettes
'437': steelyards
'438': stelae
'439': sticks
'440': stirrup jars
'441': stools
'442': stoppers
'443': straps
'444': studs
'445': styluses
'446': sugar bowls
'447': swagger sticks
'448': swords
'449': tablets
'450': tacks
'451': talismans
'452': tallies
'453': tangrams
'454': tankards
'455': tea bowls
'456': tea caddies
'457': tea kettles
'458': teacups
'459': teapots
'460': telephones
'461': ties
'462': tiles
'463': toggles
'464': toilet caskets
'465': tools
'466': toys
'467': trays
'468': trophies
'469': trousers
'470': tubes
'471': tureens
'472': tweezers
'473': typewriters
'474': underwear
'475': unidentified
'476': urinals
'477': ushabti
'478': utensils
'479': vases
'480': veils
'481': vessels
'482': waistcoats
'483': watches
'484': weight
'485': weights
'486': whetstones
'487': whistles
'488': whorls
'489': wood blocks
'490': writing boards
- name: other_name
dtype: string
- name: material
dtype: string
- name: production.period
dtype: string
- name: production.place
dtype: string
- name: new_root
dtype: string
splits:
- name: train
num_bytes: 715994709.0
num_examples: 23100
- name: validation
num_bytes: 140436728.016
num_examples: 5436
- name: test
num_bytes: 209313987.068
num_examples: 5436
download_size: 938636292
dataset_size: 1065745424.084
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
- split: validation
path: data/validation-*
- split: test
path: data/test-*
---
Provided by:
james-burton
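Original information summary
Dataset overview
Data features
The dataset contains the following features:
- obj_num: string
- file: string
- image: image
- root: string
- description: string
- label: class label with 491 categories (ids 0 to 490), such as Aegis, Ajaeng Holder, and Album Painting
- other_name: string
- material: string
- production.period: string
- production.place: string
- new_root: string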
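The `label` feature stores an integer id, and the `names` list in the metadata maps each id to a class name. A minimal sketch of that two-way mapping in plain Python, using only the first five names from the list above; the helper names `int2str` and `str2int` are chosen here for illustration and are defined locally.

```python
# First five entries of the `names` list in the dataset metadata (ids 0-4).
NAMES = ["Aegis", "Ajaeng Holder", "Album Painting", "Amulet Mould", "Animal Figurine"]

# Reverse lookup table: class name -> integer id.
NAME_TO_ID = {name: idx for idx, name in enumerate(NAMES)}


def int2str(label_id: int) -> str:
    """Return the class name for an integer label id."""
    return NAMES[label_id]


def str2int(name: str) -> int:
    """Return the integer label id for a class name."""
    return NAME_TO_ID[name]


print(int2str(2))        # Album Painting
print(str2int("Aegis"))  # 0
```

Extending `NAMES` to all 491 entries gives the full mapping; the order of the list is what defines the ids.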
Data splits
The dataset is divided into three parts:
- train: 23100 examples, 715994709 bytes in total.
- validation: 5436 examples, 140436728.016 bytes in total.
- test: 5436 examples, 209313987.068 bytes in total.
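The split counts above imply the overall dataset size and the share of each split. A short sketch of that arithmetic, using only the numbers from the metadata:

```python
# Example counts per split, copied from the dataset metadata.
SPLITS = {"train": 23100, "validation": 5436, "test": 5436}

total = sum(SPLITS.values())

# Fraction of all examples that falls into each split.
fractions = {name: count / total for name, count in SPLITS.items()}

print(total)  # 33972
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
# train: 68.0%
# validation: 16.0%
# test: 16.0%
```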
Dataset size
- Download size: 938636292 bytes
- Total dataset size: 1065745424.084 bytes
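As a sanity check, the per-split byte counts should add up to the reported total, and the download size can be compared with it. The sketch below does both; the helper `to_mib` is a name introduced here for illustration.

```python
import math

# Byte counts copied from the dataset metadata.
SPLIT_BYTES = {
    "train": 715994709.0,
    "validation": 140436728.016,
    "test": 209313987.068,
}
DOWNLOAD_SIZE = 938636292
DATASET_SIZE = 1065745424.084


def to_mib(num_bytes: float) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


# The three split sizes should sum to dataset_size (up to float rounding).
sizes_consistent = math.isclose(
    sum(SPLIT_BYTES.values()), DATASET_SIZE, rel_tol=0.0, abs_tol=1e-3
)

# Ratio of the downloaded files to the total dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(sizes_consistent)
print(f"download: {to_mib(DOWNLOAD_SIZE):.1f} MiB, total: {to_mib(DATASET_SIZE):.1f} MiB")
print(f"download / total = {download_ratio:.3f}")
```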
Configuration
config_name: default
Data file paths:
- train: data/train-*
- validation: data/validation-*
- test: data/test-*
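Each split is tied to a glob pattern rather than a fixed file name, so any file under `data/` with the matching prefix belongs to that split. The sketch below illustrates the idea with Python's `fnmatch`; the shard file names are hypothetical, and the Hub's own pattern resolution may differ in detail from `fnmatch`.

```python
from fnmatch import fnmatch

# Glob patterns per split, copied from the `configs` section of the metadata.
PATTERNS = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}

# Hypothetical file listing, for illustration only; the real shard names
# in the repository may differ.
FILES = [
    "data/train-00000-of-00002.parquet",
    "data/train-00001-of-00002.parquet",
    "data/validation-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]


def files_for_split(split: str) -> list[str]:
    """Return the files whose path matches the split's glob pattern."""
    pattern = PATTERNS[split]
    return [path for path in FILES if fnmatch(path, pattern)]


for split in PATTERNS:
    print(split, files_for_split(split))
```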
Collected summary
Dataset introduction

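Construction
In the field of cultural heritage digitization, the Oriental Museum collection dataset was built through systematic collection and annotation. It draws more than 20,000 object images from the museum's holdings (33,972 examples across the three splits, per the metadata above), each with metadata including the object number, image file, description text, and classification labels along several dimensions. Construction involved a rigorous data cleaning process to ensure image quality and annotation consistency, and the records are organized by attributes such as object type, material, production period, and place of production. The dataset is divided into training, validation, and test sets, intended to give machine learning models a balanced and representative sample distribution.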
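Characteristics
The dataset's core feature is the breadth of its object categories: nearly five hundred fine-grained classes (491 in the label list), ranging from ancient weapons and ritual objects to everyday utensils and artworks, reflecting the diversity of the material culture of Eastern civilizations. Each record combines a high-resolution image, multilingual description text, and a standardized class label, forming a multimodal structure. The annotation covers not only object name and material but also production period and place of production, which provides deeper semantic links for cross-cultural comparative research. This detailed and systematic annotation gives the dataset high academic value and application potential for cultural heritage recognition, classification, and retrieval tasks.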
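Usage
The dataset suits work at the intersection of computer vision and cultural heritage research; users can load the standard splits for model training and evaluation. For image classification, its rich class labels can be used to train deep neural networks that automatically recognize object types; for multimodal learning, images and description text can be combined for cross-modal retrieval or generation tasks. Researchers can also use metadata such as material, period, or place of production for finer-grained studies, for example analyzing the style of objects from a particular historical period. The dataset is provided in a format compatible with the Hugging Face platform and can be loaded directly with standard data loading tools, making it easy to integrate into existing machine learning pipelines.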
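A minimal loading sketch, assuming the Hugging Face `datasets` library is installed and that the repository is reachable on the Hub under the id shown in the page title. The function names `load_split` and `describe_first_example` are introduced here for illustration; nothing is downloaded until one of them is called.

```python
REPO_ID = "james-burton/OrientalMuseum_min5-white-name"
SPLITS = ("train", "validation", "test")


def load_split(split: str = "train"):
    """Load one split of the dataset from the Hugging Face Hub."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


def describe_first_example(split: str = "train") -> str:
    """Return a one-line summary of the first example in a split."""
    ds = load_split(split)
    example = ds[0]
    # `label` is stored as an integer id; the ClassLabel feature maps it to a name.
    label_name = ds.features["label"].int2str(example["label"])
    return f"{example['obj_num']}: {label_name} ({example['material']})"
```

Calling `describe_first_example("train")` fetches the data on first use (roughly 0.9 GB according to the download size above), so it is best run once and cached.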
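Background and challenges
Background overview
In research at the intersection of cultural heritage digitization and artificial intelligence, the systematic annotation and recognition of Oriental Museum collections has become a key topic. The dataset OrientalMuseum_min5-white-name was built by the researcher James Burton and focuses on multimodal classification of Eastern art and archaeological objects. It contains images and metadata for more than 490 object classes spanning ceramics, metalwork, textiles, paintings, and more, with the aim of using machine learning to automatically identify complex object forms and functions. Its creation has advanced digital humanities work on intelligent archiving of artifacts and cross-cultural comparative research, and it provides a structured data foundation for the convergence of museology and computer vision.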
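Current challenges
The dataset targets the challenge of fine-grained classification of Eastern artifacts. Its class system spans multiple dimensions such as object form, function, and material, requiring models to distinguish highly similar objects (for example, pottery jars from different periods or decorative jades). During construction, the main challenge came from the complexity of artifact annotation: object names in the raw data mixed languages, used varying historical terminology, and were described inconsistently, which required extensive name cleaning and standardization. In addition, variation in lighting, viewing angle, and state of preservation across the images further complicates data normalization and feature extraction.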
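The label list itself shows the naming inconsistency described here, with pairs such as `Box` / `boxes` and `Earring` / `earrings` kept as separate classes. The sketch below flags such near-duplicates with a deliberately naive rule (lowercase plus simple plural stripping). It is an illustration only, not the cleaning procedure used to build the dataset, which the source does not describe; `normalize` and `find_collisions` are hypothetical helpers.

```python
from collections import defaultdict

# A few class names taken from the label list in the metadata.
SAMPLE = [
    "Box", "boxes", "Case", "cases", "Earring", "earrings",
    "Pendant", "pendants", "Shield", "shields", "weight", "weights",
    "Dagger",
]


def normalize(name: str) -> str:
    """Lowercase a name and strip a simple English plural ending."""
    key = name.strip().lower()
    if key.endswith(("xes", "ches", "shes", "sses")):
        return key[:-2]  # boxes -> box
    if key.endswith("s") and not key.endswith("ss"):
        return key[:-1]  # cases -> case
    return key


def find_collisions(names: list[str]) -> dict[str, list[str]]:
    """Group names by normalized key and keep groups with more than one member."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


for key, members in find_collisions(SAMPLE).items():
    print(key, members)
```

A rule this simple would mis-handle irregular plurals and terms such as `alabastra` or `stelae`, so any real merge of classes would need manual review.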
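Common scenarios
Classic use cases
At the intersection of cultural heritage digitization and artificial intelligence, the OrientalMuseum_min5-white-name dataset is a key resource for automatic classification and recognition of museum collections. It brings together images and metadata for more than 490 classes of objects from the Oriental Museum, covering ancient artifacts through to modern and contemporary artworks. Its classic use case is training deep learning models for multi-class artifact image classification. Researchers use the dataset to build convolutional neural networks or vision Transformer models that predict the class of high-resolution artifact images, laying the groundwork for automated cataloguing and indexing.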
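Derived related work
Several lines of research have grown out of this dataset, including attention-based models for fine-grained artifact classification, multimodal fusion frameworks that combine metadata (such as material and period) for recognition, and transfer learning across museum collections. This work has improved artifact classification accuracy and has also explored derived tasks such as artifact style transfer and restoration of damaged images. Some studies further extend classification models to dating artifacts or tracing their place of origin, advancing computational archaeology.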
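Recent research on the dataset
Latest research directions
In cultural heritage digitization, the OrientalMuseum_min5-white-name dataset, with its rich Oriental Museum collection images and multi-dimensional annotations, is a key resource for computer vision and digital humanities research. Current frontier work focuses on using deep learning models for cross-modal retrieval and fine-grained classification, exploring how joint modeling of images and text descriptions (such as material, period, and place of production) can enable automatic artifact recognition and knowledge linking. This direction is closely tied to the global wave of museum digitization and to current interest in AI-assisted cultural heritage preservation. It supports the development of intelligent curation and virtual exhibition technology and offers new technical routes for provenance research on historical artifacts and for public education, giving it both academic and social value.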
The content above was collected and summarized by 遇见数据集.



