Data from: Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence
Source: Mendeley Data. Updated: 2024-04-17. Indexed: 2024-06-28.
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.m63xsj497
Description:
# Data from: Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence

[https://doi.org/10.5061/dryad.m63xsj497](https://doi.org/10.5061/dryad.m63xsj497)

This repository includes the following datasets:

**Dataset 1:** CIFAR-10 dataset with an image size of 80 by 80 pixels.

**Dataset 2:** Mini ImageNet dataset for 100-category classification. We chose 100 categories from the 1000 categories of the original ILSVRC2012 ImageNet dataset. All images in this modified dataset have a resolution of 64 by 64 pixels. The category IDs of the selected 100 categories are: 22, 29, 30, 33, 42, 48, 54, 72, 94, 101, 104, 119, 125, 135, 147, 148, 163, 168, 175, 194, 197, 206, 242, 245, 246, 247, 258, 273, 281, 291, 302, 309, 318, 323, 333, 334, 338, 363, 367, 371, 382, 397, 425, 431, 434, 445, 461, 463, 466, 477, 491, 492, 499, 501, 506, 527, 536, 557, 561, 570, 579, 585, 593, 608, 612, 617, 621, 629, 631, 648, 690, 697, 732, 733, 741, 748, 759, 766, 768, 788, 789, 791, 799, 816, 841, 852, 862, 869, 875, 885, 907, 911, 922, 927, 932, 936, 941, 953, 982, 990. The IDs are identical to those in the original ILSVRC2012 ImageNet dataset.

**Dataset 3:** Omniglot character dataset. The original is available at '[https://github.com/brendenlake/omniglot](https://github.com/brendenlake/omniglot)' and contains 1623 categories.

**Dataset 4:** Bach Chorales dataset. The original is available at '[https://archive.ics.uci.edu/dataset/25/bach+chorales](https://archive.ics.uci.edu/dataset/25/bach+chorales)'. This dataset is used for training the content generation network.

---

The files in this repository are six separate compressed files for the four datasets. Please download and extract them before use.

For Dataset 1 (Dataset1_CIFAR10.zip), you will find four NumPy .npy files containing the training and testing data along with their ground truth data. Use the numpy.load() function to read them.
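
As a minimal sketch of the numpy.load() step for Dataset 1, the snippet below loads every .npy file found in the extracted folder and reports each array's shape and dtype. The README does not list the four file names, so nothing here relies on them. The folder path `Dataset1_CIFAR10` and the helper names `load_npy_folder` and `summarize` are assumptions made for illustration and are not part of the repository.

```python
from pathlib import Path

import numpy as np


def load_npy_folder(folder):
    """Load every .npy file in `folder` into a dict keyed by file stem."""
    arrays = {}
    for path in sorted(Path(folder).glob("*.npy")):
        # allow_pickle=False is NumPy's default. It is kept explicit here;
        # if a file stores Python objects instead of plain numeric arrays,
        # this call raises an error rather than unpickling untrusted data.
        arrays[path.stem] = np.load(path, allow_pickle=False)
    return arrays


def summarize(arrays):
    """Return {name: (shape, dtype)} to help tell data from labels."""
    return {name: (arr.shape, str(arr.dtype)) for name, arr in arrays.items()}


if __name__ == "__main__":
    # Hypothetical extraction path; change it to match your file structure.
    for name, info in summarize(load_npy_folder("Dataset1_CIFAR10")).items():
        print(name, info)
```

Inspecting the shapes first should make it clear which two arrays hold the 80 by 80 images and which two hold the ground truth data, without assuming any naming scheme.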
For Dataset 2 (Dataset2_MinilmageNet.zip, Dataset2_MinilmageNet.z01, Dataset2_MinilmageNet.z02), you will find a separate validation data file, a folder for all training data (in 10 separate files), and a .py script to load these data into your programme. The required Python packages for this script are listed in the first lines of the script itself. (Note that you need to download all three archive parts and then extract them together.) Please change the file path in the Python script to match your file structure.

For Dataset 3 (Dataset3_Omniglot.zip), you will find a folder containing the character images and character names, and a separate .py script to load the images. The data structure in the extracted 'Omniglot_Full' folder is 'dataset name - language name - character sequence - images for this character'. The attached Python script will automatically read the entire dataset and form the 'input data - ground truth' pairs for network training. Please change the file path in the Python script to match your file structure.

For Dataset 4 (Dataset4_BachChorales.zip), you will find a folder containing 413 clips of original Bach chorales in .mxl format. These files can be processed using a variety of music processing software or Python-callable packages, such as MuseScore. The notes in each music clip can be mapped to a series of vectors to fit the neural network scale.

---

Even though the datasets in this repository are all public datasets, the actual files are slightly different from the original ones. The image or music clip sizes are modified and unified to match our photonic chip tests. Please cite the original paper for each dataset as well as this repository if you consider using these data in your research. Thank you very much.

I, the copyright holder of this work, hereby publish it under the following license: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.
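
To illustrate the Dataset 3 folder layout described above ('dataset name - language name - character sequence - images for this character'), the following sketch walks such a tree and builds (image path, integer label) pairs. It is not the loading script shipped in the archive. The function name `index_omniglot`, the accepted image suffixes, and the choice to treat each language/character pair as one class are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed image extensions; adjust if the extracted files use another format.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def index_omniglot(root):
    """Index a tree laid out as root/<dataset name>/<language>/<character>/<image>.

    Returns (samples, class_names), where samples is a list of
    (image_path, label) pairs and class_names[label] is "<language>/<character>".
    """
    samples = []
    class_names = []
    label_of = {}
    # Exactly four levels below root: dataset name, language, character, image.
    for path in sorted(Path(root).glob("*/*/*/*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        language, character = path.parts[-3], path.parts[-2]
        key = f"{language}/{character}"
        if key not in label_of:
            # Assign labels in the order classes are first encountered.
            label_of[key] = len(class_names)
            class_names.append(key)
        samples.append((path, label_of[key]))
    return samples, class_names


if __name__ == "__main__":
    # Hypothetical extraction path; change it to match your file structure.
    samples, class_names = index_omniglot("Omniglot_Full")
    print(len(samples), "images in", len(class_names), "classes")
```

The index only records paths and labels, so decoding the images can be left to whichever image library the training pipeline already uses.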
Created:
2024-04-13



