
Simulated sPHENIX Time-Projection Chamber (TPC) Data in Central Au-Au Collisions at sqrt[s] = 200 GeV, outer layer group

Mendeley Data | Updated 2024-05-10 | Indexed 2024-06-27
Download link:
https://zenodo.org/records/10028587
Description:
This is the dataset we used to train the 2D and 3D Bicephalous Convolutional Autoencoders (BCAEs) described in "Fast 2D Bicephalous Convolutional Autoencoder for Compressing 3D Time Projection Chamber Data", published in the 9th International Workshop on Data Analysis and Reduction for Big Scientific Data (https://drbsd.github.io/). To untar the file, run `tar -xvzf outer.tgz`.

The Time Projection Chamber (TPC) is a hollow cylinder. Along the radial dimension, the TPC is composed of 48 cylindrical layers of small sensors, which are grouped into three layer groups: inner, middle, and outer. Each layer group has 16 consecutive layers. On each TPC layer, the voxels are arranged as a rectangular grid with rows along the z (or horizontal) direction and columns along the azimuthal direction. Within one layer group, all layers have the same number of rows and columns, which allows us to represent the ADC values from one layer group as a 3D array.

The data released here cover the outer layer group, where the array of ADC values has shape (16, 2304, 498) in radial, azimuthal, and horizontal order. The full voxel data are divided into 24 equal-size, non-overlapping sections: 12 along the azimuthal direction (30 degrees per section) and 2 along the horizontal direction (divided by the transverse plane passing through the collision point). We call one such section a TPC wedge. The array of ADC values from each TPC wedge in the outer layer group has shape (16, 192, 249) in radial, azimuthal, and horizontal order. The TPC wedges are the direct input to the deep neural network compression algorithms.

We simulated 1310 events for central sqrt[s] = 200 GeV Au-Au collisions with 170 kHz pile-up. The data were generated with the HIJING event generator and the Geant4 Monte Carlo detector simulation package, integrated with the sPHENIX software framework.
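The wedge layout described above can be sketched with NumPy slicing. This is a minimal illustration, not the authors' code: the function name `split_into_wedges` is hypothetical, the all-zero array is placeholder data, and the mapping from section index to a contiguous slice of the array is an assumption that the description does not confirm.

```python
import numpy as np

# Shapes taken from the dataset description (radial, azimuthal, horizontal).
LAYER_GROUP_SHAPE = (16, 2304, 498)
N_AZIMUTHAL_SECTIONS = 12   # 30 degrees per section
N_HORIZONTAL_SECTIONS = 2   # split at the transverse plane through the collision point


def split_into_wedges(layer_group):
    """Split one layer-group array into wedges keyed by (azimuthal, horizontal) index.

    Assumption: section k covers the k-th contiguous block of indices along its axis.
    """
    _, n_azimuthal, n_horizontal = layer_group.shape
    az_size = n_azimuthal // N_AZIMUTHAL_SECTIONS     # 2304 / 12 = 192
    hz_size = n_horizontal // N_HORIZONTAL_SECTIONS   # 498 / 2 = 249
    wedges = {}
    for a in range(N_AZIMUTHAL_SECTIONS):
        for h in range(N_HORIZONTAL_SECTIONS):
            wedges[(a, h)] = layer_group[
                :,
                a * az_size:(a + 1) * az_size,
                h * hz_size:(h + 1) * hz_size,
            ]
    return wedges


# Placeholder data standing in for one event's outer layer group.
event = np.zeros(LAYER_GROUP_SHAPE, dtype=np.uint16)
wedges = split_into_wedges(event)
print(len(wedges), wedges[(0, 0)].shape)
```

Each returned wedge is a view into the original array, so the split itself copies no data. The released files are already stored per wedge, so a split like this would only be needed when starting from a full layer-group array.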
The simulated TPC readout (ADC values) from these events is represented as 10-bit unsigned integers in [0, 1023]. To reduce unnecessary data transmission between detector pixels and front-end electronics, a zero-suppression algorithm has been applied: ADC values below 64 are set to zero, as most of them are noise. Zero suppression makes the TPC data sparse, at about 10% nonzero occupancy.

We divide the 1310 total events into 1048 events for training and 262 for testing. Each event contains 24 outer-layer wedges, so the training partition contains 25152 TPC outer-layer wedges and the testing partition contains 6288. The compression algorithm compresses each wedge independently.

The dataset has the following structure. There are 24 subfolders named `12-2_[azimuthal section]-[horizontal section]`, where [azimuthal section] is an integer in [0, 11] and [horizontal section] is either 0 or 1. Each file in a subfolder is named in the format `AuAu200_170kHz_10C_Iter2_[simulation id].xml_TPCMLDataInterface_[event id within simulation].npy`. There are 131 simulations, and each simulation contains 10 independent events (hence the 1310 total events mentioned above). Each [event id within simulation] is an integer in [0, 9].

train.txt: a list of all TPC wedges for the training split. text.txt: a list of all TPC wedges for the test split. Note that the dataset is split by event: if a TPC wedge from an event is in the train split, all 24 wedges from that event are in the train split. The same holds for the test split.
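The zero-suppression rule and the occupancy figure quoted above can be illustrated with a short NumPy sketch. The helper names `zero_suppress` and `occupancy` are hypothetical, and the six-element array is made-up data chosen to straddle the threshold; it does not reproduce the roughly 10% occupancy of the real wedges.

```python
import numpy as np

ADC_MAX = 1023       # 10-bit unsigned range is [0, 1023]
ZS_THRESHOLD = 64    # ADC values below this are set to zero


def zero_suppress(adc, threshold=ZS_THRESHOLD):
    """Return a copy of `adc` with every value below `threshold` set to zero."""
    out = adc.copy()
    out[out < threshold] = 0
    return out


def occupancy(adc):
    """Fraction of voxels that are nonzero."""
    return np.count_nonzero(adc) / adc.size


# Made-up ADC values around the threshold (not real detector data).
raw = np.array([0, 10, 63, 64, 65, ADC_MAX], dtype=np.uint16)
suppressed = zero_suppress(raw)
print(suppressed.tolist(), occupancy(suppressed))
```

The released wedges are described as already zero-suppressed, so on real files only a check such as `occupancy(np.load(path))` would be needed, where `path` is any wedge file from the archive.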
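The naming scheme and the event-level split can be checked with a small parser. This is a sketch under stated assumptions: the description does not specify the format of [simulation id] (an integer is assumed here), nor how the split lists write each entry (a `subfolder/filename` relative path is assumed). The example path and its simulation id `3050` are invented for illustration.

```python
import re

WEDGES_PER_EVENT = 12 * 2   # 12 azimuthal sections x 2 horizontal sections

# Assumption: entries look like "<subfolder>/<filename>" and the simulation id is an integer.
WEDGE_PATTERN = re.compile(
    r"^12-2_(?P<azimuthal>\d+)-(?P<horizontal>[01])/"
    r"AuAu200_170kHz_10C_Iter2_(?P<simulation>\d+)\.xml"
    r"_TPCMLDataInterface_(?P<event>\d)\.npy$"
)


def parse_wedge_path(path):
    """Extract section indices, simulation id, and event id from a wedge path."""
    match = WEDGE_PATTERN.match(path)
    if match is None:
        raise ValueError(f"unrecognized wedge path: {path!r}")
    fields = {name: int(value) for name, value in match.groupdict().items()}
    if not 0 <= fields["azimuthal"] <= 11:
        raise ValueError(f"azimuthal section out of range in {path!r}")
    return fields


def event_key(path):
    """Identify the event a wedge belongs to as (simulation id, event id)."""
    fields = parse_wedge_path(path)
    return (fields["simulation"], fields["event"])


def splits_are_disjoint_by_event(train_paths, test_paths):
    """True if no event contributes wedges to both splits."""
    train_events = {event_key(p) for p in train_paths}
    test_events = {event_key(p) for p in test_paths}
    return train_events.isdisjoint(test_events)


# Invented example path; the simulation id 3050 is hypothetical.
example = "12-2_3-1/AuAu200_170kHz_10C_Iter2_3050.xml_TPCMLDataInterface_7.npy"
fields = parse_wedge_path(example)
print(fields)

# Counts quoted in the description.
print(131 * 10, 1048 * WEDGES_PER_EVENT, 262 * WEDGES_PER_EVENT)
```

Grouping by `(simulation id, event id)` rather than by file name is what lets the check confirm that all 24 wedges of an event fall in the same split.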
Created:
2023-10-26