PHITS simulations of neutron and gamma-ray production from and transport of 70–250 MeV protons in heterogeneous 1D tissue phantoms
DataCite Commons | Updated 2026-02-19 | Indexed 2026-05-05
Download link:
https://rodare.hzdr.de/record/3997
Resource description:
<strong>Introduction</strong>
This dataset corresponds to the PHITS simulation data used in "Fast Phase Space Reconstruction for Proton Beam Traversal and Neutron Emission in Proton Therapy using Fourier Neural Operators".<br>
A concise description of the simulation setup is provided here; please refer to the paper for detailed discussion, description, analysis, and further results derived from this dataset.
<br>
<strong>Description of simulations</strong>
This dataset consists of PHITS simulations for 47 different proton energies from 70 MeV to 250 MeV incident upon different "1D" heterogeneous cylindrical phantoms (materials varied every 0.5 mm along the length, uniform radially and rotationally). The phantom compositions (materials and their sequence along the length) are taken from randomly sampled rays cast through a 3D CT phantom [CITE], with CT number mapped to material composition and density via the <em>HumanVoxelTable-KumamotoUniv.data</em> conversion table within the RT-PHITS utility distributed with PHITS. <br>
Included tallies score spatial distributions of energy deposition, LET, proton current (with an additional angular dimension), neutron production, gamma-ray production, and a variety of diagnostic tallies. <br>
Event-by-event "list-mode" data are also scored for neutron and gamma-ray production, using what PHITS calls "dump" tallies.
Because these simulations were intended for AI model development, the 47 energies are divided into 37 <strong>training</strong> energies (70 MeV to 250 MeV in 5 MeV steps) and 10 <strong>testing</strong> energies (73 MeV to 245.8 MeV in 19.2 MeV steps). <br>
For each energy, two simulations were run: (1) one with <strong>1E8</strong> protons in which all <strong>tallies</strong> (including <strong>dump</strong> tallies) were enabled and (2) one with <strong>1E9</strong> protons (available on request) in which only <strong>dump</strong> tallies were enabled (other tallies were disabled to reduce memory consumption and increase simulation speed). <br>
Furthermore, all of the above was actually performed twice: (1) initially with purely <strong>monoenergetic</strong> beam energies and with a spatial spread of 2.5 mm and (2) a second "more realistic" set with <strong>Gaussian-distributed</strong> energies (with energy-dependent FWHM) and slightly wider 4.0 mm beam spread.
All simulation outputs were automatically processed from the plaintext and binary files produced by PHITS into compressed pickle file objects (NumPy arrays, Pandas DataFrames, dictionaries) using the PHITS Tools Python utility. <br>
These Python objects were then utilized in the subsequent analysis of the paper this simulation set was generated for.
<br>
<strong>Structure of this repository</strong>
The volume of data in this repository is substantial (~700 GB). <br>
The repository has therefore been structured so that only the data of interest need to be downloaded.
The root directory of this repository consists of 39 top-level directories whose names indicate their contents.<br>
Within each are two directories: <em>training</em> and <em>testing</em>.<br>
Within each of these are directories of the format <em>???_MeV</em>, where <em>???</em> is replaced by three digits specifying the nominal beam energy in MeV. <br>
(This is <em>???p?</em> for the energies of the testing dataset, with <em>p</em> in place of a decimal point.)<br>
Thus, each <em>training</em> directory contains 37 subdirectories, and each <em>testing</em> directory contains 10 subdirectories.<br>
(One should note that there are no setup differences between <em>training</em> and <em>testing</em> data; they are simply divided here in the same way as in the paper.)<br>
Each <em>???_MeV</em>/<em>???p?_MeV</em> directory contains simulation input/output and/or PHITS Tools processed output, depending on the top-level directory it is contained within.<br>
Input and output file names do not differ between different energies; directory structure is used to keep them distinguished/separated.
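The energy grid and directory naming described above can be sketched as follows. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and the zero-padding of energies below 100 MeV (e.g. <em>070_MeV</em>) is an assumption based on the "three digits" wording.

```python
def training_dirs():
    """37 training energies: 70 to 250 MeV in 5 MeV steps."""
    # Assumption: energies below 100 MeV are zero-padded to three digits.
    return [f"{energy:03d}_MeV" for energy in range(70, 251, 5)]


def testing_dirs():
    """10 testing energies: 73.0 to 245.8 MeV in 19.2 MeV steps."""
    names = []
    for i in range(10):
        # Work in integer tenths of an MeV to avoid floating-point drift.
        tenths = 730 + 192 * i
        whole, frac = divmod(tenths, 10)
        # 'p' stands in for the decimal point, as in the ???p?_MeV pattern.
        names.append(f"{whole:03d}p{frac}_MeV")
    return names


if __name__ == "__main__":
    print(training_dirs()[:3], "...", training_dirs()[-1])
    print(testing_dirs())
```

Working in tenths of an MeV keeps the testing energies exact, which matters when the numbers are turned into directory names.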
<strong>PHITS input information</strong>
One top-level directory, <em>common_inputs</em>, differs from all of the others.<br>
As the name suggests, this directory contains all PHITS input information used in generating all of the simulation outputs.
The core two PHITS input files used are <em>beam-on-target_phits-input_MonoE.inp</em> for the monoenergetic beam simulation set and <em>beam-on-target_phits-input_GaussE.inp</em> for the Gaussian-distributed beam energy simulation set.<br>
Within these inputs are lines using the PHITS insert file function <em>infl:{*}</em>; all inserted files used in the PHITS simulations are also contained within this <em>common_inputs</em> directory.<br>
The single exception to this is <em>PARAMETERS_files-1-and-7.txt</em>, which is simply the <em>file(1)</em> and <em>file(7)</em> PHITS <em>[Parameters]</em> arguments and will be system-specific paths to PHITS installation/data files.<br>
Also note that relative paths are used in the <em>infl:{*}</em> commands; these relative paths do not match the structure of this repository, which was reorganized after the simulations were completed for distribution convenience. <br>
File names are still unique and can be found in this <em>common_inputs</em> directory.<br>
The <em>CELL</em> subdirectory contains the <em>[Cell]</em> sections used for the varied phantom compositions, and the <em>MAPPINGS_OF_ENERGY_TO_CELL_FILES.csv</em> file details how these files are paired with the 47 different beam energies.
<strong>PHITS outputs (raw and processed)</strong>
The remaining 38 top-level directories contain simulation/processed output.<br>
When these simulations were run, all output was contained in each <em>???_MeV</em> directory. <br>
As detailed earlier, these have been split into various top-level directories here to allow more convenient download of only desired files.<br>
Nominally, each of these <em>???_MeV</em> directories contained the following before being split:
a <em>beam-on-target_phits-input.inp</em> PHITS input file (and a simple <em>phits.in</em> pointing to this input file, needed for parallel running of PHITS); note that these inputs have all specific source energy information populated within this file
a <em>phantom_composition_info.csv</em> file also detailing the phantom composition used for that beam energy
<em>phits*.out</em> file(s), <strong>raw</strong> summary output files generated by PHITS
<em>*.out</em> <strong>raw</strong> plaintext tally output files from PHITS
<em>*.eps</em> <strong>graphical</strong> visualizations of tally output, generated by PHITS
<em>*_dmp.out*</em> <strong>raw</strong> binary tally <strong>dump</strong> files from PHITS
<em>*.pickle.xz</em> <strong>processed</strong> tally output (and <em>phits.out</em> metadata) from PHITS Tools, LZMA-compressed pickle files
<em>*_dmp_namedtuple_list.pickle.xz</em> <strong>processed</strong> tally <strong>dump</strong> output from PHITS Tools, formatted as a NumPy record array (np.recarray)
<em>*_dmp_Pandas_df.pickle.xz</em> <strong>processed</strong> tally <strong>dump</strong> output from PHITS Tools, formatted as a Pandas DataFrame (same numerical data as in NumPy recarray)
<em>*.png</em> and <em>*.pdf</em> <strong>graphical</strong> visualizations of tally output, generated by PHITS Tools
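Since the processed files are LZMA-compressed pickle files, they can in principle be read with the Python standard library alone. The sketch below is one way to do so, not necessarily how PHITS Tools itself reads them; <em>load_pickle_xz</em> and <em>roundtrip_demo</em> are hypothetical helper names, and the demo object is a stand-in rather than real PHITS output.

```python
import lzma
import pickle
import tempfile
from pathlib import Path


def load_pickle_xz(path):
    """Return the Python object stored in an LZMA-compressed pickle file."""
    with lzma.open(path, "rb") as handle:
        return pickle.load(handle)


def roundtrip_demo():
    """Write a small stand-in object to a temporary .pickle.xz file and read it back."""
    example = {"tally": "stand-in", "values": [1.0, 2.0, 3.0]}  # not real PHITS output
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.pickle.xz"  # hypothetical file name
        with lzma.open(path, "wb") as handle:
            pickle.dump(example, handle)
        return load_pickle_xz(path)


if __name__ == "__main__":
    print(roundtrip_demo())
```

Unpickling the NumPy record arrays and Pandas DataFrames in this dataset would additionally require NumPy and Pandas to be installed. As with any pickle file, only load files from a source you trust.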
The top-level directories of this repository are named in a way to detail (1) which simulations their contents pertain to and (2) which output files are contained within them.<br>
The directories are named using an underscore-delimited pattern whose components have the following names and meanings:
Beam type:
<strong>MonoE</strong> refers to simulations with the monoenergetic beams with 2.5 mm spread
<strong>GaussE</strong> refers to simulations with the Gaussian-distributed energies and 4.0 mm spread
Simulated number of protons:
<strong>1E8</strong> refers to simulations with 10<sup>8</sup> (one hundred million) protons simulated
<strong>1E9</strong> refers to simulations with 10<sup>9</sup> (one billion) protons simulated (only available on request)
Output source/type:
<strong>raw</strong> refers to the PHITS input and PHITS-generated output
<strong>processed</strong> refers to the Python-formatted processed output produced by PHITS Tools
<strong>plots</strong> refers to the <em>*.eps</em> files produced by PHITS and the <em>*.png</em> and <em>*.pdf</em> files produced by PHITS Tools, all containing graphical plots of tally output (only relevant to <strong>1E8</strong> simulations)
Other labels:
<strong>proton-tally</strong> refers to output from the huge <em>[T-Cross]</em> tally used only in <strong>1E8</strong> simulations for scoring proton phase space as a function of energy, position, and direction (separated from others owing to its considerable size)
<strong>neutron-dump</strong> refers to the event-by-event neutron production data scored by a <em>[T-Product]</em> tally's "dump" option
<strong>NumPy</strong> and <strong>Pandas</strong> denote whether <strong>processed</strong> contents are formatted as NumPy record arrays or Pandas DataFrames
<strong>gamma-dump</strong> refers to the event-by-event gamma-ray production data scored by a <em>[T-Product]</em> tally's "dump" option
<strong>NumPy</strong> and <strong>Pandas</strong> denote whether <strong>processed</strong> contents are formatted as NumPy record arrays or Pandas DataFrames
<strong>other</strong> refers to output from all other tallies aside from the above three (energy deposition, LET, diagnostic tallies, etc.; only relevant to <strong>1E8</strong> simulations given all tallies except dump tallies were disabled for <strong>1E9</strong> simulations) along with (for <strong>raw</strong> directories) PHITS input-related files and <em>phits*.out</em> file(s).
All put together, this results in the following top-level directories contained in this repository:
<em>common_inputs</em>
<em>GaussE_1E8_raw_proton-tally</em>
<em>GaussE_1E8_raw_neutron-dump</em>
<em>GaussE_1E8_raw_gamma-dump</em>
<em>GaussE_1E8_raw_other</em>
<em>GaussE_1E9_raw_neutron-dump <strong>(upon request)</strong></em>
<em>GaussE_1E9_raw_gamma-dump <strong>(upon request)</strong></em>
<em>GaussE_1E9_raw_other</em>
<em>GaussE_1E8_processed_proton-tally</em>
<em>GaussE_1E8_processed_neutron-dump_NumPy</em>
<em>GaussE_1E8_processed_neutron-dump_Pandas</em>
<em>GaussE_1E8_processed_gamma-dump_NumPy</em>
<em>GaussE_1E8_processed_gamma-dump_Pandas</em>
<em>GaussE_1E8_processed_other</em>
<em>GaussE_1E9_processed_neutron-dump_NumPy <strong>(upon request)</strong></em>
<em>GaussE_1E9_processed_neutron-dump_Pandas<strong> (upon request)</strong></em>
<em>GaussE_1E9_processed_gamma-dump_NumPy<strong> (upon request)</strong></em>
<em>GaussE_1E9_processed_gamma-dump_Pandas <strong>(upon request)</strong></em>
<em>GaussE_1E9_processed_other </em>
<em>GaussE_1E8_plots</em>
<em>MonoE_1E8_raw_proton-tally</em>
<em>MonoE_1E8_raw_neutron-dump</em>
<em>MonoE_1E8_raw_gamma-dump</em>
<em>MonoE_1E8_raw_other</em>
<em>MonoE_1E9_raw_neutron-dump<strong> (upon request)</strong></em>
<em>MonoE_1E9_raw_gamma-dump <strong>(upon request)</strong></em>
<em>MonoE_1E9_raw_other</em>
<em>MonoE_1E8_processed_proton-tally</em>
<em>MonoE_1E8_processed_neutron-dump_NumPy</em>
<em>MonoE_1E8_processed_neutron-dump_Pandas</em>
<em>MonoE_1E8_processed_gamma-dump_NumPy</em>
<em>MonoE_1E8_processed_gamma-dump_Pandas</em>
<em>MonoE_1E8_processed_other</em>
<em>MonoE_1E9_processed_neutron-dump_NumPy <strong>(upon request)</strong></em>
<em>MonoE_1E9_processed_neutron-dump_Pandas <strong>(upon request)</strong></em>
<em>MonoE_1E9_processed_gamma-dump_NumPy <strong>(upon request)</strong></em>
<em>MonoE_1E9_processed_gamma-dump_Pandas<strong> (upon request)</strong></em>
<em>MonoE_1E9_processed_other</em>
<em>MonoE_1E8_plots</em>
And, as stated earlier, each of these top-level directories is divided into a <em>training</em> subdirectory (containing 37 <em>???_MeV</em> directories) and a <em>testing</em> subdirectory (containing 10 <em>???p?_MeV</em> directories), where each <em>???[p?]_MeV</em> directory contains only (1) particular files (2) relevant to certain simulations, as specified by the top-level directory's name.
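The naming pattern can be checked by rebuilding the list of top-level directories from its components. The following is a minimal sketch with a hypothetical function name; it encodes only the rules stated above (the proton tally and plots exist only for <strong>1E8</strong> simulations, and processed dump output comes in NumPy and Pandas variants).

```python
def top_level_directories():
    """Rebuild the 39 top-level directory names from the naming components."""
    dump_labels = ["neutron-dump", "gamma-dump"]
    names = ["common_inputs"]
    for beam in ("GaussE", "MonoE"):
        for kind in ("raw", "processed"):
            for count in ("1E8", "1E9"):
                # The [T-Cross] proton tally was used only in 1E8 simulations.
                labels = ["proton-tally"] if count == "1E8" else []
                if kind == "raw":
                    labels += dump_labels
                else:
                    # Processed dump output is provided in two formats.
                    labels += [f"{d}_{fmt}" for d in dump_labels
                               for fmt in ("NumPy", "Pandas")]
                labels.append("other")
                names += [f"{beam}_{count}_{kind}_{label}" for label in labels]
        # Plots are only relevant to 1E8 simulations.
        names.append(f"{beam}_1E8_plots")
    return names


if __name__ == "__main__":
    for name in top_level_directories():
        print(name)
```

Each beam type contributes 19 directories, which together with <em>common_inputs</em> gives the 39 top-level directories. The sketch does not track which directories are available only upon request.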
As a note to anyone surveying the <em>raw</em> files, all <em>GaussE</em> simulations were run with OpenMP parallelization with 10 processes.<br>
For <em>1E8</em> simulations, this was conducted as ten PHITS runs of 1E7 protons each; for <em>1E9</em> simulations, this was conducted as twenty runs of 5E7 protons each.<br>
(PHITS runs can be "chained" as "restart calculations", where one run can resume from where a previous run ended.)<br>
In these simulations, the generated <em>phits.out</em> files from each run were renamed to <em>phits-#.out</em> (where <em>#</em> is the run number, <em>0</em> to <em>19</em>) and moved into a <em>phitsout</em> subdirectory after each run's completion.<br>
However, this was less uniform for the <em>MonoE</em> simulations; for those, the strategy was to complete each simulation in a single run of PHITS. <br>
This generally involved using a hybrid OpenMP + MPI parallelization with anywhere from 80 to 160 processes each, split between OMP and MPI (noting that some <em>1E9</em> runs were conducted with only MPI parallelization).<br>
None of this influences the output format of the standard tally outputs.<br>
However, the number of dump files produced is equal to the number of MPI processes utilized.<br>
This means that each <em>GaussE</em> simulation has only one dump file per dump tally, owing to its exclusive use of OpenMP parallelization (which merges its dump files at the end of the calculation), while the <em>MonoE</em> simulations contain a varied number of dump files per dump tally owing to variations in the parallelization strategies employed in those simulations.<br>
PHITS Tools ultimately merges all dump outputs back together in its processing, so this quirk of how the simulations were conducted should not be apparent in the <em>processed</em> output.
<br>
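The run chaining used for the <em>GaussE</em> simulations can be sketched as simple bookkeeping. This is an illustration only: the function name is hypothetical, and the assumption that <strong>1E8</strong> simulations use run numbers 0 to 9 is inferred from the ten-run split rather than stated explicitly.

```python
def restart_chain(total_protons, protons_per_run):
    """List the renamed summary files expected from a chain of restart runs."""
    runs, remainder = divmod(total_protons, protons_per_run)
    if remainder:
        raise ValueError("total_protons must be a whole multiple of protons_per_run")
    # Each run's phits.out is renamed to phits-#.out and moved into phitsout/.
    return [f"phitsout/phits-{run}.out" for run in range(runs)]


# Ten runs of 1E7 protons make up a 1E8 simulation (run numbers assumed 0 to 9).
gauss_1e8 = restart_chain(10**8, 10**7)
# Twenty runs of 5E7 protons make up a 1E9 simulation (run numbers 0 to 19).
gauss_1e9 = restart_chain(10**9, 5 * 10**7)

if __name__ == "__main__":
    print(len(gauss_1e8), gauss_1e8[-1])
    print(len(gauss_1e9), gauss_1e9[-1])
```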
Because PHITS Tools was under ongoing development as this dataset was being produced, the <em>GaussE</em> directories contain some extra output not present in the <em>MonoE</em> directories. Most notably, only for the <em>GaussE</em> simulations do the <em>plots</em> directories contain PNG and PDF plot files generated by PHITS Tools and the <em>*_processed_*</em> directories contain dictionary objects of the processed <em>phits*.out</em> files.
Note that, for convenience, the <em>phits*.out</em> file(s) for each simulation are also copied to all <em>*_raw_*</em> directories. The <em>phits*.out</em> file(s) contain the full PHITS input echo, among other information about the simulation. For the <em>GaussE</em> simulations, these are within a further <em>phitsout</em> subdirectory for each beam energy. Also for all <em>GaussE_*_processed_*</em> directories, the processed <em>phits*.out</em> file(s), <em>phits*_out.pickle.xz</em>, are included too.
<strong>References</strong>
<em>TO BE POPULATED</em>
<strong>Acknowledgements</strong>
The NOVO project has received funding from the European Innovation Council (EIC) under grant agreement No. 101130979. The EIC receives support from the European Union's Horizon Europe research and innovation programme. Partners from The University of Manchester have received funding from UK Research and Innovation under grant agreement No. 10102118.
Provider:
Rodare
Created:
2026-02-19



