
Top quark pair events for heavy flavour tagging and vertexing at the LHC

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10371997
Resource description:
This dataset contains jets from top anti-top decays. The event and parton shower are simulated in Pythia 8 at a centre-of-mass energy of 13 TeV, and the detector response is simulated in the Delphes framework, modelled on the ATLAS detector, with a mean pileup of 50. The dataset consists of jets, jet constituents, and truth heavy-flavour hadrons. For each jet, up to 50 charged constituents and 5 truth hadrons are included. Each constituent includes a link to its associated truth hadron, if such a link exists.

Five files are provided:

class_dict.yaml - Details the relative weights for classification labels, based on the frequency of occurrence for a given entry. Labels are detailed below.
norm_dict.yaml - Contains the means and standard deviations of variables that can be used for training, allowing for scaling.
pp_output_train.h5 - 13.5 million training jets, consisting of 4.5 million each of b-jets, c-jets, and light-flavoured jets. Resampling is applied over the jet pT and eta to ensure equivalent kinematic distributions.
pp_output_val.h5 - 1.35 million jets for validation, consisting of 450,000 of each jet flavour. Kinematics are resampled in the same way as the training file.
pp_output_test_ttbar.h5 - 1.35 million jets for evaluation, consisting of 450,000 of each jet flavour, with no kinematic resampling applied.

Each h5 file contains the following groups:

Jets (N,) - N jets, including variables such as jet kinematics, flavours, and summary statistics on the number of hadrons and constituents in the jet.
Consts (N, 50) - Up to 50 charged constituents per jet. Includes details on constituent kinematics and identification. A variable 'valid' is True for tracks in the jet, and False for all other tracks. The additional variable 'truth_hadron_idx' details which hadron (if any) a constituent is associated to.
Hadrons (N, 5) - Up to 5 truth heavy-flavour hadrons per jet. Each hadron includes details on kinematics. The variable 'hadron_idx' represents an ID for the hadron, and matches the constituent variable 'truth_hadron_idx'.

Each group contains both variables that can be used in training and truth labels. The truth labels are as follows:

Jets
flavour - Flavour ID of the jet: 5 for b-jets (containing at least 1 b-hadron within a 0.4 dR(jet, hadron) match), 4 for c-jets (no b-hadrons, and at least 1 c-hadron), 0 for light-flavoured jets (no b- or c-hadrons).

Consts
truth_hadron_idx - Integer that refers to the hadron that produced the constituent. '-1' for padded tracks, or tracks with no truth heavy-flavour hadron (e.g. pileup, hadronisation).
truth_vertex_idx - Integer that refers to the vertex a constituent came from. If two tracks have the same truth_vertex_idx, they originate from the same vertex.

Hadrons
hadron_idx - The index of the hadron: '-1' for padded hadrons, 0-4 for the remaining hadrons. If a constituent's 'truth_hadron_idx' matches this value, then the constituent came from the decay of this heavy-flavour hadron.
flavour - The flavour of the hadron: '5' if the hadron contains at least 1 b-quark, '4' if there is no b-quark but a c-quark is present, '-1' for padded hadrons.
pt, lxy, dr, mass - The transverse momentum (pt) [GeV], the transverse distance between the hadronic decay vertex and the primary vertex (lxy) [mm], dR(jet, hadron), and the hadron truth mass [GeV].

This dataset allows for studies into algorithms that aim to perform vertex reconstruction.
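
The link between 'truth_hadron_idx' in Consts and 'hadron_idx' in Hadrons can be illustrated with a minimal Python sketch. It is not part of the dataset's own tooling. It assumes that each group is stored as a NumPy structured array with the field names quoted above; the HDF5 dataset names ("consts", "hadrons") and the helper names load_first_jets and hadron_track_masks are hypothetical and may need adjusting to the actual files. The demo uses small synthetic arrays rather than real data.

```python
import numpy as np


def load_first_jets(path, n=1000, consts_name="consts", hadrons_name="hadrons"):
    """Read the first n jets' constituents and hadrons from one of the h5 files.

    The dataset names are assumptions; check the keys of the file before use.
    """
    import h5py  # imported lazily so the synthetic demo below runs without it

    with h5py.File(path, "r") as f:
        return f[consts_name][:n], f[hadrons_name][:n]


def hadron_track_masks(consts, hadrons):
    """Boolean array (N, n_hadrons, n_consts): True where a valid constituent
    is linked to a non-padded truth hadron of the same jet."""
    valid = consts["valid"].astype(bool)        # (N, n_consts)
    track_idx = consts["truth_hadron_idx"]      # (N, n_consts)
    hadron_idx = hadrons["hadron_idx"]          # (N, n_hadrons)
    real_hadron = hadron_idx >= 0               # padded hadrons carry -1
    match = hadron_idx[:, :, None] == track_idx[:, None, :]
    # Without the two extra masks, padded hadrons (-1) would "match"
    # padded or unassociated tracks (-1).
    return match & real_hadron[:, :, None] & valid[:, None, :]


# Synthetic stand-in: 2 jets, 4 constituent slots, 2 hadron slots.
consts = np.zeros((2, 4), dtype=[("valid", "?"), ("truth_hadron_idx", "i4")])
hadrons = np.zeros((2, 2), dtype=[("hadron_idx", "i4")])

consts["valid"][0] = [True, True, True, False]
consts["truth_hadron_idx"][0] = [0, 0, -1, -1]
hadrons["hadron_idx"][0] = [0, -1]

consts["valid"][1] = [True, True, True, True]
consts["truth_hadron_idx"][1] = [1, 0, 1, -1]
hadrons["hadron_idx"][1] = [0, 1]

masks = hadron_track_masks(consts, hadrons)
tracks_per_hadron = masks.sum(axis=-1)  # number of linked tracks per hadron slot
print(tracks_per_hadron)
```

On real data, the same mask could be used to select the tracks of each heavy-flavour hadron as targets for a vertexing study; 'truth_vertex_idx' could be compared pairwise between constituents in a similar way.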
Created:
2023-12-15