It is worth noting that only a small fraction of the available data has been used for all of these measurements, except for the one in the lepton + jets channel.
3.1. Inclusive Cross-Section Measurement
In ATLAS, at $\sqrt{s}$ = 13 TeV, the cross-section measurements were performed in the $e\mu$ channel within the visible phase space with at least three b jets and in the lepton + jets channel within the visible phase space with at least four b jets. To extract the number of $t\bar{t}$ + HF events, a binned maximum likelihood fit to observables discriminating between signal and background is used in both channels. A combined template is created from the sum of all backgrounds. Three templates of $t\bar{t}b$, $t\bar{t}c$ and $t\bar{t}l$ events are created from the $t\bar{t}$, $t\bar{t}$ in association with a vector boson ($t\bar{t}V$) and $t\bar{t}H$ simulations, as those samples contain the signal process. In the $e\mu$ channel, the $t\bar{t}c$ and $t\bar{t}l$ templates are merged and the fit is performed to the distribution of the third-highest b-tagging discriminant among the reconstructed jets in the event. The scale factors obtained from the fit are 1.33 ± 0.06 for the number of $t\bar{t}b$ events and 1.05 ± 0.04 for the number of combined $t\bar{t}c$ + $t\bar{t}l$ events. In the lepton + jets channel, all three templates are used in a fit to the 2D histograms of the third- and fourth-highest b-tagging discriminants. The best fit values are 1.11 ± 0.2 for the number of $t\bar{t}b$ events, 1.59 ± 0.06 for the number of $t\bar{t}c$ events and 0.962 ± 0.003 for the number of $t\bar{t}l$ events. The measured cross-section values for both channels are compatible with each other.
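The binned maximum likelihood template fit described above can be illustrated with a minimal sketch. It is not the ATLAS implementation: the template contents, the two-template setup and the simple pattern-search minimizer are all assumptions chosen for illustration, and a real analysis would use a dedicated minimizer and include systematic uncertainties.

```python
import math

def poisson_nll(scales, templates, data):
    """Binned Poisson negative log-likelihood, dropping terms constant in the scales."""
    total = 0.0
    for i, observed in enumerate(data):
        expected = sum(s * t[i] for s, t in zip(scales, templates))
        total += expected - observed * math.log(expected)
    return total

def fit_scale_factors(templates, data, start, step=0.5, tol=1e-5):
    """Minimize the NLL with a coordinate pattern search (stand-in for a real minimizer)."""
    scales = list(start)
    best = poisson_nll(scales, templates, data)
    while step > tol:
        improved = False
        for k in range(len(scales)):
            for delta in (step, -step):
                trial = list(scales)
                trial[k] += delta
                if trial[k] <= 0.0:
                    continue  # keep every scale factor positive
                value = poisson_nll(trial, templates, data)
                if value < best:
                    scales, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # refine the search once no move helps
    return scales

# Hypothetical templates in bins of a b-tagging discriminant (not real data).
signal_template = [2.0, 5.0, 20.0, 40.0]
other_template = [100.0, 60.0, 30.0, 10.0]
# Pseudo-data built with a signal scale factor of 1.3 and no statistical fluctuation.
pseudo_data = [1.3 * s + 1.0 * o for s, o in zip(signal_template, other_template)]

fitted = fit_scale_factors([signal_template, other_template], pseudo_data, start=[1.0, 1.0])
```

Because the pseudo-data are built without fluctuations, the fit should return scale factors close to the injected ones. The fitted scale factor multiplies the predicted yield, which is how the text's scale factors translate into measured event counts.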
To facilitate the comparison with the theoretical cross-section, the $t\bar{t}H$ and $t\bar{t}V$ processes are also subtracted from the measured cross-section. The measured inclusive cross-sections are shown in Figure 2. All of the inclusive cross-sections measured at $\sqrt{s}$ = 13 TeV in the visible phase space by the ATLAS experiment are summarized in Table 2. The cross-section measurement for the phase space with at least three b jets in the $e\mu$ channel has an uncertainty of 13%, making it the most precise measurement. The uncertainties are dominated by systematic uncertainties, mainly from the $t\bar{t}$ modeling and b-tagging, as well as the jet energy scale.
The ratio of the $t\bar{t}b\bar{b}$ to $t\bar{t}jj$ production cross-sections has also been measured using data collected at $\sqrt{s}$ = 8 TeV [25]. The ratio measurement is motivated by the reduction of systematic uncertainties, and the result is compared with predictions in Figure 3.
In CMS, the inclusive $t\bar{t}b\bar{b}$ cross-sections are measured in the different phase spaces of the dilepton, lepton + jets and hadronic channels using data collected at $\sqrt{s}$ = 13 TeV. In the dilepton channel, measurements at $\sqrt{s}$ = 8 TeV are also available. In this channel, the final state consists of two reconstructed leptons and at least four reconstructed b jets. The dominant Z + jets background is estimated from data using control samples enriched in Z boson events. Among the at least four b jets, the two with the highest b tagging discriminator values tend to be the b jets from the top quark decays. Therefore, the jets with the third- and fourth-largest b tagging discriminator values are considered as the additional b jets. The number of $t\bar{t}b\bar{b}$ events is extracted from the two-dimensional distribution of the discriminators of these two additional jets. Together with the ratio $\sigma_{t\bar{t}b\bar{b}}/\sigma_{t\bar{t}jj}$, the cross-sections $\sigma_{t\bar{t}b\bar{b}}$ and $\sigma_{t\bar{t}jj}$ are measured in the visible phase space. For the purpose of comparing the measurements with the theoretical prediction and with measurements in the other decay modes, the cross-sections in the full phase space are obtained by taking into account the acceptance, $\sigma_{\mathrm{full}} = \sigma_{\mathrm{visible}}/\mathcal{A}$, where $\mathcal{A}$ is the acceptance, defined as the number of events in the corresponding visible phase space divided by the number of events in the full phase space. The results for the full phase space are shown in Figure 4 (upper).
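As a small worked illustration of the extrapolation from the visible to the full phase space, the sketch below computes an acceptance from event counts and divides a visible cross-section by it. The event counts and the cross-section value are hypothetical, and the binomial uncertainty on the acceptance is an assumption about how the counts (typically taken from simulation) might be treated.

```python
import math

def acceptance(n_visible, n_full):
    """Acceptance = events in the visible phase space / events in the full phase space."""
    value = n_visible / n_full
    # Binomial uncertainty, assuming the visible events are a subset of the full sample.
    error = math.sqrt(value * (1.0 - value) / n_full)
    return value, error

def full_phase_space_xsec(sigma_visible, acc):
    """Extrapolate a visible cross-section to the full phase space."""
    return sigma_visible / acc

# Hypothetical numbers for illustration only.
acc, acc_err = acceptance(n_visible=250, n_full=10000)
sigma_full = full_phase_space_xsec(sigma_visible=0.04, acc=acc)  # same units as the input
```

The smaller the acceptance, the larger the extrapolation factor, so the full-phase-space result inherits a stronger dependence on the simulation used to compute the acceptance.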
In the lepton + jets channel, the measurement was conducted by CMS with data corresponding to an integrated luminosity of 35.9 fb$^{-1}$ at $\sqrt{s}$ = 13 TeV. In this channel, identifying the origin of the jets is challenging because the final state with at least six jets, including four b jets, leads to ambiguities in the jet assignment. Moreover, a heavy-flavor jet can also originate from the W boson decay. To address this, a kinematic reconstruction method is used to identify the additional b jets. The algorithm assigns a $\chi^2$ value to each jet permutation according to how well it satisfies certain kinematic constraints. The permutation with the lowest $\chi^2$ value is selected. Once a jet topology is selected, the additional jets in the event are arranged in decreasing order of their b tagging discriminant value. Then, similar to the dilepton channel, only the information from the two additional jets with the highest b tagging discriminant values is used to extract the $t\bar{t}b\bar{b}$ cross-section. The results for the ratio $\sigma_{t\bar{t}b\bar{b}}/\sigma_{t\bar{t}jj}$, $\sigma_{t\bar{t}b\bar{b}}$ and $\sigma_{t\bar{t}jj}$ are presented for both the visible phase space and the full phase space (see Figure 4). Recently, the measurement in the lepton + jets channel was updated with the full Run 2 data set, corresponding to an integrated luminosity of 138 fb$^{-1}$ [33]. In this analysis, the cross-sections are measured in four different visible phase spaces. The final states of each phase space are shown in Table 1. The phase spaces requiring three additional light jets are motivated by the study of additional QCD radiation in
or
events as these have been shown to be sensitive to the modeling of
production. The measured cross-sections in all phase spaces are larger than the predictions from Powheg + Pythia 8. All other predicted values in each phase space are available in Ref. [33].
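The kinematic reconstruction used in the 35.9 fb$^{-1}$ lepton + jets analysis, in which each jet permutation is scored and the one with the lowest value is kept, can be sketched as follows. The text does not spell out the constraints, so this sketch assumes a simplified $\chi^2$ built only from the hadronic W boson and top quark masses; the mass values, resolutions and jet four-vectors are hypothetical.

```python
import itertools
import math

# Assumed constraint values and resolutions in GeV (illustrative, not from the analysis).
M_W, M_TOP = 80.4, 172.5
SIGMA_W, SIGMA_TOP = 10.0, 15.0

def invariant_mass(vectors):
    """Invariant mass of a set of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def best_assignment(jets):
    """Return (chi2, W jet indices, b jet index) for the lowest-chi2 assignment."""
    best = None
    for w_pair in itertools.combinations(range(len(jets)), 2):
        w_jets = [jets[i] for i in w_pair]
        for b in range(len(jets)):
            if b in w_pair:
                continue
            chi2 = ((invariant_mass(w_jets) - M_W) / SIGMA_W) ** 2
            chi2 += ((invariant_mass(w_jets + [jets[b]]) - M_TOP) / SIGMA_TOP) ** 2
            if best is None or chi2 < best[0]:
                best = (chi2, w_pair, b)
    return best

# Toy event: jets 0 and 1 come from a W boson at rest, jet 2 completes the top quark,
# and jets 3 and 4 play the role of additional jets.
e_b = (M_TOP ** 2 - M_W ** 2) / (2.0 * M_W)
toy_jets = [
    (40.2, 40.2, 0.0, 0.0),
    (40.2, -40.2, 0.0, 0.0),
    (e_b, 0.0, e_b, 0.0),
    (50.0, 0.0, 0.0, 50.0),
    (30.0, 0.0, -30.0, 0.0),
]
chi2, w_pair, b_index = best_assignment(toy_jets)
```

The jets left unassigned by the selected permutation are the candidates for the additional jets, which are then ordered by their b tagging discriminant as described in the text.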
In the hadronic channel, the multi-jet process is the main background. To suppress the multi-jet events, a quark–gluon discriminant was used. An unsupervised learning algorithm was further used to maximize the contribution of $t\bar{t}b\bar{b}$ events. The cross-sections are measured for two definitions of the $t\bar{t}b\bar{b}$ events in the fiducial phase space. One is based exclusively on stable generated particles after hadronization (parton-independent). This definition facilitates comparisons with predictions from event generators. The other uses parton-level information after radiation emission (parton-based). This definition is closer to the approach taken by searches for $t\bar{t}H$ production to define the contribution from the $t\bar{t}b\bar{b}$ process. To address the large combinatorial ambiguity in identifying the additional jets in the events, a boosted decision tree (BDT) was used. The cross-section is also reported for the total phase space by correcting the parton-based fiducial cross-section for the experimental acceptance. The results are presented in Figure 5.
The cross-section of top quark pair production with an additional pair of c jets has been measured for the first time by CMS. This measurement is challenging as the experimental signature of a b jet is very similar to that of a c jet. Two additional jets are selected using a deep neural network classifier. To separate the $t\bar{t}c\bar{c}$, $t\bar{t}b\bar{b}$ and $t\bar{t}LL$ events, a neural network (NN) is trained using charm jet tagging information of the first and second additional jets, and kinematic variables such as the angular separation $\Delta R$ between the two additional jets, as well as the NN score for the best jet permutation. This NN predicts the probabilities for five output classes: $t\bar{t}c\bar{c}$, $t\bar{t}cL$, $t\bar{t}b\bar{b}$, $t\bar{t}bL$ and $t\bar{t}LL$. Two discriminators are derived as follows.
The $t\bar{t}c\bar{c}$, $t\bar{t}b\bar{b}$ and $t\bar{t}LL$ cross-sections are extracted from a fit to the two-dimensional distribution of these discriminators. The ratios $R_c$ and $R_b$ of, respectively, the measured $t\bar{t}c\bar{c}$ and $t\bar{t}b\bar{b}$ cross-sections to the inclusive $t\bar{t}$ + two jets cross-section were also measured. The results are compared to theoretical predictions from either the Powheg or MadGraph5_aMC@NLO generators, as shown in Figure 6.
All of the inclusive cross-sections measured in the visible phase space by the CMS experiment are summarized in Table 3 and Table 4, and those for the full phase space in Table 5 and Table 6. Figure 7 also shows the comparison between the values measured by CMS in the full phase space and various theoretical predictions.
3.2. Differential Cross-Section Measurements
In addition to the inclusive cross-section measurements, differential measurements of the $t\bar{t}$ + HF production cross-sections can provide information on perturbative QCD (pQCD) and enable searches for potential new physics. The differential cross-section measurements have been performed at $\sqrt{s}$ = 7, 8 and 13 TeV with the ATLAS experiment and at $\sqrt{s}$ = 8 and 13 TeV with the CMS experiment.
To measure the differential cross-sections, the distributions measured at the detector level need to be unfolded to the generator level, where detector effects are removed, so that the resulting cross-sections can be compared with theory predictions and with results from other experiments. At the generator level, it is not trivial to define the additional b jets in the process, as there are also b jets from the top quark decays. Moreover, a b jet could also emerge from the W boson decay. The additional b jets are expected to come from gluon splitting and can also come from the decay of the H boson or another boson.
There is a clear difference between ATLAS and CMS in the identification of the additional b jets. In the ATLAS experiment, no attempt is made at the particle level to identify the origin of the b jets using simulation information. Instead, the two b jets with the highest $p_{\mathrm{T}}$ or the smallest $\Delta R$ are selected for the differential cross-section measurement. The highest-$p_{\mathrm{T}}$ jets are considered as the b jets from the top quarks, while the b jets with the smallest $\Delta R$ are considered as the additional b jets not from the top quark decays, making use of the fact that b jets from gluon splitting tend to be collinear. In CMS, by contrast, the origin of the b jets is explicitly identified using simulation information. For example, the b hadron is traced back through its ancestors in the simulation chain. A b jet is identified as one of the two additional b jets only if it does not originate from a top quark.
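The two particle-level strategies contrasted above can be sketched in a few lines. The jet and particle records below are hypothetical dictionaries invented for illustration and do not correspond to any experiment's event format; the only physics inputs are the usual definition of the angular distance and the PDG identifier 6 for the top quark.

```python
import itertools
import math

TOP_PDG_ID = 6  # PDG identifier of the top quark (sign gives particle or antiparticle)

def delta_r(a, b):
    """Angular distance in the (eta, phi) plane, with phi wrapped into [-pi, pi]."""
    deta = a["eta"] - b["eta"]
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(deta, dphi)

def leading_pair(bjets):
    """Kinematic choice: the two b jets with the highest transverse momentum."""
    return sorted(bjets, key=lambda j: j["pt"], reverse=True)[:2]

def closest_pair(bjets):
    """Kinematic choice: the pair of b jets with the smallest angular distance."""
    return min(itertools.combinations(bjets, 2), key=lambda pair: delta_r(*pair))

def has_top_ancestor(index, particles):
    """Simulation-based choice: walk up a (hypothetical) single-mother chain."""
    mother = particles[index]["mother"]
    while mother is not None:
        if abs(particles[mother]["pdg_id"]) == TOP_PDG_ID:
            return True
        mother = particles[mother]["mother"]
    return False

# Toy b jets; the last two are close in angle across the phi = +/-pi boundary.
toy_bjets = [
    {"name": "A", "pt": 120.0, "eta": 0.1, "phi": 0.0},
    {"name": "B", "pt": 90.0, "eta": -1.0, "phi": 2.5},
    {"name": "C", "pt": 45.0, "eta": 1.2, "phi": -3.0},
    {"name": "D", "pt": 30.0, "eta": 1.4, "phi": 3.1},
]
# Toy ancestry: index 2 is a b hadron from a top quark, index 4 one from a gluon.
toy_particles = [
    {"pdg_id": 6, "mother": None},
    {"pdg_id": 5, "mother": 0},
    {"pdg_id": 511, "mother": 1},
    {"pdg_id": 21, "mother": None},
    {"pdg_id": 521, "mother": 3},
]
```

For the toy input, `leading_pair` picks jets A and B while `closest_pair` picks C and D, so the two kinematic definitions can select different jets in the same event. The ancestry walk needs generator information and is therefore only available in simulation.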
For the ATLAS measurements, the unfolded results are presented as normalized differential cross-sections in the visible phase space as a function of the b jet multiplicity, global event properties and various kinematic variables. The measurements are conducted in the $e\mu$ channel with at least three reconstructed b jets and in the lepton + jets channel with at least four b jets. The sample with at least four b jets in the lepton + jets channel has high signal purity, resulting in a measurement with a smaller dependence on the simulation. The $e\mu$ channel benefits from a sample containing at least three b jets that is an order of magnitude larger.
Once the reconstruction-level distributions of $t\bar{t}$ + HF events are extracted, they are unfolded to the particle level. The detector resolution effects and inefficiencies are corrected by inverting the migration matrix, which is optimized to be as diagonal as possible. An iterative Bayesian unfolding technique [49] implemented in the RooUnfold software package [50] is used in this process. Detector efficiencies and acceptance are then corrected using a bin-by-bin method.
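The idea behind iterative Bayesian unfolding can be shown with a minimal pure-Python sketch. It does not use RooUnfold and is not the ATLAS configuration: the response matrix, the flat starting prior and the number of iterations are assumptions, and the efficiency is folded into the update here rather than applied as a separate bin-by-bin correction.

```python
def bayesian_unfold(response, data, prior, iterations=4):
    """Iterative Bayesian unfolding.

    response[i][j] is the probability that an event from truth bin j
    is reconstructed in bin i; a column sum below 1 means inefficiency.
    """
    n_reco, n_truth = len(response), len(prior)
    efficiency = [sum(response[i][j] for i in range(n_reco)) for j in range(n_truth)]
    truth = list(prior)
    for _ in range(iterations):
        # Fold the current truth estimate through the response matrix.
        folded = [sum(response[i][j] * truth[j] for j in range(n_truth)) for i in range(n_reco)]
        # Bayes update: share each observed bin among truth bins by posterior weight.
        truth = [
            truth[j] / efficiency[j]
            * sum(response[i][j] * data[i] / folded[i] for i in range(n_reco) if folded[i] > 0.0)
            for j in range(n_truth)
        ]
    return truth

# Hypothetical, nearly diagonal migration matrix with unit efficiency.
toy_response = [
    [0.8, 0.1, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.1, 0.8],
]
toy_truth = [100.0, 200.0, 50.0]
toy_data = [sum(toy_response[i][j] * toy_truth[j] for j in range(3)) for i in range(3)]

few_iterations = bayesian_unfold(toy_response, toy_data, prior=[1.0, 1.0, 1.0])
many_iterations = bayesian_unfold(toy_response, toy_data, prior=[1.0, 1.0, 1.0], iterations=100)
```

With fluctuation-free pseudo-data, many iterations recover the input truth, while stopping after a few iterations keeps some dependence on the prior. In a real measurement the small iteration count acts as regularization against amplified statistical fluctuations.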
Figure 8 shows the normalized cross-section as a function of the b jet multiplicity compared with predictions from various generator set-ups. The first three panels show the ratios of various predictions to data. The last panel shows the ratio of the predictions of normalized differential cross-sections from MadGraph5_aMC@NLO + Pythia 8 with and without the contributions from the $t\bar{t}H$ and $t\bar{t}V$ processes. All predictions relying on the parton shower to generate jets at high multiplicities are lower than the measurements. This suggests that the b jet production by the parton shower is not optimal in these processes. The comparison of the predictions from various generators with the measurements is made after subtracting the simulation-estimated contributions of $t\bar{t}H$ and $t\bar{t}V$ production from the data. The impact of including these processes in the prediction increases with b jet multiplicity, resulting in a change of about 10% relative to the QCD $t\bar{t}$ prediction alone in the inclusive four b jet bin. The measurement in the $e\mu$ channel with at least three b jets tends to be more precise than that in the lepton + jets channel with at least four b jets.
It is also important to examine the distributions of the $p_{\mathrm{T}}$, the mass and the angular distance $\Delta R$ of the two b jets, where the $bb$ system is built either from the two highest-$p_{\mathrm{T}}$ b jets or from the two b jets closest in $\Delta R$. The measured distributions of those three variables in the lepton + jets channel are shown in Figure 9, Figure 10 and Figure 11. The differential cross-section as a function of the $p_{\mathrm{T}}$ of the $bb$ system is measured with a precision of 10–15% over the full range in the $e\mu$ channel and with an uncertainty of 20–25% in the lepton + jets channel. In general, the differential distributions are well described by the different theoretical predictions, whose spread is significantly smaller than the experimental uncertainty. All other distributions such as
or
of additional b jets are available in Ref. [26].
In CMS, the differential cross-sections are measured in the visible phase space as a function of various kinematic properties, such as the $p_{\mathrm{T}}$ and $|\eta|$ of the leading and subleading additional b jets, the angular distance between them and the invariant mass of the two additional b jets. In particular, the differential cross-sections as a function of $\Delta R_{bb}$ and $m_{bb}$ are of interest, as two additional b jets from a gluon tend to be produced collinearly and those from the H boson produce a resonance peak at 125 GeV.
At the reconstruction level, it is very challenging to identify two additional b jets because there are four b jets from top quarks and a gluon splitting. To select the additional b jets, the multivariate approach of a BDT was used to maximize the correct assignment of additional b jets. The input variables to the BDT combine information from the two final-state leptons, the jets and . A total of twelve variables, e.g., the sum and difference of the invariant mass of the and system, ; the absolute difference in the azimuthal angle between them, ; the of the and system, and and the difference between the invariant mass of the two b jets and two leptons and the invariant mass of the pair, , are used as input variables. The variables insensitive to the additional radiation are selected to avoid any dependence on the kinematics of the additional jets. The jets from the system are identified as the pair with the highest BDT discriminant. From the remaining jets, those b-tagged jets with the highest are selected as being the leading additional ones. With this method, the correct assignment rate for the additional b jets in events is around 40%.
A template fit to the b-tagged jet multiplicity distribution is performed to improve the agreement between data and simulation. For the differential cross-section measurements, effects from detector efficiency and resolution are corrected using a regularized inversion of the response matrix, which is calculated from simulated events. The measured differential cross-sections as a function of the leading and subleading additional b jet $p_{\mathrm{T}}$, and of the $\Delta R$ and invariant mass of the two additional b jets, are shown in Figure 12 for CMS. The measured cross-sections are compared with various theoretical predictions. The shapes of the distributions are well described by the predictions. However, the values measured by CMS have larger uncertainties than those of ATLAS because a smaller data sample was used.
In CMS, the differential cross-sections are also measured with the full Run 2 data set in the lepton + jets channel. In this analysis, two approaches are used to identify the additional b jets from gluon splitting. In the first, the two b jets with the smallest angular separation are selected, which reduces the systematic uncertainty associated with the theory dependence. In the second, a multivariate algorithm based on a deep neural network (DNN), trained using MC information, is used to identify the additional b jets not from top quarks.
To find the correct pair of b jets not from top quarks, only four b jets in the highest
order are used as candidate jets, which results in the six possible candidate jet combinations. The DNN makes use of two sets of input variables, targeting jet-specific input information and global event information separately. For jet-specific input information, the input variables consist of the
,
, a flag indicating whether the jet passes the tight b tagging working point, the angular separation ($\Delta R$) with the charged lepton and the invariant mass with the charged lepton. These input variables are fed through five convolutional neural network (CNN) layers [51] followed by a long short-term memory (LSTM) cell [52]. For the global event information, the input variables consist of the scalar $p_{\mathrm{T}}$ sum of the four candidate b jets, the
,
,
of the charged lepton, the
,
and invariant mass of the dijet combinations, the
$\Delta R$ of the dijet combinations and the charged lepton, as well as the jet and b-tagged jet multiplicities. These input variables are connected to three dense network layers with 50 nodes each. The two sequences are then concatenated into one dense layer with 10 nodes, which is connected to an output layer consisting of six nodes, each representing one of the six possible candidate jet combinations. The pair of b-tagged jets with the highest DNN output value in each event is taken as the additional b jet pair and used for the differential cross-section measurement.
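The correspondence between the four candidate b jets and the six DNN output nodes follows from simple combinatorics, since four jets form $\binom{4}{2} = 6$ distinct pairs. The sketch below enumerates the pairs and selects the one with the highest output value. The ordering of the pairs and the output scores are hypothetical, and the network itself is not reproduced.

```python
import itertools

def candidate_pairs(n_candidates=4):
    """All unordered pairs of candidate jet indices: 6 pairs for 4 candidates."""
    return list(itertools.combinations(range(n_candidates), 2))

def select_additional_pair(scores, n_candidates=4):
    """Return the jet index pair whose output node has the highest score."""
    pairs = candidate_pairs(n_candidates)
    if len(scores) != len(pairs):
        raise ValueError("expected one score per candidate pair")
    best = max(range(len(pairs)), key=lambda k: scores[k])
    return pairs[best]

# Hypothetical DNN outputs for one event, one value per candidate pair.
toy_scores = [0.05, 0.10, 0.50, 0.20, 0.10, 0.05]
chosen = select_additional_pair(toy_scores)
```

Here the third output node is the largest, so the first and fourth candidate jets are taken as the additional b jet pair.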
The correct-assignment rate for the additional b jets with the DNN is about 49%, a significant improvement over choosing the two b jets closest in $\Delta R$, which yields only about 41%. The measured differential cross-sections as a function of the leading and subleading additional b jet $p_{\mathrm{T}}$, and of the $\Delta R$ and invariant mass of the two additional b jets selected by the DNN, are shown in Figure 13. The distributions are not well described by Powheg + Herwig 7 (referred to as Powheg + H7 in Figure 13). More differential variables are available in Ref. [33].