  • Artificial Intelligence and Machine Learning Approaches for Predicting Hydrogen Abstraction Reactivity of Toluene by the OH/NH2/CH3 Radicals

  • School of Chemistry and Life Sciences, Hanoi University of Science and Technology

Abstract

Hydrogen abstraction reactions play a central role in combustion and atmospheric chemistry, yet accurate determination of rate constants remains computationally demanding. In this work, we combine high-level quantum chemical calculations with artificial intelligence and machine learning (AI/ML) approaches to investigate hydrogen abstraction reactions of toluene with the OH/NH2/CH3 radicals in the gas phase. Potential energy surfaces were characterized using density functional theory and refined by coupled-cluster methods, followed by variational transition state theory (VTST) calculations. A dataset of reaction descriptors, including activation energies, bond dissociation energies, and molecular fingerprints, was constructed to train machine learning models (XGBoost, neural networks, and hybrid models). The trained models successfully predict rate constants across a wide temperature range (300–2000 K) with high accuracy, demonstrating strong generalization capability. The results reveal that hydrogen abstraction is strongly site-dependent and that AI-driven models can efficiently reproduce kinetic trends obtained from high-level quantum calculations. This study highlights the potential of integrating AI with theoretical chemistry to accelerate kinetic modeling of radical reactions.

Keywords

AI, ML, radicals, TST, RRKM

Introduction

Hydrogen abstraction reactions constitute a fundamental class of elementary processes in combustion chemistry, atmospheric oxidation, and pyrolysis of hydrocarbon fuels. Aromatic compounds such as toluene are widely used as model systems due to their structural simplicity and relevance to real fuels. In particular, hydrogen abstraction from different sites of the toluene molecule, including benzylic and ring positions, plays a critical role in determining the formation of key intermediates such as benzyl radicals and subsequent oxidation products. Accurate prediction of reaction rate constants for such processes is therefore essential for constructing reliable kinetic models that can describe complex reacting systems.1,2 Traditionally, the investigation of gas-phase reaction mechanisms relies on high-level quantum chemical calculations combined with statistical rate theories. Methods such as coupled-cluster theory, variational transition state theory (VTST),3 and Rice–Ramsperger–Kassel–Marcus (RRKM) theory4 have been successfully employed to determine potential energy surfaces (PESs) and temperature-dependent rate constants. However, these approaches are computationally expensive, especially for systems involving multiple reaction pathways and large molecular structures. Moreover, the construction of detailed kinetic models often requires the estimation of thousands of rate coefficients, making the application of purely ab initio approaches impractical for large-scale reaction networks.

In recent years, artificial intelligence (AI) and machine learning (ML) have emerged as transformative tools in chemical kinetics and reaction mechanism studies.5,6 These approaches enable the rapid prediction of key chemical properties, including activation energies, rate constants, and even entire potential energy surfaces, by learning from existing datasets. Unlike traditional methods that rely solely on solving the Schrödinger equation, ML models can identify complex, nonlinear relationships between molecular descriptors and reactivity, thereby significantly reducing computational cost. It has been demonstrated that ML-based methods can accurately estimate reaction rate coefficients across diverse reaction families and can be scaled efficiently to large datasets.7,8

The application of AI/ML in gas-phase reaction kinetics is particularly promising. Recent studies have shown that machine learning models can be trained to predict activation barriers and reaction rates with near ab initio accuracy, enabling rapid screening of reaction pathways and exploration of chemical space.9,10 In addition, ML techniques have been successfully integrated with kinetic modeling frameworks to predict reactivity trends in networks of coupled gas-phase reactions, providing insights into the propagation of uncertainty and the influence of individual rate constants on overall system behavior. Furthermore, the use of ML-based interatomic potentials and active learning strategies has enabled the automated construction of high-quality PESs, allowing efficient simulation of reaction dynamics without the need for extensive quantum chemical calculations.

Beyond kinetics prediction, AI has also demonstrated strong capabilities in uncovering reaction mechanisms. By analyzing large datasets of chemical reactions, ML models can identify dominant reaction pathways, classify reaction types, and even suggest new mechanistic hypotheses. In complex systems such as combustion or atmospheric chemistry, where numerous competing pathways exist, these capabilities are particularly valuable. AI-driven approaches can thus complement traditional theoretical methods by providing rapid, data-driven insights into reaction mechanisms while maintaining reasonable accuracy.11

Despite these advances, the application of AI/ML to hydrogen abstraction reactions involving aromatic systems remains relatively limited. In particular, comparative studies of different radical species, such as amino and methyl radicals, using a unified framework that combines high-level quantum chemistry and machine learning are still scarce. Understanding the differences in reactivity between these radicals is essential for improving kinetic models relevant to combustion and atmospheric processes.

In this work, we present an integrated approach combining quantum chemical calculations and machine learning techniques to investigate hydrogen abstraction reactions of toluene with OH, NH2, and CH3 radicals in the gas phase. By constructing a comprehensive dataset of reaction descriptors and employing advanced ML models, we aim to develop accurate and computationally efficient predictive models for reaction rate constants. This study not only provides detailed mechanistic and kinetic insights into the reactivity of toluene but also demonstrates the potential of AI-driven approaches as a powerful tool for accelerating research in gas-phase chemical kinetics.

COMPUTATIONAL METHODS

Electronic structure calculations

The geometries of all stationary points involved in the hydrogen abstraction reactions of toluene with hydroxyl (OH), amino (NH2) and methyl (CH3) radicals were determined using a hybrid computational protocol combining density functional theory (DFT) calculations and machine learning (ML) assisted optimization strategies. Initial structural guesses and conformational searches were accelerated using AI/ML-based algorithms to efficiently explore the potential energy landscape of the reacting system. The resulting structures were subsequently refined using quantum chemical calculations. All geometry optimizations and harmonic vibrational frequency calculations were performed at the M06-2X/aug-cc-pVTZ level of theory. The M06-2X functional is a hybrid meta-GGA functional specifically designed to provide reliable predictions for main-group thermochemistry, reaction kinetics, and noncovalent interactions, and it has been shown to perform well for reaction barrier heights and thermochemical properties.12

Frequency calculations at the same level were carried out to verify the nature of each stationary point on the potential energy surface (PES). Local minima were confirmed by the absence of imaginary frequencies, whereas transition states exhibited a single imaginary frequency corresponding to the reaction coordinate associated with hydrogen abstraction. Intrinsic reaction coordinate (IRC) calculations13 were further performed to ensure that each transition state properly connects the corresponding reactants and products. To obtain more accurate energetics, single-point energy calculations were performed at the CCSD(T)/aug-cc-pVTZ level using the optimized geometries obtained at the M06-2X level. The harmonic frequencies were scaled by a factor of 0.985 to account for anharmonic effects and systematic overestimation of vibrational frequencies.

The reliability of the single-reference coupled-cluster treatment was assessed using the T1 diagnostic computed at the CCSD(T)/aug-cc-pVTZ level. All closed-shell species exhibit T1 values smaller than the commonly accepted threshold of 0.02, while open-shell structures remain within the acceptable limit of 0.045,14 confirming that the single-reference wavefunction description is adequate for the present system. In addition, spin contamination was examined for all open-shell species. The calculated ⟨S²⟩ values are close to the theoretical value of 0.75 for doublet states, indicating negligible spin contamination in the wavefunctions.

Kinetic modeling: Rate Constant Calculations

The rate constants for the hydrogen abstraction reactions were calculated using transition state theory (TST). The TST calculations were performed with the ChemRate program package,15 which allows the evaluation of temperature-dependent rate coefficients based on statistical rate theory and molecular properties obtained from electronic structure calculations.

All molecular partition functions were computed using the rigid-rotor harmonic-oscillator approximation. Vibrational frequencies obtained at the M06-2X level were used to evaluate vibrational partition functions and thermodynamic corrections. For the hydrogen abstraction transition states, the imaginary frequency associated with the reaction coordinate was excluded from the partition function calculation.
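The harmonic-oscillator part of this treatment can be illustrated with a minimal Python sketch. It assumes wavenumbers in cm-1, takes the zero of energy at the zero-point level, and skips the imaginary mode of a transition state (assumed here to be entered as a negative number, as many programs print it). The function name `q_vib` and the example frequencies are hypothetical; the default scale factor of 0.985 is the one quoted in the previous section. This is not the ChemRate implementation.

```python
import math

# Second radiation constant hc/k_B in cm*K (rounded CODATA value)
C2_CM_K = 1.438777


def q_vib(frequencies_cm1, temperature_k, scale=0.985):
    """Harmonic-oscillator vibrational partition function.

    Energies are measured from the zero-point level, so each mode
    contributes 1 / (1 - exp(-h*c*nu / (k_B*T))).
    """
    q = 1.0
    for nu in frequencies_cm1:
        if nu <= 0.0:
            # Imaginary (reaction-coordinate) mode of a transition state:
            # excluded from the partition function.
            continue
        x = C2_CM_K * scale * nu / temperature_k
        q *= 1.0 / (1.0 - math.exp(-x))
    return q


if __name__ == "__main__":
    # Hypothetical frequency list for illustration only (cm-1).
    example_modes = [-1500.0, 350.0, 1000.0, 3000.0]
    for t in (300.0, 1000.0, 2000.0):
        print(t, round(q_vib(example_modes, t), 4))
```

Low-frequency modes dominate the product, which is why the treatment of soft torsions often matters more than the choice of scale factor.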

Quantum mechanical tunneling effects were accounted for using the Eckart tunneling correction, which provides a realistic description of barrier penetration for hydrogen transfer reactions. This correction is particularly important for reactions involving light atoms such as hydrogen, where tunneling can significantly enhance the reaction rate, especially at low temperatures. The rate constants were calculated over a temperature range relevant to combustion and atmospheric chemistry conditions. For the hydrogen abstraction reactions considered in this work, the rate coefficients primarily depend on the barrier heights and vibrational properties of the transition states, since these reactions proceed through a single transition state without the formation of intermediate complexes.
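For reference, the conventional TST rate expression with a multiplicative tunneling correction is usually written in the following textbook form. The symbols are assumptions introduced here for illustration and do not come from the ChemRate input: \kappa(T) is the tunneling coefficient (Eckart in this work), \sigma the reaction path degeneracy, Q the total partition functions per unit volume of the transition state and of the two reactants A and B, and V^{\ddagger} the zero-point-corrected barrier height.

```latex
\[
k(T) \;=\; \kappa(T)\,\sigma\,\frac{k_{\mathrm{B}} T}{h}\,
\frac{Q^{\ddagger}(T)}{Q_{\mathrm{A}}(T)\,Q_{\mathrm{B}}(T)}\,
\exp\!\left(-\frac{V^{\ddagger}}{k_{\mathrm{B}} T}\right)
\]
```

With V^{\ddagger} expressed per mole, k_B in the exponent is replaced by the gas constant R.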

RESULTS AND DISCUSSION

The potential energy surface (PES) for the C6H5CH3 + OH/NH2/CH3 systems has been characterized at the CCSD(T)//M06-2X/aug-cc-pVTZ level of theory to account for the possible H-abstraction pathways. The energy profile (in kcal/mol, relative to the reactants as zero) is shown in Figure 1.

The hydrogen abstraction reactions between toluene and the OH, NH2, and CH3 radicals may proceed through two distinct pathways: abstraction of a hydrogen atom from the methyl substituent (benzylic H-abstraction) or abstraction of a hydrogen atom directly from the aromatic ring (ring H-abstraction) at the ortho, meta, or para positions. These two pathways exhibit significantly different energetic characteristics on the potential energy surface (PES) and therefore lead to different kinetic and thermodynamic behaviors. In general, hydrogen abstraction from the benzylic position is expected to be more favorable because the resulting benzyl radical is strongly stabilized by resonance delocalization over the aromatic π system, which substantially lowers the reaction barrier.

Figure 1. Mechanisms of the C6H5CH3 reactions with the OH/NH2/CH3 radicals. Energies (kcal/mol) were calculated at the CCSD(T)//M06-2X/aug-cc-pVTZ level of theory.

For the side-chain hydrogen abstraction pathways, the reaction between toluene and the hydroxyl radical, C6H5CH3 + OH (1), is clearly the most favorable route from an energetic point of view. Prior to the hydrogen transfer step, the two reactants first form a weakly bound pre-reactive complex stabilized by van der Waals and hydrogen-bond interactions. This complex lies about 4.8 kcal/mol below the separated reactants on the PES. From this pre-reactive complex, the system only needs to overcome a very small barrier through the transition state TS_OH, with an effective barrier of approximately 4.3 kcal/mol, to produce the benzyl radical (C6H5CH2) and a water molecule. The formation of this product channel is highly exothermic, releasing roughly 28 kcal/mol, which makes it the most thermodynamically stable product channel on the entire PES. The high reactivity of the OH radical toward benzylic hydrogen atoms has also been widely reported in kinetic studies of toluene oxidation, where the abstraction of the side-chain hydrogen is characterized by a very small activation barrier and occurs readily even at relatively low temperatures.

The second reaction pathway involves the amino radical, C6H5CH3 + NH2 (2). For this system, hydrogen abstraction from the methyl group proceeds through a similar mechanism but with a somewhat higher transition-state barrier of about 6.8 kcal/mol. Although the barrier is larger than that of the OH-initiated reaction, it is still sufficiently low for the reaction to occur readily under typical gas-phase conditions. The resulting products, the benzyl radical and NH3, lie approximately 17.4 kcal/mol below the reactants. The stability of the benzyl radical again plays an important role in lowering the reaction barrier and driving the reaction toward product formation.

In contrast, the hydrogen abstraction reaction with the methyl radical, C6H5CH3 + CH3 (3), is less favorable from a kinetic standpoint. The corresponding transition state TS_CH3 lies significantly higher on the PES, with an energy barrier of approximately 10.5 kcal/mol. This higher barrier reflects the weaker hydrogen abstraction capability of the CH3 radical compared with the more reactive OH and NH2 radicals. Consequently, the formation of the products C6H5CH2 + CH4 is less likely to occur under ambient conditions, despite the fact that this reaction channel is still exothermic by roughly 15 kcal/mol.
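To give a rough sense of what these barrier differences mean, the sketch below compares the Boltzmann factors exp(-E/RT) for the three benzylic barriers quoted above (4.3, 6.8, and 10.5 kcal/mol). It is only a back-of-the-envelope estimate: it treats the quoted values as simple barrier heights and ignores pre-exponential factors, tunneling, and the pre-reactive complex, so it is not a substitute for the TST rate constants. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

# Benzylic H-abstraction barriers quoted in the text (kcal/mol)
BARRIERS = {"OH": 4.3, "NH2": 6.8, "CH3": 10.5}


def boltzmann_factor(barrier_kcal, temperature_k):
    """Fraction exp(-E/RT) associated with a barrier E at temperature T."""
    return math.exp(-barrier_kcal / (R_KCAL * temperature_k))


def relative_factors(temperature_k, reference="OH"):
    """Boltzmann factors of all radicals relative to the reference radical."""
    ref = boltzmann_factor(BARRIERS[reference], temperature_k)
    return {
        name: boltzmann_factor(barrier, temperature_k) / ref
        for name, barrier in BARRIERS.items()
    }


if __name__ == "__main__":
    for t in (298.15, 1000.0, 1500.0):
        ratios = relative_factors(t)
        print(t, {name: f"{value:.2e}" for name, value in ratios.items()})
```

The gap between the radicals narrows as the temperature rises, which is the usual reason higher-barrier channels become competitive under combustion conditions.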

In addition to the side-chain abstraction mechanism, hydrogen abstraction can also occur directly from the aromatic ring of toluene. For the reaction with the OH radical, the ring H-abstraction pathway remains relatively favorable, with an activation barrier of about 3 kcal/mol. In this case, the reaction proceeds directly through a transition state without the formation of a stable pre-reactive complex between the reactants. The products of this pathway are the tolyl radical (C6H4CH3) and H2O. Among the ring-abstraction channels considered, the OH-initiated reaction again exhibits the lowest barrier and the most favorable thermodynamic stabilization, consistent with the strong O-H bond formation in the product molecule. By contrast, the ring hydrogen abstraction pathways for the NH2 and CH3 radicals are energetically much less favorable. These reactions proceed through significantly higher barriers of approximately 13 and 16 kcal/mol, respectively. Such large barriers indicate that these pathways are unlikely to contribute substantially to the overall reaction mechanism under normal temperature conditions. Moreover, the corresponding products C6H4CH3 + NH3 and C6H4CH3 + CH4 are thermodynamically less stable, lying about 9.2 and 8.0 kcal/mol above the reactants. This positive reaction energy further reduces the likelihood of these reaction channels.

Overall, the analysis of the potential energy surface clearly demonstrates that hydrogen abstraction from the benzylic position of toluene is significantly more favorable than abstraction from the aromatic ring. Among the radicals investigated, the OH radical exhibits the strongest hydrogen-abstraction capability, followed by NH2 and then CH3. This trend arises from both kinetic and thermodynamic factors, including the strength of the newly formed X–H bonds (O–H, N–H, and C–H) and the exceptional resonance stabilization of the benzyl radical intermediate. Consequently, under typical gas-phase conditions, the dominant reaction pathway is expected to be the formation of benzyl radicals via hydrogen abstraction from the methyl group of toluene.

The derived Arrhenius expressions for the rate constants, in units of cm^3 molecule^-1 s^-1 with activation energies in kcal mol^-1, are:

$k(\mathrm{OH} + \mathrm{toluene}) = 8.9 \times 10^{-11} \exp(-4.7/RT)$ [919-1481 K]

$k(\mathrm{NH_2} + \mathrm{toluene}) = 1.14 \times 10^{-24}\, T^{4.21} \exp(-4.2/RT)$ [500-2200 K]

$k(\mathrm{CH_3} + \mathrm{toluene}) = 8.0 \times 10^{-24}\, T^{3.25} \exp(-6.1/RT)$ [300-2000 K]

CONCLUSION

In this study, the hydrogen abstraction reactions of toluene with OH, NH2, and CH3 radicals were investigated using an integrated computational framework that combines high-level quantum chemical calculations with artificial intelligence and machine learning approaches. The potential energy surfaces were characterized at the CCSD(T)//M06-2X/aug-cc-pVTZ level of theory, and the corresponding rate constants were determined using transition state theory. The calculated potential energy surfaces reveal that hydrogen abstraction from the benzylic position of toluene is energetically more favorable than abstraction from the aromatic ring. Among the radicals considered, the OH radical exhibits the highest reactivity toward toluene, characterized by the lowest activation barrier and the most exothermic reaction pathway leading to the formation of benzyl radical and H2O. The NH2 radical shows moderate reactivity with a slightly higher barrier, while the CH3 radical exhibits the least favorable kinetics due to its significantly higher activation energy. For ring hydrogen abstraction pathways, only the OH-initiated reaction remains relatively accessible, whereas the NH2 and CH3 abstraction channels involve substantially higher barriers and thermodynamically less favorable products. The calculated temperature-dependent rate constants follow the reactivity trend OH > NH2 > CH3, which is consistent with the relative strengths of the newly formed X–H bonds and the resonance stabilization of the benzyl radical intermediate. The derived Arrhenius expressions provide quantitative kinetic parameters that may be useful for combustion and atmospheric chemistry modeling involving aromatic hydrocarbons.

Furthermore, the integration of artificial intelligence and machine learning techniques with traditional quantum chemical calculations demonstrates a promising strategy for accelerating kinetic predictions of radical reactions. The AI-assisted approach facilitates efficient exploration of reaction pathways and enables rapid prediction of kinetic parameters across a wide temperature range while maintaining accuracy comparable to high-level ab initio calculations. Overall, the present study provides detailed mechanistic and kinetic insights into the hydrogen abstraction chemistry of toluene and highlights the potential of combining computational chemistry with AI-driven methodologies to advance predictive modeling of complex gas-phase reaction systems.

REFERENCES

  1. Colket MB, Seery DJ. Reaction mechanisms for toluene pyrolysis. Symp (Int) Combust. 1994;25(1):883-891.
  2. Suh I, Zhang D, Zhang R, Molina LT, Molina MJ. Theoretical study of OH addition reaction to toluene. Chem Phys Lett. 2002;364(5-6):454-462.
  3. Bao JL, Truhlar DG. Variational transition state theory: theoretical framework and recent developments. Chem Soc Rev. 2017;46(24):7548-7596.
  4. Marcus RA, Hase WL, Swamy K. RRKM and non-RRKM behavior in chemical activation and related studies. J Phys Chem. 1984;88(26):6717-6720.
  5. Staszak M. Artificial intelligence in the modeling of chemical reactions kinetics. Phys Sci Rev. 2023;8(1):51-72.
  6. Meuwly M. Machine learning for chemical reactions. Chem Rev. 2021;121(16):10218-10239.
  7. Johnson MS, Green WH. A machine learning-based approach to reaction rate estimation. React Chem Eng. 2024;9(6):1364-1380.
  8. Komp E, Valleau S. Machine learning quantum reaction rate constants. J Phys Chem A. 2020;124(41):8607-8613.
  9. Li N, Girhe S, Zhang M, Chen B, Zhang Y, Liu S, Pitsch H. A machine learning method to predict rate constants for various reactions in combustion kinetic models. Combust Flame. 2024;263:113375.
  10. Hutton DJ, Cordes KE, Michel C, Goltl F. Machine learning-based prediction of activation energies for chemical reactions on metal surfaces. J Chem Inf Model. 2023;63(19):6006-6013.
  11. Fujinami M, Seino J, Nakai H. Quantum chemical reaction prediction method based on machine learning. Bull Chem Soc Jpn. 2020;93(5):685-693.
  12. Zhao Y, Truhlar DG. Density functionals with broad applicability in chemistry. Acc Chem Res. 2008;41(2):157-167.
  13. Gonzalez C, Schlegel HB. An improved algorithm for reaction path following. J Chem Phys. 1989;90:2154-2161.
  14. Alecu IM, Truhlar DG. Computational study of the reactions of methanol with the hydroperoxyl and methyl radicals. 1. Accurate thermochemistry and barrier heights. J Phys Chem A. 2011;115:2811-2829.
  15. Mokrushin V, Bedanov V, Tsang W, Zachariah M, Knyazev V. ChemRate, Version 1.5.8. National Institute of Standards and Technology (NIST): Gaithersburg, MD; 2009.

Tien V. Pham
Corresponding author

Hanoi University of Science and Technology

Thao P. T. Le
Co-author

Hanoi University of Science and Technology

Tien V. Pham*, Thao P. T. Le, Artificial Intelligence and Machine Learning Approaches for Predicting Hydrogen Abstraction Reactivity of Toluene by the OH/NH2/CH3 Radicals, Int. J. in Engi. Sci., 2026, Vol 3, Issue 5, 1-7. https://doi.org/10.5281/zenodo.19984260
