Automatic Feature Selection for Atom-Centered Neural Network Potentials Using a Gradient Boosting Decision Algorithm

Bibliographic Details
Published in: Journal of Chemical Theory and Computation, Vol. 20, No. 23, pp. 10564-10573
Main Authors: Li, Renzhe; Wang, Jiaqi; Singh, Akksay; Li, Bai; Song, Zichen; Zhou, Chuan; Li, Lei
Format: Journal Article
Language: English
Published: United States: American Chemical Society, 10.12.2024
ISSN: 1549-9618
eISSN: 1549-9626
DOI: 10.1021/acs.jctc.4c01176

Summary: Atom-centered neural network (ANN) potentials have shown high accuracy and computational efficiency in modeling atomic systems. A crucial step in developing reliable ANN potentials is the proper selection of atom-centered symmetry functions (ACSFs), also known as atomic features, which describe atomic environments; an inappropriate selection of ACSFs can lead to poor-quality ANN potentials. Here, we propose a gradient boosting decision tree (GBDT)-based framework for the automatic selection of optimal ACSFs. The framework takes uniformly distributed sets of ACSFs as input and evaluates their relative importance; the ACSFs with high average importance scores are selected and used to train an ANN potential. We applied this method to the Ge system, obtaining an ANN potential with root-mean-square errors (RMSEs) of 10.2 meV/atom for energy and 84.8 meV/Å for force predictions while using only 18 ACSFs, striking a balance between accuracy and computational efficiency. The framework is validated against a grid search, which demonstrates that the ACSFs selected by our framework lie in the optimal region. We also compared our method with commonly used feature selection algorithms; the results show that our algorithm outperforms the others in both effectiveness and accuracy. This study highlights the strong effect of ACSF parameters on ANN performance and presents a promising method for automatic ACSF selection, facilitating the development of machine learning potentials.
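The selection step the abstract describes (rank a pool of candidate ACSFs by GBDT importance, then keep the top scorers for the ANN) can be sketched as follows. This is a minimal illustration using scikit-learn's GradientBoostingRegressor, not the authors' implementation: the data are synthetic stand-ins for per-atom feature vectors and energies, not the paper's Ge dataset, and the candidate count (40) and selection size (18, matching the paper's final feature count) are chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a training set: each row is one atom's vector of
# candidate ACSF values; the target mimics a per-atom energy label.
n_atoms, n_candidates = 500, 40
X = rng.normal(size=(n_atoms, n_candidates))
# Only a few candidate features actually drive the synthetic "energy".
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.5 * X[:, 12] + 0.05 * rng.normal(size=n_atoms)

# Fit a GBDT regressor and read off its impurity-based feature importances.
gbdt = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
gbdt.fit(X, y)
importances = gbdt.feature_importances_

# Keep the k candidates with the highest importance scores; in the paper's
# workflow these would parameterize the ANN potential's input layer.
k = 18
selected = np.argsort(importances)[::-1][:k]
print("selected candidate indices:", sorted(selected.tolist()))
```

In practice the paper averages importance scores (e.g., over repeated runs) before selecting, and the retained ACSFs are then used to featurize atomic environments for ANN training; this sketch shows only the single-fit ranking step.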