ChemProps: A RESTful API enabled database for composite polymer name standardization

Bibliographic Details
Published in Journal of Cheminformatics Vol. 13; no. 1; article 22 (13 pp.)
Main Authors Hu, Bingyin, Lin, Anqi, Brinson, L. Catherine
Format Journal Article
Language English
Published Cham Springer International Publishing 12.03.2021
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN 1758-2946
DOI 10.1186/s13321-021-00502-6

Summary: Polymer names are not expressed uniformly, and the resulting inconsistency in polymer indexing is a major obstacle to the widespread use of polymer-related data resources. It also limits the application of materials informatics to innovation across broad classes of polymer science and polymer-based materials. The current solution of using a variety of chemical identifiers has proven insufficient to address the challenge and is not intuitive for researchers. This work proposes ChemProps, a multi-algorithm mapping methodology optimized to solve the polymer indexing issue, with a design that is easy to update in both depth and width. A RESTful API is enabled for lightweight data exchange and easy integration across data systems. Each algorithm is assigned a weight factor that is used to generate scores for candidate chemical names; the weight factors are optimized to maximize the minimum score difference between the ground-truth chemical name and the other candidate chemical names. Ten-fold validation on the 160 training data points is used to prevent overfitting. The resulting set of weight factors achieves 100% test accuracy on the 54 test data points. The weight factors will evolve as ChemProps grows. With ChemProps, other polymer databases can remove duplicate entries and offer a more accurate “search by SMILES” function by using ChemProps as a common name-to-SMILES translator through API calls. ChemProps is also well suited to auto-populating polymer properties thanks to its easy-to-update design.
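The weighting scheme in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the algorithm names (`exact_name`, `abbreviation`, `fuzzy`), the toy samples, the additive scoring rule, and the brute-force grid search are all assumptions chosen only to show what "maximize the minimum score difference" means. Each candidate name is scored by summing the weights of the algorithms that proposed it, and the weights are chosen so that the smallest gap between the ground-truth name and its best competitor is as large as possible.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

# One sample = (candidates proposed by each algorithm, ground-truth name).
Matches = Dict[str, List[str]]
Sample = Tuple[Matches, str]

# Hypothetical algorithm names and toy data, invented for illustration only.
ALGORITHMS = ["exact_name", "abbreviation", "fuzzy"]
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]
SAMPLES: List[Sample] = [
    ({"exact_name": ["polystyrene"],
      "abbreviation": ["polystyrene", "polysulfone"],
      "fuzzy": ["polysulfone"]}, "polystyrene"),
    ({"exact_name": [],
      "abbreviation": ["poly(methyl methacrylate)"],
      "fuzzy": ["poly(methyl acrylate)", "poly(methyl methacrylate)"]},
     "poly(methyl methacrylate)"),
]


def score_candidates(matches: Matches, weights: Dict[str, float]) -> Dict[str, float]:
    """Score each candidate as the sum of weights of the algorithms proposing it."""
    scores: Dict[str, float] = {}
    for algorithm, candidates in matches.items():
        for name in set(candidates):
            scores[name] = scores.get(name, 0.0) + weights[algorithm]
    return scores


def margin(scores: Dict[str, float], truth: str) -> float:
    """Score of the ground-truth name minus the best score of any other candidate."""
    others = [s for name, s in scores.items() if name != truth]
    return scores.get(truth, 0.0) - (max(others) if others else 0.0)


def fit_weights(samples: Sequence[Sample], algorithms: Sequence[str],
                grid: Sequence[float]) -> Tuple[Dict[str, float], float]:
    """Grid-search the weights that maximize the minimum margin over all samples."""
    best_weights: Dict[str, float] = {}
    best_margin = float("-inf")
    for combo in product(grid, repeat=len(algorithms)):
        weights = dict(zip(algorithms, combo))
        worst = min(margin(score_candidates(m, weights), t) for m, t in samples)
        if worst > best_margin:
            best_weights, best_margin = weights, worst
    return best_weights, best_margin


if __name__ == "__main__":
    weights, worst_margin = fit_weights(SAMPLES, ALGORITHMS, GRID)
    print(weights, worst_margin)
```

The weights are restricted to a bounded grid because the margins scale linearly with the weights, so an unbounded search would have no maximum. On this toy data the hypothetical `fuzzy` algorithm ends up with zero weight, which is an artifact of the two invented samples rather than a statement about any real matching algorithm.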