Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction
| Published in | Scientific data Vol. 5; no. 1; p. 180111 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | London; Nature Publishing Group UK; 19.06.2018 |
| ISSN | 2052-4463 |
| DOI | 10.1038/sdata.2018.111 |
| Summary: | Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Therein, records processed with the text-mining toolkit, ChemDataExtractor, were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm affords 82% precision. Database records are available in MongoDB, CSV and JSON formats which can easily be read using Python, R, Java and MatLab. This makes the database easy to query for tackling big-data materials science initiatives and provides a basis for magnetic materials discovery. |
|---|---|
| Design Type(s) | data integration objective • database creation objective |
| Measurement Type(s) | physicochemical characterization |
| Technology Type(s) | data item extraction from journal article |
| Factor Type(s) | |
| Machine-accessible metadata file describing the reported data | ISA-Tab format |
| Bibliography: | AC02-06CH11357; USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22), Materials Sciences & Engineering Division; Engineering and Physical Sciences Research Council (EPSRC) |
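
The summary notes that the records are distributed in CSV and JSON formats that can be read using Python. The following is a minimal sketch of what that might look like, not the database's documented interface. The field names (`compound`, `type`, `value`, `unit`), the sample rows, and the helper functions are all illustrative assumptions; the real files may use a different schema.

```python
import csv
import io
import json

# Hypothetical sample records. The field names and rows are illustrative
# assumptions, not the schema or contents of the published database.
SAMPLE_JSON = """[
  {"compound": "Fe", "type": "Curie", "value": 1043.0, "unit": "K"},
  {"compound": "NiO", "type": "Néel", "value": 525.0, "unit": "K"},
  {"compound": "Gd", "type": "Curie", "value": 293.0, "unit": "K"}
]"""

SAMPLE_CSV = (
    "compound,type,value,unit\n"
    "Fe,Curie,1043.0,K\n"
    "NiO,Néel,525.0,K\n"
    "Gd,Curie,293.0,K\n"
)


def load_json_records(text):
    """Parse a JSON array of records into a list of dicts."""
    return json.loads(text)


def load_csv_records(text):
    """Parse CSV text into a list of dicts, converting 'value' to float."""
    rows = csv.DictReader(io.StringIO(text))
    return [{**row, "value": float(row["value"])} for row in rows]


def filter_records(records, transition, min_kelvin):
    """Keep records of one transition type at or above a temperature in kelvin."""
    return [
        r for r in records
        if r["type"] == transition
        and r["unit"] == "K"
        and r["value"] >= min_kelvin
    ]


records = load_json_records(SAMPLE_JSON)
above_room_temperature = filter_records(records, "Curie", 300.0)
print([r["compound"] for r in above_room_temperature])
```

Because the summary reports an estimated overall precision of 73%, any record selected this way would still need to be checked against its source article before use.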