Automated generation and ensemble-learned matching of X-ray absorption spectra
X-ray absorption spectroscopy (XAS) is a widely used materials characterization technique to determine oxidation states, coordination environment, and other local atomic structure information. Analysis of XAS relies on comparison of measured spectra to reliable reference spectra. However, existing databases of XAS spectra are highly limited both in terms of the number of reference spectra available as well as the breadth of chemistry coverage. In this work, we report the development of XASdb, a large database of computed reference XAS, and an Ensemble-Learned Spectra IdEntification (ELSIE) algorithm for the matching of spectra. XASdb currently hosts more than 800,000 K-edge X-ray absorption near-edge spectra (XANES) for over 40,000 materials from the open-science Materials Project database. We discuss a high-throughput automation framework for FEFF calculations, built on robust, rigorously benchmarked parameters. FEFF is a computer program uses a real-space Green’s function approach to calculate X-ray absorption spectra. We will demonstrate that the ELSIE algorithm, which combines 33 weak “learners” comprising a set of preprocessing steps and a similarity metric, can achieve up to 84.2% accuracy in identifying the correct oxidation state and coordination environment of a test set of 19 K-edge XANES spectra encompassing a diverse range of chemistries and crystal structures. The XASdb with the ELSIE algorithm has been integrated into a web application in the Materials Project, providing an important new public resource for the analysis of XAS to all materials researchers. Finally, the ELSIE algorithm itself has been made available as part of veidt, an open source machine-learning library for materials science.