smact.data_loader module
Provide data from text files while transparently caching for efficiency.
This module handles the loading of external data used to initialise the core smact.Element and smact.Species classes. It implements a transparent data-caching system to avoid a large amount of I/O when naively constructing several of these objects. It also implements a switchable system to print verbose warning messages about possible missing data (mainly for debugging purposes). In general these functions are used in the background and it is not necessary to use them directly.
- smact.data_loader.float_or_None(x: str) → float | None
Cast a string to a float or to a None.
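As a minimal sketch of the behaviour described above (not the library's own code), a helper of this kind can be written as a try/except around float(). The name to_float_or_none is hypothetical, and treating every unparseable string as missing data is an assumption.

```python
from __future__ import annotations


def to_float_or_none(x: str) -> float | None:
    """Hypothetical stand-in: cast a string to float, or None on failure."""
    try:
        return float(x)
    except ValueError:
        # Assumption: unparseable text (e.g. "None" or "") means missing data.
        return None


print(to_float_or_none("1.25"))  # 1.25
print(to_float_or_none("None"))  # None
```

Returning None rather than raising lets a table parser keep going when a column is blank for some elements.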
- smact.data_loader.lookup_element_data(symbol: str, copy: bool = True) → dict | None
Retrieve tabulated data for an element.
The table “data/element_data.txt” contains a collection of relevant atomic data.
Args:
symbol (str) : Atomic symbol for lookup.
copy (bool) : if True (default), return a copy of the data dictionary, rather than a reference to the cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
Returns:
dict: Dictionary of data for the given element, keyed by column headings from data/element_data.txt, or None if the element was not found.
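The caching and copy semantics described above can be illustrated with a small stand-alone sketch. It is not the module's implementation: the lookup function, the _cache variable and the two-column in-memory table are all hypothetical, chosen only to show why copy=True protects the cached record while copy=False hands out a shared reference.

```python
from __future__ import annotations

import io

# Hypothetical two-column table standing in for a real data file.
_TABLE = io.StringIO("# Symbol Name\nH Hydrogen\nHe Helium\n")

_cache: dict[str, dict[str, str]] | None = None  # filled on first lookup


def lookup(symbol: str, copy: bool = True) -> dict[str, str] | None:
    """Parse the table once, then serve every later lookup from memory."""
    global _cache
    if _cache is None:
        _cache = {}
        for line in _TABLE:
            if line.startswith("#") or not line.strip():
                continue  # skip comments and blank lines
            sym, name = line.split()
            _cache[sym] = {"Symbol": sym, "Name": name}
    record = _cache.get(symbol)
    if record is None:
        return None
    # copy=True returns a fresh dict; copy=False exposes the cached object.
    return dict(record) if copy else record


safe = lookup("H")
safe["Name"] = "changed"          # mutates only the caller's copy
shared = lookup("H", copy=False)  # direct reference to the cached record
print(shared["Name"])             # Hydrogen
print(lookup("Xx"))               # None
```

Modifying the object returned with copy=False would change what every later caller sees, which is why the documentation reserves it for code that never mutates the result.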
- smact.data_loader.lookup_element_hhis(symbol: str) → tuple[float, float] | None
Retrieve the HHI_R and HHI_p scores for an element.
Args:
symbol : the atomic symbol of the element to look up.
Returns:
tuple : (HHI_p, HHI_R)
Returns None if values for the element were not found in the external data.
- smact.data_loader.lookup_element_magpie_data(symbol: str, copy: bool = True) → dict | None
Retrieve element data contained in the Magpie representation.
Taken from Ward, L., Agrawal, A., Choudhary, A. et al. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput Mater 2, 16028 (2016). https://doi.org/10.1038/npjcompumats.2016.28
- Parameters:
symbol – the atomic symbol of the element to look up.
copy – if True (default), return a copy of the data dictionary, rather than a reference to a cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
- Returns:
Magpie feature dictionary for the element, or None if not found.
- Return type:
dict
- smact.data_loader.lookup_element_oxidation_states(symbol: str, copy: bool = True) → list[int] | None
Retrieve a list of known oxidation states for an element.
The oxidation states list used is the SMACT default (smact14) and most exhaustive list.
- Parameters:
symbol (str) – the atomic symbol of the element to look up.
copy (bool) – if True (default), return a copy of the oxidation-state list rather than a reference to the cached data.
- Returns:
Known oxidation states, or None if not found.
- Return type:
list
- smact.data_loader.lookup_element_oxidation_states_custom(symbol: str, filepath: str, copy: bool = True) → list[int] | dict[str, list[int]] | None
Retrieve a list of known oxidation states for an element from a user-supplied file.
The cache is keyed by filepath, so calling with different files returns the correct data for each file (unlike the old single-global behaviour).
- Parameters:
symbol (str) – the atomic symbol to look up. Pass "all" to return all symbols in the file.
filepath (str) – path to the text file containing oxidation-state data.
copy (bool) – if True (default), return a copy of the list.
- Returns:
Known oxidation states for the element (or, when symbol is "all", a dictionary mapping each symbol in the file to its list), or None if not found.
- Return type:
list or dict
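A cache keyed by filepath, as described for this function, might look like the sketch below. This is an illustration under assumptions rather than the module's code: lookup_custom and _cache_by_file are hypothetical names, and the file format (one symbol per line followed by whitespace-separated integer states, with "#" starting a comment) is assumed for the example.

```python
from __future__ import annotations

import os
import tempfile

_cache_by_file: dict[str, dict[str, list[int]]] = {}  # one table per filepath


def lookup_custom(symbol: str, filepath: str, copy: bool = True):
    """Hypothetical lookup whose cache is keyed by the file it came from."""
    if filepath not in _cache_by_file:
        table: dict[str, list[int]] = {}
        with open(filepath) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # skip blanks and comments (assumed format)
                sym, *states = line.split()
                table[sym] = [int(s) for s in states]
        _cache_by_file[filepath] = table
    table = _cache_by_file[filepath]
    if symbol == "all":
        return {k: list(v) for k, v in table.items()} if copy else table
    states = table.get(symbol)
    if states is None:
        return None
    return list(states) if copy else states


# Two throwaway files with different (illustrative) contents.
tmpdir = tempfile.mkdtemp()
path_a = os.path.join(tmpdir, "a.txt")
path_b = os.path.join(tmpdir, "b.txt")
with open(path_a, "w") as f:
    f.write("# symbol states\nFe 2 3\n")
with open(path_b, "w") as f:
    f.write("Fe 3\nO -2\n")

print(lookup_custom("Fe", path_a))   # [2, 3]
print(lookup_custom("Fe", path_b))   # [3]
print(lookup_custom("all", path_b))  # {'Fe': [3], 'O': [-2]}
```

With a single global cache, the second call would have returned the data parsed from the first file; keying on the path keeps each file's table separate.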
- smact.data_loader.lookup_element_oxidation_states_icsd(symbol: str, copy: bool = True) → list[int] | None
Retrieve a list of known oxidation states for an element.
The oxidation states list used contains only those found in the ICSD (and judged to be non-spurious).
- Parameters:
symbol (str) – the atomic symbol of the element to look up.
copy (bool) – if True (default), return a copy of the list.
- Returns:
Known oxidation states, or None if not found.
- Return type:
list
- smact.data_loader.lookup_element_oxidation_states_icsd24(symbol: str, copy: bool = True) → list[int] | None
Retrieve a list of known oxidation states for an element.
The oxidation states list used contains only those found in the 2024 version of the ICSD (with ≥5 reports).
- Parameters:
symbol (str) – the atomic symbol of the element to look up.
copy (bool) – if True (default), return a copy of the list.
- Returns:
Known oxidation states, or None if not found.
- Return type:
list
- smact.data_loader.lookup_element_oxidation_states_sp(symbol: str, copy: bool = True) → list[int] | None
Retrieve a list of known oxidation states for an element.
The oxidation states list used contains only those that are in the Pymatgen default lambda table for structure prediction.
- Parameters:
symbol (str) – the atomic symbol of the element to look up.
copy (bool) – if True (default), return a copy of the list.
- Returns:
Known oxidation states, or None if not found.
- Return type:
list
- smact.data_loader.lookup_element_oxidation_states_wiki(symbol: str, copy: bool = True) → list[int] | None
Retrieve a list of known oxidation states for an element.
The oxidation states list used contains only those that appear on Wikipedia (https://en.wikipedia.org/wiki/Template:List_of_oxidation_states_of_the_elements).
- Parameters:
symbol (str) – the atomic symbol of the element to look up.
copy (bool) – if True (default), return a copy of the list.
- Returns:
Known oxidation states, or None if not found.
- Return type:
list
- smact.data_loader.lookup_element_shannon_radius_data(symbol: str, copy: bool = True) → list[dict] | None
Retrieve Shannon radii for known states of an element.
Retrieve Shannon radii for known oxidation states and coordination environments of an element.
Args:
symbol (str) : the atomic symbol of the element to look up.
copy (bool) : if True (default), return a copy of the data dictionary, rather than a reference to the cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
Returns:
- list:
Shannon radii datasets.
Returns None if the element was not found among the external data.
Shannon radii datasets are dictionaries with the keys:
- charge
int charge
- coordination
str coordination (e.g. “4_n” for 4-fold, see Shannon data)
- crystal_radius
float
- ionic_radius
float
- comment
str
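Because the result is a list of dictionaries with the keys listed above, picking out one radius means filtering on charge and coordination. The sketch below shows one way to do that; select_radius is a hypothetical helper, the records are placeholders whose numbers are not taken from the Shannon tables, and the "6_n" label is assumed by analogy with the "4_n" example.

```python
from __future__ import annotations

# Placeholder records using the documented keys; the radii below are
# illustrative numbers only, not values from the Shannon tables.
records = [
    {"charge": 2, "coordination": "4_n", "crystal_radius": 1.0,
     "ionic_radius": 0.9, "comment": ""},
    {"charge": 2, "coordination": "6_n", "crystal_radius": 1.1,
     "ionic_radius": 1.0, "comment": ""},
    {"charge": 3, "coordination": "6_n", "crystal_radius": 0.8,
     "ionic_radius": 0.7, "comment": ""},
]


def select_radius(datasets, charge, coordination, key="ionic_radius"):
    """Return the first matching radius, or None if nothing matches."""
    if datasets is None:  # the lookup returns None for unknown elements
        return None
    for entry in datasets:
        if entry["charge"] == charge and entry["coordination"] == coordination:
            return entry[key]
    return None


print(select_radius(records, 2, "6_n"))  # 1.0
print(select_radius(records, 4, "6_n"))  # None
print(select_radius(None, 2, "6_n"))     # None
```

Checking for None first matters because the lookup returns None, not an empty list, when an element is absent from the data.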
- smact.data_loader.lookup_element_shannon_radius_data_extendedML(symbol: str, copy: bool = True) → list[dict] | None
Retrieve the machine learned extended Shannon radii for known states of an element.
Retrieve Shannon radii for known oxidation states and coordination environments of an element.
Source of extended radii is: Baloch, A.A., Alqahtani, S.M., Mumtaz, F., Muqaibel, A.H., Rashkeev, S.N. and Alharbi, F.H., 2021. Extending Shannon’s Ionic Radii Database Using Machine Learning. arXiv preprint arXiv:2101.00269.
Args:
symbol (str) : the atomic symbol of the element to look up.
copy (bool) : if True (default), return a copy of the data dictionary, rather than a reference to the cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
Returns:
- list:
Extended Shannon radii datasets.
Returns None if the element was not found among the external data.
Shannon radii datasets are dictionaries with the keys:
- charge
int charge
- coordination
str coordination (e.g. “4_n” for 4-fold, see Shannon data)
- crystal_radius
float
- ionic_radius
float
- comment
str
- smact.data_loader.lookup_element_sse2015_data(symbol: str, copy: bool = True) → list[dict] | None
Retrieve SSE (2015) data for element in oxidation state.
Retrieve the solid-state energy (SSE2015) data for an element in an oxidation state. Taken from J. Solid State Chem., 2015, 231, pp138-144, DOI: 10.1016/j.jssc.2015.07.037.
Args:
symbol : the atomic symbol of the element to look up.
copy : if True (default), return a copy of the data dictionary, rather than a reference to a cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
Returns:
- list:
SSE datasets for the element, or None if the element was not found among the external data.
SSE datasets are dictionaries with the keys:
- OxidationState
int
- SolidStateEnergy2015
float SSE2015
- smact.data_loader.lookup_element_sse_data(symbol: str) → dict | None
Retrieve the solid-state energy (SSE) data for an element.
Taken from J. Am. Chem. Soc., 2011, 133 (42), pp 16852-16860, DOI: 10.1021/ja204670s
Args:
symbol : the atomic symbol of the element to look up.
Returns:
- dict:
SSE data for the element, or None if the element was not found among the external data.
Dictionary keys:
- AtomicNumber
int
- SolidStateEnergy
float SSE
- IonisationPotential
float
- ElectronAffinity
float
- MullikenElectronegativity
float
- SolidStateRenormalisationEnergy
float
- smact.data_loader.lookup_element_sse_pauling_data(symbol: str, copy: bool = True) → dict | None
Retrieve Pauling SSE data.
Retrieve the solid-state energy (SSEPauling) data for an element from the regression fit when SSE2015 is plotted against Pauling electronegativity. Taken from J. Solid State Chem., 2015, 231, pp138-144, DOI: 10.1016/j.jssc.2015.07.037
Args:
symbol (str) : the atomic symbol of the element to look up.
copy (bool) : if True (default), return a copy of the data dictionary, rather than a reference to the cached object.
Returns:
dict : A dictionary containing the SSE (Pauling) dataset for the element, or None if the element was not found among the external data.
- smact.data_loader.lookup_element_valence_data(symbol: str, copy: bool = True) → dict | None
Retrieve valence electron data.
For d-block elements, the s and d electrons contribute to NValence. For p-block elements, the s and p electrons contribute to NValence. For s- and f-block elements, NValence is calculated from the noble gas electron configuration.
- Parameters:
symbol – the atomic symbol of the element to look up.
copy – if True (default), return a copy of the data dictionary, rather than a reference to a cached object. Only use copy=False in performance-sensitive code and where you are certain the dictionary will not be modified!
- Returns:
A dictionary holding the number of valence electrons under the key NValence (int), or None if the element was not found among the external data.
- Return type:
dict