Examples

Here we will give a demonstration of how to use some smact features. For a full set of work-through examples in Jupyter Notebook form check out the examples section of our GitHub repo. For workflows that have been used in real examples and in published work, visit our separate repository.

Element and species classes

The element and species classes are at the heart of smact functionality. Elements are the elements of the periodic table. Species are elements, with some additional information; the oxidation state and the coordination environment (if known). So for example the element iron can have many oxidation states and those oxidation states can have many coordination environments.

import smact

iron = smact.Element('Fe')
print("The element %s has %i oxidation states. They are %s." %
(iron.symbol, len(iron.oxidation_states), iron.oxidation_states))

The element Fe has 8 oxidation states. They are [-2, -1, 1, 2, 3, 4, 5, 6].

When an element has an oxidation state and coordination environment then it has additional features. For example, the Shannon radius 1 of the element is useful for calculating radius ratio rules 2, or for training neural networks 3 .

iron_square_planar = smact.Species('Fe', 2, '4_n')
print('Square planar iron has a Shannon radius of %s Angstrom' % iron_square_planar.shannon_radius)

Square planar iron has a Shannon radius of 0.77 Angstrom

List building

Often when using smact the aim will be to search over combinations of a set of elements. This is most efficiently achieved by setting up a dictionary of the elements that you want to search over. The easiest way to achieve this in smact is to first create a list of the symbols of the elements that you want to include, then to build a dictionary of the corresponding element objects.

The list can be built by hand, or if you want to cover a given range there is a helper function.

import smact

elements = smact.ordered_elements(13, 27)
print(elements)

['Al','Si','P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co']

For doing searches across combinations of elements it is then quickest to load the element objects into a dictionary and search by key. This avoids having to repopulate the element class at each iteration of the search.

element_list = smact.element_dictionary(elements)
print(element_list)

{'Al': <smact.Element at 0x10ecc5890>,
 'Ar': <smact.Element at 0x10ecc5cd0>,
 'Ca': <smact.Element at 0x10ecc5a10>,
 'Cl': <smact.Element at 0x10ecc5d90>,
 'Co': <smact.Element at 0x10ecc5f90>,
 'Cr': <smact.Element at 0x10ecc5ed0>,
 'Fe': <smact.Element at 0x10ecc5f50>,
 'K': <smact.Element at 0x10ecc5e90>,
 'Mn': <smact.Element at 0x10ecc5f10>,
 'P': <smact.Element at 0x10ecc5990>,
 'S': <smact.Element at 0x10ecc5e10>,
 'Sc': <smact.Element at 0x10ecc5150>,
 'Si': <smact.Element at 0x10e8bf190>,
 'Ti': <smact.Element at 0x10ecc5dd0>,
 'V': <smact.Element at 0x10ecc5e50>}

Neutral combinations

One of the most basic tests for establishing sensible combinations of elements is that they should form charge-neutral combinations. This is a straightforward combinatorial problem of comparing oxidation states and allowed stoichiometries.

\(\Sigma_i Q_in_i = 0\)

where \(i\) are the elements in the compound and \(Q\) are the charges. We have a special function, smact_filter, which does this checking for a list of elements. The smact_filter also ensures that all elements specified to be anions have electronegitivities greater than all elements specified to be cations.

As input smact_filter takes:

  • els : a tuple of the elements to search over (required)

  • threshold: the upper limit of the stoichiometric ratios (default = 8)

  • species_unique: whether or not we want to consider elements in different oxidation states as unique in our results (default is False).

We can look for neutral combos.

import smact.screening
import pprint

elements = ['Ti', 'Al', 'O']
space = smact.element_dictionary(elements)
# We just want the element items from the dictionary
eles = [e[1] for e in space.items()]
# We set a threshold for the stoichiometry of 4
allowed_combinations = smact.screening.smact_filter(eles, threshold=4)
pprint.pprint(allowed_combinations)

[Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -2), stoichiometries=(1, 1, 1)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -2), stoichiometries=(1, 3, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -2), stoichiometries=(2, 4, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -2), stoichiometries=(3, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -2), stoichiometries=(4, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -1), stoichiometries=(1, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -1), stoichiometries=(1, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -1), stoichiometries=(1, 3, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -1), stoichiometries=(2, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 1, -1), stoichiometries=(3, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -2), stoichiometries=(2, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -2), stoichiometries=(2, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -2), stoichiometries=(2, 3, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -2), stoichiometries=(4, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -1), stoichiometries=(1, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 2, -1), stoichiometries=(2, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 3, -2), stoichiometries=(1, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 3, -2), stoichiometries=(3, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(1, 3, -1), stoichiometries=(1, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -2), stoichiometries=(1, 2, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -2), stoichiometries=(1, 4, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -2), stoichiometries=(2, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -2), stoichiometries=(3, 2, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -1), stoichiometries=(1, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 1, -1), stoichiometries=(1, 2, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -2), stoichiometries=(1, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -2), stoichiometries=(1, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -2), stoichiometries=(1, 3, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -2), stoichiometries=(2, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -2), stoichiometries=(3, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 2, -1), stoichiometries=(1, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(2, 3, -2), stoichiometries=(1, 2, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(3, 1, -2), stoichiometries=(1, 1, 2)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(3, 1, -2), stoichiometries=(1, 3, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(3, 1, -1), stoichiometries=(1, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(3, 2, -2), stoichiometries=(2, 1, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(3, 3, -2), stoichiometries=(1, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(4, 1, -2), stoichiometries=(1, 2, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(4, 1, -2), stoichiometries=(1, 4, 4)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(4, 2, -2), stoichiometries=(1, 1, 3)),
Composition(element_symbols=('Ti', 'Al', 'O'), oxidation_states=(4, 2, -2), stoichiometries=(1, 2, 4))]

There is an example of how this function can be combined with multiprocessing to rapidly explore large subsets of chemical space.

Compound electronegativity

One property that is often used in high-throughput screening where band alignment is important is the compound electronegativity. Ginley and Butler showed how the simple geometric mean of the electronegitivities of a compound could be used to predict flat band potentials 4. smact has a built in function to calculate this property for a given composition.

import smact.properties

compound_electronegs = [smact.properties.compound_electroneg(elements = a[0], stoichs = a[1]) for \\
a in allowed_combinations]

print(compound_electronegs)

[4.319343517137848,
 4.729831837874991,
 4.462035251666306,
 4.337155845378665,
 5.0575817742802025,
 4.777171739263751,
 4.427325394494835,
 5.34030430325585,
 4.583732423414276,
 4.980129115226567,
 4.652147502981397,
 5.284089129411956,
 4.726884428924315,
 4.373001170931816,
 4.808336266651247,
 5.041995471272069,
 4.587722671269271,
 5.437592861777965,
 5.010966817423813,
 4.964781503487637,
 4.768922515748819,
 4.409142747625072,
 5.74200359520417,
 4.677126472294396]

Interfacing to machine learning

When preparing to build machine learning models, we have to convert the chemical compositions into something that can be fed into an algorithm. Many of the properties provided in smact are suitable for this, one can take properties like electronegativity, mass, electron affinity, etc. (for the full list see smact Python package).

One useful representation in machine learning is the one-hot-vector formulation. A similar construction to this can be used to encode a chemical formula. A vector of length covering the periodic table is constructed and each element is set to a number corresponding to the stoichiometric ratio of that element in the compound. For example we could convert \(Ba(OH)_2\)

ml_vector = smact.screening.ml_rep_generator(['Ba', 'H', 'O'], stoichs=[1, 2, 2])

There is also an example demonstrating the conversion of charge-neutral compositions produced by smact to a list of formulas using Pymatgen, or to a Pandas dataframe, both of which could then be used as input for a machine learning algorithm. For a full machine learning example that uses smact, there is a repository here which demonstrates a search for solar energy materials from the four-component (quaternary) oxide materials space.

1

“Revised effective ionic radii and systematic studies of interatomic distances in halides and chalcogenides” Acta Cryst. A. 32, 751–767 (1976).

2

“Crystal structure and chemical constitution” Trans. Faraday Soc. 25, 253-283 (1929).

3

“Deep neural networks for accurate predictions of crystal stability” Nat. Comms. 9, 3800 (2018).

4

“Prediction of flatband potentials at semiconductor‐electrolyte interfaces from atomic electronegativities” J. Electrochem. Soc. 125, 228-232 (1975).