Computing descriptors from python
This tutorial shows how to use quippy to calculate various descriptors for a given atomic structure, contained in an Atoms object.
[1]:
import quippy
[2]:
from quippy import descriptors
Create a configuration:
[3]:
import ase
import ase.build
[4]:
# this is a two atom structure, not the cubic with 8
at = ase.build.bulk('C', 'diamond', 3.5)
print(at.get_positions())
[[0. 0. 0. ]
[0.875 0.875 0.875]]
[5]:
at.get_chemical_symbols()
[5]:
['C', 'C']
[6]:
at.get_pbc()
[6]:
array([ True, True, True])
All descriptors are instantiated by a call to Descriptor()
, which takes the descriptor initialisation string as its only argument. For a list of available descriptors and their parameters, see the following list, auto-generated using the following command: quip descriptor_str="--help"
bispectrum_so4 bispectrum_so3 behler distance_2b coordination angle_3b co_angle_3b co_distance_2b cosnx trihis water_monomer water_dimer A2_dimer AB_dimer bond_real_space atom_real_space power_so3 power_so4 soap AN_monomer general_monomer general_dimer general_trimer rdf as_distance_2b molecule_lo_d alex com_dimer distance_Nb
A simple descriptor: pairwise distances
Here we use a simple pair distance between carbon atoms, with a cutoff of 1.5 Angstrom. There are several descriptors that can do this, one is distance_2b
, which takes a cutoff argument, and two Z values to specify the atom types. Alternatively, the distance_Nb
descriptor could also do this, with order=2
, and it takes a string of Zs to specify the atom types. This is more general, order=3
is a triangle-like three-body descriptor of the three sides of a triangle of 3 atoms.
[7]:
desc = descriptors.Descriptor("distance_2b Z1=6 Z2=6 cutoff=4")
The descriptor dimension is the length of the descriptor vector. For the scalar distance this is 1.
[8]:
desc.n_dim # number of dimensions
[8]:
1
This descriptor is very simple: it is scalar (dimension 1), and hence only has a single permutation.
[9]:
desc.n_perm # number of permutations
[9]:
1
[10]:
desc.permutations() # array of permutation arrays
[10]:
array([[1]], dtype=int32)
Many descriptors rely on the neighbour connectivity, which is automatically calculated behind the scenes.
We can now calculate how many instances of this descriptor are found in an Atoms
object:
[12]:
desc.count(at)
[12]:
92
This also works transparently for iterables (such as lists), returning a list of the counts.
We can also calculate the actual descriptor values – in this case, the list of pairwise distances:
[13]:
d = desc.calc(at)
d
[13]:
{'covariance_cutoff': array([0.30420743, 0.30420743, 0.30420743, 1. , 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 0.30420743, 0.30420743, 1. , 1. ,
1. , 1. , 1. , 0.30420743, 1. ,
1. , 1. , 0.30420743, 1. , 0.30420743,
1. , 1. , 0.30420743, 1. , 0.30420743,
1. , 1. , 1. , 1. , 1. ,
0.30420743, 1. , 1. , 0.30420743, 1. ,
1. , 0.30420743, 0.30420743, 0.30420743, 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 0.30420743, 0.30420743, 1. , 1. ,
1. , 0.30420743, 1. , 1. , 0.30420743,
0.30420743, 1. , 0.30420743, 0.30420743, 1. ,
1. , 0.30420743, 1. , 0.30420743, 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 1. ]), 'data': array([[3.81403658],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[3.5 ],
[2.90204669],
[1.51554446],
[2.47487373],
[2.90204669],
[2.47487373],
[2.90204669],
[3.5 ],
[2.90204669],
[2.47487373],
[3.5 ],
[3.81403658],
[3.81403658],
[2.90204669],
[1.51554446],
[2.47487373],
[2.90204669],
[2.47487373],
[3.81403658],
[1.51554446],
[2.47487373],
[1.51554446],
[3.81403658],
[2.47487373],
[3.81403658],
[2.90204669],
[2.47487373],
[3.81403658],
[2.47487373],
[3.81403658],
[2.90204669],
[3.5 ],
[2.90204669],
[2.47487373],
[3.5 ],
[3.81403658],
[2.90204669],
[2.47487373],
[3.81403658],
[2.47487373],
[3.5 ],
[3.81403658],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[2.90204669],
[1.51554446],
[2.90204669],
[2.90204669],
[2.90204669],
[3.81403658],
[3.81403658],
[2.90204669],
[1.51554446],
[2.90204669],
[3.81403658],
[1.51554446],
[1.51554446],
[3.81403658],
[3.81403658],
[2.90204669],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[3.81403658],
[2.90204669],
[3.81403658],
[3.5 ],
[2.47487373],
[2.47487373],
[3.5 ],
[2.47487373],
[3.5 ],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[3.5 ],
[2.47487373],
[3.5 ],
[2.47487373],
[2.47487373],
[3.5 ]]), 'has_data': array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1])}
Notice that the first array is called covariance_cutoff
, and it supplies the value of a cutoff function, implicit in all descriptors, which takes the descriptor value to zero as the atoms approach the cutoff, i.e. in this case as the distance between the two atoms approaches the cutoff. It is more complicated for three-body and higher-body descriptors, but the end result is always a descriptor which changes smoothly with atomic positions.
Here is a histogram of the resulting descriptor array, i.e. of the interatomic distances
[14]:
import matplotlib.pyplot as plt
p = plt.hist(d['data'])
Calculate size of descriptor data:
[15]:
n_desc, n_cross = desc.sizes(at)
print("n_desc=%d n_cross=%d" % (n_desc,n_cross))
n_desc=92 n_cross=184
n_desc
is number of descriptors, n_cross
is number of gradients
Gradients are returned in a Dictionary
object ( if the grad=True
option is set ) which has the following elements: - data
= the descriptor values (equivalent to desc.calc(at)
) # earlier descriptor
- grad_data
= the gradients # earlier grad
- ii
= the indices (i_desc
, i_atom
, i_coord
) # this was index
, right?
[16]:
res = desc.calc(at, grad=True)
list(res)
[16]:
['has_grad_data',
'ii',
'pos',
'grad_covariance_cutoff',
'covariance_cutoff',
'data',
'has_data',
'grad_data']
[17]:
res['data'][:,:10]
[17]:
array([[3.81403658],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[3.5 ],
[2.90204669],
[1.51554446],
[2.47487373],
[2.90204669],
[2.47487373],
[2.90204669],
[3.5 ],
[2.90204669],
[2.47487373],
[3.5 ],
[3.81403658],
[3.81403658],
[2.90204669],
[1.51554446],
[2.47487373],
[2.90204669],
[2.47487373],
[3.81403658],
[1.51554446],
[2.47487373],
[1.51554446],
[3.81403658],
[2.47487373],
[3.81403658],
[2.90204669],
[2.47487373],
[3.81403658],
[2.47487373],
[3.81403658],
[2.90204669],
[3.5 ],
[2.90204669],
[2.47487373],
[3.5 ],
[3.81403658],
[2.90204669],
[2.47487373],
[3.81403658],
[2.47487373],
[3.5 ],
[3.81403658],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[2.90204669],
[1.51554446],
[2.90204669],
[2.90204669],
[2.90204669],
[3.81403658],
[3.81403658],
[2.90204669],
[1.51554446],
[2.90204669],
[3.81403658],
[1.51554446],
[1.51554446],
[3.81403658],
[3.81403658],
[2.90204669],
[3.81403658],
[3.81403658],
[2.90204669],
[2.90204669],
[3.81403658],
[2.90204669],
[3.81403658],
[3.5 ],
[2.47487373],
[2.47487373],
[3.5 ],
[2.47487373],
[3.5 ],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[2.47487373],
[3.5 ],
[2.47487373],
[3.5 ],
[2.47487373],
[2.47487373],
[3.5 ]])
[18]:
print(res['grad_data'][:,:10].shape)
res['grad_data'][:,:10] #this is outputted as a verly long thing
(184, 3, 1)
[18]:
array([[[ 0.6882472 ],
[ 0.6882472 ],
[-0.22941573]],
[[-0.6882472 ],
[-0.6882472 ],
[ 0.22941573]],
[[ 0.6882472 ],
[ 0.22941573],
[-0.6882472 ]],
[[-0.6882472 ],
[-0.22941573],
[ 0.6882472 ]],
[[ 0.22941573],
[ 0.6882472 ],
[-0.6882472 ]],
[[-0.22941573],
[-0.6882472 ],
[ 0.6882472 ]],
[[ 0.90453403],
[ 0.30151134],
[ 0.30151134]],
[[-0.90453403],
[-0.30151134],
[-0.30151134]],
[[ 0.90453403],
[-0.30151134],
[-0.30151134]],
[[-0.90453403],
[ 0.30151134],
[ 0.30151134]],
[[ 1. ],
[-0. ],
[-0. ]],
[[-1. ],
[ 0. ],
[ 0. ]],
[[ 0.30151134],
[ 0.90453403],
[ 0.30151134]],
[[-0.30151134],
[-0.90453403],
[-0.30151134]],
[[ 0.57735027],
[ 0.57735027],
[-0.57735027]],
[[-0.57735027],
[-0.57735027],
[ 0.57735027]],
[[ 0.70710678],
[ 0.70710678],
[-0. ]],
[[-0.70710678],
[-0.70710678],
[ 0. ]],
[[ 0.30151134],
[-0.30151134],
[-0.90453403]],
[[-0.30151134],
[ 0.30151134],
[ 0.90453403]],
[[ 0.70710678],
[-0. ],
[-0.70710678]],
[[-0.70710678],
[ 0. ],
[ 0.70710678]],
[[-0.30151134],
[ 0.90453403],
[-0.30151134]],
[[ 0.30151134],
[-0.90453403],
[ 0.30151134]],
[[-0. ],
[ 1. ],
[-0. ]],
[[ 0. ],
[-1. ],
[ 0. ]],
[[-0.30151134],
[ 0.30151134],
[-0.90453403]],
[[ 0.30151134],
[-0.30151134],
[ 0.90453403]],
[[-0. ],
[ 0.70710678],
[-0.70710678]],
[[ 0. ],
[-0.70710678],
[ 0.70710678]],
[[-0. ],
[-0. ],
[-1. ]],
[[ 0. ],
[ 0. ],
[ 1. ]],
[[ 0.6882472 ],
[-0.22941573],
[ 0.6882472 ]],
[[-0.6882472 ],
[ 0.22941573],
[-0.6882472 ]],
[[ 0.6882472 ],
[-0.6882472 ],
[ 0.22941573]],
[[-0.6882472 ],
[ 0.6882472 ],
[-0.22941573]],
[[ 0.30151134],
[ 0.30151134],
[ 0.90453403]],
[[-0.30151134],
[-0.30151134],
[-0.90453403]],
[[ 0.57735027],
[-0.57735027],
[ 0.57735027]],
[[-0.57735027],
[ 0.57735027],
[-0.57735027]],
[[ 0.70710678],
[-0. ],
[ 0.70710678]],
[[-0.70710678],
[ 0. ],
[-0.70710678]],
[[ 0.30151134],
[-0.90453403],
[-0.30151134]],
[[-0.30151134],
[ 0.90453403],
[ 0.30151134]],
[[ 0.70710678],
[-0.70710678],
[-0. ]],
[[-0.70710678],
[ 0.70710678],
[ 0. ]],
[[-0.22941573],
[ 0.6882472 ],
[ 0.6882472 ]],
[[ 0.22941573],
[-0.6882472 ],
[-0.6882472 ]],
[[-0.57735027],
[ 0.57735027],
[ 0.57735027]],
[[ 0.57735027],
[-0.57735027],
[-0.57735027]],
[[-0. ],
[ 0.70710678],
[ 0.70710678]],
[[ 0. ],
[-0.70710678],
[-0.70710678]],
[[-0.57735027],
[-0.57735027],
[-0.57735027]],
[[ 0.57735027],
[ 0.57735027],
[ 0.57735027]],
[[-0.22941573],
[-0.6882472 ],
[-0.6882472 ]],
[[ 0.22941573],
[ 0.6882472 ],
[ 0.6882472 ]],
[[-0. ],
[-0.70710678],
[-0.70710678]],
[[ 0. ],
[ 0.70710678],
[ 0.70710678]],
[[-0.6882472 ],
[ 0.6882472 ],
[ 0.22941573]],
[[ 0.6882472 ],
[-0.6882472 ],
[-0.22941573]],
[[-0.90453403],
[ 0.30151134],
[-0.30151134]],
[[ 0.90453403],
[-0.30151134],
[ 0.30151134]],
[[-0.70710678],
[ 0.70710678],
[-0. ]],
[[ 0.70710678],
[-0.70710678],
[ 0. ]],
[[-0.6882472 ],
[-0.22941573],
[-0.6882472 ]],
[[ 0.6882472 ],
[ 0.22941573],
[ 0.6882472 ]],
[[-0.70710678],
[-0. ],
[-0.70710678]],
[[ 0.70710678],
[ 0. ],
[ 0.70710678]],
[[ 0.22941573],
[-0.6882472 ],
[ 0.6882472 ]],
[[-0.22941573],
[ 0.6882472 ],
[-0.6882472 ]],
[[-0.30151134],
[-0.30151134],
[ 0.90453403]],
[[ 0.30151134],
[ 0.30151134],
[-0.90453403]],
[[-0. ],
[-0. ],
[ 1. ]],
[[ 0. ],
[ 0. ],
[-1. ]],
[[-0.30151134],
[-0.90453403],
[ 0.30151134]],
[[ 0.30151134],
[ 0.90453403],
[-0.30151134]],
[[-0. ],
[-0.70710678],
[ 0.70710678]],
[[ 0. ],
[ 0.70710678],
[-0.70710678]],
[[-0. ],
[-1. ],
[-0. ]],
[[ 0. ],
[ 1. ],
[ 0. ]],
[[-0.6882472 ],
[ 0.22941573],
[ 0.6882472 ]],
[[ 0.6882472 ],
[-0.22941573],
[-0.6882472 ]],
[[-0.90453403],
[-0.30151134],
[ 0.30151134]],
[[ 0.90453403],
[ 0.30151134],
[-0.30151134]],
[[-0.70710678],
[-0. ],
[ 0.70710678]],
[[ 0.70710678],
[ 0. ],
[-0.70710678]],
[[-0.6882472 ],
[-0.6882472 ],
[-0.22941573]],
[[ 0.6882472 ],
[ 0.6882472 ],
[ 0.22941573]],
[[-0.70710678],
[-0.70710678],
[-0. ]],
[[ 0.70710678],
[ 0.70710678],
[ 0. ]],
[[-1. ],
[-0. ],
[-0. ]],
[[ 1. ],
[ 0. ],
[ 0. ]],
[[-0.6882472 ],
[-0.6882472 ],
[ 0.22941573]],
[[ 0.6882472 ],
[ 0.6882472 ],
[-0.22941573]],
[[-0.6882472 ],
[-0.22941573],
[ 0.6882472 ]],
[[ 0.6882472 ],
[ 0.22941573],
[-0.6882472 ]],
[[-0.22941573],
[-0.6882472 ],
[ 0.6882472 ]],
[[ 0.22941573],
[ 0.6882472 ],
[-0.6882472 ]],
[[-0.90453403],
[-0.30151134],
[-0.30151134]],
[[ 0.90453403],
[ 0.30151134],
[ 0.30151134]],
[[-0.90453403],
[ 0.30151134],
[ 0.30151134]],
[[ 0.90453403],
[-0.30151134],
[-0.30151134]],
[[-0.30151134],
[-0.90453403],
[-0.30151134]],
[[ 0.30151134],
[ 0.90453403],
[ 0.30151134]],
[[-0.57735027],
[-0.57735027],
[ 0.57735027]],
[[ 0.57735027],
[ 0.57735027],
[-0.57735027]],
[[-0.30151134],
[ 0.30151134],
[ 0.90453403]],
[[ 0.30151134],
[-0.30151134],
[-0.90453403]],
[[ 0.30151134],
[-0.90453403],
[ 0.30151134]],
[[-0.30151134],
[ 0.90453403],
[-0.30151134]],
[[ 0.30151134],
[-0.30151134],
[ 0.90453403]],
[[-0.30151134],
[ 0.30151134],
[-0.90453403]],
[[-0.6882472 ],
[ 0.22941573],
[-0.6882472 ]],
[[ 0.6882472 ],
[-0.22941573],
[ 0.6882472 ]],
[[-0.6882472 ],
[ 0.6882472 ],
[-0.22941573]],
[[ 0.6882472 ],
[-0.6882472 ],
[ 0.22941573]],
[[-0.30151134],
[-0.30151134],
[-0.90453403]],
[[ 0.30151134],
[ 0.30151134],
[ 0.90453403]],
[[-0.57735027],
[ 0.57735027],
[-0.57735027]],
[[ 0.57735027],
[-0.57735027],
[ 0.57735027]],
[[-0.30151134],
[ 0.90453403],
[ 0.30151134]],
[[ 0.30151134],
[-0.90453403],
[-0.30151134]],
[[ 0.22941573],
[-0.6882472 ],
[-0.6882472 ]],
[[-0.22941573],
[ 0.6882472 ],
[ 0.6882472 ]],
[[ 0.57735027],
[-0.57735027],
[-0.57735027]],
[[-0.57735027],
[ 0.57735027],
[ 0.57735027]],
[[ 0.57735027],
[ 0.57735027],
[ 0.57735027]],
[[-0.57735027],
[-0.57735027],
[-0.57735027]],
[[ 0.22941573],
[ 0.6882472 ],
[ 0.6882472 ]],
[[-0.22941573],
[-0.6882472 ],
[-0.6882472 ]],
[[ 0.6882472 ],
[-0.6882472 ],
[-0.22941573]],
[[-0.6882472 ],
[ 0.6882472 ],
[ 0.22941573]],
[[ 0.90453403],
[-0.30151134],
[ 0.30151134]],
[[-0.90453403],
[ 0.30151134],
[-0.30151134]],
[[ 0.6882472 ],
[ 0.22941573],
[ 0.6882472 ]],
[[-0.6882472 ],
[-0.22941573],
[-0.6882472 ]],
[[-0.22941573],
[ 0.6882472 ],
[-0.6882472 ]],
[[ 0.22941573],
[-0.6882472 ],
[ 0.6882472 ]],
[[ 0.30151134],
[ 0.30151134],
[-0.90453403]],
[[-0.30151134],
[-0.30151134],
[ 0.90453403]],
[[ 0.30151134],
[ 0.90453403],
[-0.30151134]],
[[-0.30151134],
[-0.90453403],
[ 0.30151134]],
[[ 0.6882472 ],
[-0.22941573],
[-0.6882472 ]],
[[-0.6882472 ],
[ 0.22941573],
[ 0.6882472 ]],
[[ 0.90453403],
[ 0.30151134],
[-0.30151134]],
[[-0.90453403],
[-0.30151134],
[ 0.30151134]],
[[ 0.6882472 ],
[ 0.6882472 ],
[ 0.22941573]],
[[-0.6882472 ],
[-0.6882472 ],
[-0.22941573]],
[[ 1. ],
[-0. ],
[-0. ]],
[[-1. ],
[ 0. ],
[ 0. ]],
[[ 0.70710678],
[ 0.70710678],
[-0. ]],
[[-0.70710678],
[-0.70710678],
[ 0. ]],
[[ 0.70710678],
[-0. ],
[-0.70710678]],
[[-0.70710678],
[ 0. ],
[ 0.70710678]],
[[-0. ],
[ 1. ],
[-0. ]],
[[ 0. ],
[-1. ],
[ 0. ]],
[[-0. ],
[ 0.70710678],
[-0.70710678]],
[[ 0. ],
[-0.70710678],
[ 0.70710678]],
[[-0. ],
[-0. ],
[-1. ]],
[[ 0. ],
[ 0. ],
[ 1. ]],
[[ 0.70710678],
[-0. ],
[ 0.70710678]],
[[-0.70710678],
[ 0. ],
[-0.70710678]],
[[ 0.70710678],
[-0.70710678],
[-0. ]],
[[-0.70710678],
[ 0.70710678],
[ 0. ]],
[[-0. ],
[ 0.70710678],
[ 0.70710678]],
[[ 0. ],
[-0.70710678],
[-0.70710678]],
[[-0. ],
[-0.70710678],
[-0.70710678]],
[[ 0. ],
[ 0.70710678],
[ 0.70710678]],
[[-0.70710678],
[ 0.70710678],
[-0. ]],
[[ 0.70710678],
[-0.70710678],
[ 0. ]],
[[-0.70710678],
[-0. ],
[-0.70710678]],
[[ 0.70710678],
[ 0. ],
[ 0.70710678]],
[[-0. ],
[-0. ],
[ 1. ]],
[[ 0. ],
[ 0. ],
[-1. ]],
[[-0. ],
[-0.70710678],
[ 0.70710678]],
[[ 0. ],
[ 0.70710678],
[-0.70710678]],
[[-0. ],
[-1. ],
[-0. ]],
[[ 0. ],
[ 1. ],
[ 0. ]],
[[-0.70710678],
[-0. ],
[ 0.70710678]],
[[ 0.70710678],
[ 0. ],
[-0.70710678]],
[[-0.70710678],
[-0.70710678],
[-0. ]],
[[ 0.70710678],
[ 0.70710678],
[ 0. ]],
[[-1. ],
[-0. ],
[-0. ]],
[[ 1. ],
[ 0. ],
[ 0. ]]])
[19]:
# this does not even work with the latest py2...
# res.index[:,:10]
A many-body descriptor: SOAP
[20]:
desc = descriptors.Descriptor("soap cutoff=3 l_max=4 n_max=4 atom_sigma=0.5 n_Z=1 Z={6} ")
[21]:
# at.set_cutoff(desc.cutoff())
# at.calc_connect()
There are now only 8 descriptors, because SOAP produces one for each atom in the structure
[22]:
desc.sizes(at)
[22]:
(2, 58)
But each descriptor now is a long vector, because it encodes the entire environment of the atom up to the cutoff. The length of the vector depends on l_max
and n_max
and also on the number of atom types.
[23]:
desc.n_dim
[23]:
51
[24]:
desc.calc(at)
[24]:
{'covariance_cutoff': array([1., 1.]),
'data': array([[ 2.63487898e-01, 4.88846681e-35, 4.70073620e-36,
2.27467236e-03, 7.81144539e-04, 5.51080842e-01,
-2.80880401e-35, -1.00070027e-34, 2.19176356e-02,
7.81338076e-03, 5.76288508e-01, 1.95207652e-33,
2.57768609e-33, 1.05593834e-01, 3.90765830e-02,
2.84225658e-01, -2.98268236e-34, 2.34728485e-37,
-2.01220340e-03, 2.59681991e-03, 4.20342133e-01,
-4.06607819e-34, 2.35117001e-34, -1.37098193e-02,
1.83668391e-02, 1.53297789e-01, 1.25598722e-33,
4.10252085e-35, 8.90010050e-04, 4.31640581e-03,
-2.51452500e-03, -9.71796031e-37, 2.44522012e-38,
1.63990654e-05, -2.38148142e-05, -3.71873816e-03,
1.08673965e-35, -2.31968275e-36, 1.11732355e-04,
-1.68437888e-04, -1.91797776e-03, 2.17347930e-36,
1.57834671e-37, -1.02578685e-05, -5.59812552e-05,
1.19983423e-05, 3.78678845e-38, 8.02754752e-39,
5.91138644e-08, 3.63022045e-07, 0.00000000e+00],
[ 2.63487898e-01, 7.09330130e-36, 4.94500281e-36,
2.27467236e-03, 7.81144539e-04, 5.51080842e-01,
-5.35010288e-35, 4.97300625e-35, 2.19176356e-02,
7.81338076e-03, 5.76288508e-01, 3.17779898e-34,
2.52011995e-34, 1.05593834e-01, 3.90765830e-02,
2.84225658e-01, 3.21006173e-35, 0.00000000e+00,
-2.01220340e-03, 2.59681991e-03, 4.20342133e-01,
-2.56804938e-34, 0.00000000e+00, -1.37098193e-02,
1.83668391e-02, 1.53297789e-01, 1.21059009e-34,
0.00000000e+00, 8.90010050e-04, 4.31640581e-03,
-2.51452500e-03, -8.35953575e-38, 0.00000000e+00,
1.63990654e-05, -2.38148142e-05, -3.71873816e-03,
-3.34381430e-37, 0.00000000e+00, 1.11732355e-04,
-1.68437888e-04, -1.91797776e-03, 0.00000000e+00,
0.00000000e+00, -1.02578685e-05, -5.59812552e-05,
1.19983423e-05, 3.69442776e-39, 0.00000000e+00,
5.91138644e-08, 3.63022045e-07, 0.00000000e+00]]),
'has_data': array([1, 1])}
Note that the cutoff
array is now all 1, because SOAP takes the cutoff into account when it computes the descriptor vector
We now add a hydrogen atom to the structure
[25]:
# at.add_atoms((0.2,0.2,0.2),(1))
at.append(ase.Atom('H', (0.2,0.2,0.2)))
[26]:
# at.calc_connect()
desc.sizes(at)
[26]:
(2, 84)
The descriptor sizes did not change! This is because the descriptor was set up above to only look at carbon atoms (Z=6). We need a new descriptor that takes account of H atoms.
[27]:
desc = descriptors.Descriptor("soap cutoff=3 l_max=4 n_max=4 atom_sigma=0.5 n_Z=2 Z={1 6} n_species=2 species_Z={1 6}")
The specification of which atoms are used as SOAP centers is separate to the specification of which atoms are taken into account in the environment. The n_Z=2 Z={1 6}
options specify that both H and C are to be taken as SOAP centers. The n_species=2 species_Z={1 6}
options specify the atom types to be taken into account in the environment.
Now the dimensionality of the SOAP descriptor goes up:
[28]:
desc.n_dim
[28]:
181
And the size also increases to 9 (and the cross terms go up too), reflecting the fact that there are 9 atoms altogether.
[29]:
desc.sizes(at)
[29]:
(3, 123)
[30]:
desc.calc(at)
[30]:
{'covariance_cutoff': array([1., 1., 1.]),
'data': array([[ 9.33322279e-02, 1.08024682e-04, 4.43873210e-06,
1.57881914e-05, 4.16237446e-05, 1.02738757e-01,
9.69937759e-04, 3.35597718e-05, 1.56436845e-04,
4.25163207e-04, 5.65466638e-02, 4.35446438e-03,
1.26867117e-04, 7.77168436e-04, 2.19193921e-03,
1.06926399e-01, -9.09191244e-04, -1.09555484e-04,
3.54608882e-05, 2.39572569e-04, 8.32286056e-02,
-5.77246248e-03, -5.85705351e-04, 2.83218039e-04,
2.06349068e-03, 6.12503048e-02, 3.82611041e-03,
1.35200818e-03, 3.21785045e-04, 3.39125126e-03,
-9.52478353e-04, 7.83266259e-06, 1.02663823e-06,
-3.50961837e-07, -2.10019423e-06, -7.41383287e-04,
4.97296374e-05, 5.48861164e-06, -2.78291927e-06,
-1.81482425e-05, -7.71602143e-04, -4.66151054e-05,
-1.79175088e-05, -4.27301110e-06, -4.27179169e-05,
4.86013799e-06, 2.83965675e-07, 1.18726028e-07,
2.83825414e-08, 2.69065369e-07, 1.97091938e-01,
-6.36235164e-20, -2.67092939e-21, 8.79900409e-05,
1.69491950e-04, 1.53411013e-01, -4.03946214e-19,
-1.42793183e-20, 5.34939158e-04, 1.32986131e-03,
1.59664061e-01, 3.78647349e-19, 4.66146683e-20,
-7.95581905e-04, 1.90184303e-03, -1.42225459e-03,
-3.26203859e-21, -4.36823414e-22, 7.33235928e-06,
-1.68863366e-05, 2.08101921e-01, 3.86089587e-35,
3.71262682e-36, 1.79652915e-03, 6.16945525e-04,
2.91479827e-01, 3.61256782e-19, 7.16777694e-20,
5.99505777e-04, 1.19878601e-03, 2.26879982e-01,
2.29362220e-18, 3.83203573e-19, 3.64472061e-03,
9.40586935e-03, 2.36127633e-01, -2.14997426e-18,
-1.25096360e-18, -5.42056741e-03, 1.34513929e-02,
-2.10337635e-03, 1.85219810e-20, 1.17227090e-20,
4.99578328e-05, -1.19434015e-04, 4.35241934e-01,
-2.21838467e-35, -7.90349958e-35, 1.73104804e-02,
6.17098380e-03, 4.55150870e-01, 1.54174396e-33,
2.03584845e-33, 8.33976811e-02, 3.08625637e-02,
1.50333743e-01, 1.19436062e-19, -3.70099011e-21,
-5.50391286e-05, 3.98423101e-04, 1.17015703e-01,
7.58300515e-19, -1.97862272e-20, -3.34612699e-04,
3.12609222e-03, 1.21785275e-01, -7.10808690e-19,
6.45919085e-20, 4.97648760e-04, 4.47064416e-03,
-1.08483816e-03, 6.12360125e-21, -6.05287112e-22,
-4.58650389e-06, -3.96945494e-05, 2.24480540e-01,
-2.35571324e-34, 1.85387827e-37, -1.58923198e-03,
2.05096028e-03, 3.31984908e-01, -3.21137590e-34,
1.85694676e-34, -1.08279726e-02, 1.45060724e-02,
1.21074117e-01, 9.91974795e-34, 3.24015820e-35,
7.02927168e-04, 3.40908387e-03, -1.32999236e-03,
2.17403929e-21, -2.18662013e-22, 4.48558168e-07,
-3.65384295e-06, -1.03522994e-03, 1.38029929e-20,
-1.16901050e-21, 2.72702827e-06, -2.86686440e-05,
-1.07742601e-03, -1.29385212e-20, 3.81622116e-21,
-4.05574038e-06, -4.09992082e-05, 9.59748916e-06,
1.11465076e-22, -3.57615921e-23, 3.73791126e-08,
3.64029218e-07, -1.98596402e-03, -7.67521481e-37,
1.93122724e-38, 1.29519308e-05, -1.88088662e-05,
-2.93704783e-03, 8.58303591e-36, -1.83207822e-36,
8.82458661e-05, -1.33031720e-04, -1.51481287e-03,
1.71660718e-36, 1.24657332e-37, -8.10163264e-06,
-4.42138211e-05, 9.47625339e-06, 2.99079374e-38,
6.34013204e-39, 4.66879461e-08, 2.86713682e-07,
0.00000000e+00],
[ 7.93376086e-03, 8.90122094e-05, 7.47579654e-05,
1.16384995e-03, 4.18412970e-04, 5.55996090e-02,
7.80748554e-04, 6.80346872e-04, 1.10274857e-02,
4.07471822e-03, 1.94820374e-01, 3.42407131e-03,
3.09580300e-03, 5.22611967e-02, 1.98470986e-02,
1.66673235e-02, -9.61119773e-04, -5.24749884e-04,
-3.17217034e-03, 1.24175412e-04, 8.25930489e-02,
-5.96106863e-03, -3.37683836e-03, -2.09537864e-02,
9.56633109e-04, 1.75074392e-02, 5.18890175e-03,
1.84169298e-03, 6.75008315e-03, 8.41972741e-04,
-1.45254896e-04, 8.38160521e-06, 4.57369912e-06,
2.76196303e-05, -1.09819746e-06, -7.19794318e-04,
5.19844928e-05, 2.94323889e-05, 1.82436714e-04,
-8.44579542e-06, -2.15775703e-04, -6.39941293e-05,
-2.27011298e-05, -8.31721969e-05, -1.03633484e-05,
1.32969630e-06, 3.94616123e-07, 1.39909664e-07,
5.12409979e-07, 6.37785595e-08, 5.68770314e-02,
1.93011899e-21, -1.12751618e-20, 2.01740714e-03,
5.52437272e-04, 2.81847739e-01, 1.19710073e-20,
-7.25572316e-20, 1.35369634e-02, 3.74939114e-03,
8.44906559e-02, -1.47365906e-20, 5.59632158e-20,
-3.65118334e-03, -5.12432497e-04, -7.36331869e-04,
1.28512895e-22, -4.87773164e-22, 3.17864596e-05,
4.44161464e-06, 2.03875360e-01, 5.48848492e-36,
3.82622593e-36, 1.76004154e-03, 6.04415327e-04,
8.41156033e-02, 1.00914865e-19, -7.51677453e-20,
1.37452742e-02, 3.90728924e-03, 4.16825421e-01,
6.25895390e-19, -4.83714878e-19, 9.22318902e-02,
2.65187677e-02, 1.24953471e-01, -7.70491898e-19,
3.73088106e-19, -2.48767415e-02, -3.62434268e-03,
-1.08896329e-03, 6.71920304e-21, -3.25182109e-21,
2.16571853e-04, 3.14147398e-05, 4.26402146e-01,
-4.13967457e-35, 3.84789376e-35, 1.69589035e-02,
6.04565077e-03, 4.45906730e-01, 2.45884125e-34,
1.94995810e-34, 8.17038695e-02, 3.02357432e-02,
4.33834945e-02, -7.45555111e-20, 0.00000000e+00,
-1.26191930e-03, 1.29860899e-03, 2.14982032e-01,
-4.62409089e-19, 0.00000000e+00, -8.46757956e-03,
8.81365775e-03, 6.44460481e-02, 5.69236429e-19,
0.00000000e+00, 2.28387153e-03, -1.20457015e-03,
-5.61644108e-04, -4.96412117e-21, 0.00000000e+00,
-1.98829211e-05, 1.04408609e-05, 2.19921327e-01,
2.48380474e-35, 0.00000000e+00, -1.55695459e-03,
2.00930516e-03, 3.25242276e-01, -1.98704379e-34,
0.00000000e+00, -1.06080559e-02, 1.42114532e-02,
1.18615095e-01, 9.36701427e-35, 0.00000000e+00,
6.88650677e-04, 3.33984519e-03, -3.83810815e-04,
-4.11864902e-22, 0.00000000e+00, 1.02843963e-05,
-1.19092324e-05, -1.90193137e-03, -2.55447346e-21,
0.00000000e+00, 6.90091227e-05, -8.08279470e-05,
-5.70149790e-04, 3.14461671e-21, 0.00000000e+00,
-1.86131077e-05, 1.10468247e-05, 4.96882710e-06,
-2.74231542e-23, 0.00000000e+00, 1.62041931e-07,
-9.57506379e-08, -1.94562897e-03, -6.46824151e-38,
0.00000000e+00, 1.26888764e-05, -1.84268570e-05,
-2.87739622e-03, -2.58729661e-37, 0.00000000e+00,
8.64535876e-05, -1.30329838e-04, -1.48404693e-03,
0.00000000e+00, 0.00000000e+00, -7.93708802e-06,
-4.33158359e-05, 9.28379014e-06, 2.85858590e-39,
0.00000000e+00, 4.57397113e-08, 2.80890511e-07,
0.00000000e+00],
[ 1.34720454e-01, 3.57856518e-37, 0.00000000e+00,
1.46420052e-38, 2.03632423e-05, 1.06278297e-01,
-4.04868433e-36, 0.00000000e+00, 0.00000000e+00,
2.36581370e-04, 4.19204209e-02, 2.29028172e-35,
0.00000000e+00, 0.00000000e+00, 1.37430827e-03,
1.50508233e-01, -8.09736866e-36, 0.00000000e+00,
0.00000000e+00, 4.45033074e-04, 8.39568827e-02,
6.47789493e-35, 0.00000000e+00, 0.00000000e+00,
3.65603970e-03, 8.40730842e-02, 9.16112686e-35,
0.00000000e+00, 0.00000000e+00, 4.86303788e-03,
-1.35228100e-03, 0.00000000e+00, 0.00000000e+00,
-2.58836529e-39, -3.99552495e-06, -7.54332803e-04,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
-3.28240724e-05, -1.06826418e-03, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, -6.17453431e-05,
6.78688293e-06, 0.00000000e+00, 0.00000000e+00,
2.28781331e-40, 3.91986192e-07, 2.00950566e-01,
4.11924746e-22, 0.00000000e+00, -3.84126054e-22,
1.30421263e-04, 1.12094752e-01, -3.29539797e-21,
0.00000000e+00, 0.00000000e+00, 1.07143793e-03,
1.58745329e-01, -6.59079594e-21, 0.00000000e+00,
0.00000000e+00, 2.01548125e-03, -1.42628936e-03,
0.00000000e+00, 0.00000000e+00, 4.80157567e-23,
-1.80950722e-05, 1.49870078e-01, 7.11243144e-07,
1.12635733e-04, 1.07652759e-03, 7.04346138e-04,
3.47404712e-01, 3.28430648e-21, 0.00000000e+00,
-2.41649032e-21, 9.17093910e-04, 1.93790174e-01,
-2.62744518e-20, 0.00000000e+00, 0.00000000e+00,
7.53411811e-03, 2.74440009e-01, -5.25489037e-20,
0.00000000e+00, 0.00000000e+00, 1.41724251e-02,
-2.46577880e-03, 0.00000000e+00, 0.00000000e+00,
3.02061290e-22, -1.27240606e-04, 3.66417663e-01,
8.01971408e-06, 9.91353760e-04, 1.02330417e-02,
6.94565309e-03, 4.47927652e-01, 4.52136618e-05,
4.36265763e-03, 4.86433700e-02, 3.42495756e-02,
1.93671878e-01, 5.22157942e-21, 0.00000000e+00,
2.54222258e-21, 2.45197193e-04, 1.08034536e-01,
-4.17726353e-20, 0.00000000e+00, 0.00000000e+00,
2.01434618e-03, 1.52995369e-01, -8.35452707e-20,
0.00000000e+00, 0.00000000e+00, 3.78918540e-03,
-1.37462733e-03, 0.00000000e+00, 0.00000000e+00,
-3.17777822e-22, -3.40194599e-05, 2.04271257e-01,
1.27502029e-05, -1.17720223e-03, -2.55642308e-03,
1.19971756e-03, 3.53145545e-01, 1.01658311e-04,
-7.32636496e-03, -1.70569805e-02, 8.42323469e-03,
1.39209731e-01, 1.14284176e-04, 6.15171165e-03,
4.05692441e-03, 1.49012558e-03, -1.71209069e-03,
-4.86612597e-23, 0.00000000e+00, -2.15448032e-23,
-2.19673721e-06, -9.55042755e-04, 3.89290077e-22,
0.00000000e+00, 0.00000000e+00, -1.80466553e-05,
-1.35250378e-03, 7.78580154e-22, 0.00000000e+00,
0.00000000e+00, -3.39475525e-05, 1.21519277e-05,
0.00000000e+00, 0.00000000e+00, 2.69310040e-24,
3.04782500e-07, -1.80579091e-03, -1.18822464e-07,
1.06142749e-05, 2.18665850e-05, -1.05402991e-05,
-3.12186367e-03, -9.47380299e-07, 6.60583623e-05,
1.45911651e-04, -7.40314642e-05, -1.74038285e-03,
-1.50619971e-06, -7.84422824e-05, -4.89224426e-05,
-1.88341316e-05, 1.08790256e-05, 9.92542292e-09,
5.00120293e-07, 2.94986335e-07, 1.19099825e-07,
0.00000000e+00]]),
'has_data': array([1, 1, 1])}
Note that the same descriptor can be obtained not via python, but from the unix command line, using the quip
program. The input needs to be an extended XYZ file, and the descriptor specification is identical to the above.
So let’s write out an xyz file:
[31]:
ase.io.write("diamondH.xyz", at)
The following command will print all the descriptor vectors, each vector on a line, which starts with the string DESC
quip atoms_filename=diamondH.xyz descriptor_str="soap cutoff=3 l_max=4 n_max=4 atom_sigma=0.5 n_Z=2 Z={1 6} n_species=2 species_Z={1 6}"
General_monomer example: benzene
Now for a more interesting descriptor: a three-site benzene monomer
[32]:
desc_monomer = descriptors.Descriptor('general_monomer signature={6 6 6} atom_ordercheck=F')
Three intramolecular distances = dimensionality 3:
[33]:
desc_monomer.n_dim
[33]:
3
And a lot more permutations now:
[34]:
desc_monomer.n_perm
[34]:
6
[35]:
desc_monomer.permutations()
[35]:
array([[1, 2, 3],
[2, 1, 3],
[1, 3, 2],
[3, 1, 2],
[2, 3, 1],
[3, 2, 1]], dtype=int32)
Load some test configurations
[36]:
benzat = ase.io.read('benzene_frames.xyz', ':')
[37]:
len(benzat)
[37]:
40
[38]:
benzat[0].get_positions().shape
[38]:
(648, 3)
Calling calc
or any of the other methods on a list of Atoms
objects returns a list of results. Here we look at the results for the first configuration:
[39]:
res = desc_monomer.calc(benzat, grad=True)[0]
[40]:
res['grad_data'].shape
[40]:
(648, 3, 3)