AtomsList and AtomsReader objects for I/O

There are two classes for reading trajectories: AtomsReader and AtomsList. Use an AtomsReader for quick read-only access to a trajectory or if you only want to access some of the frames. If you want to load the entire file into memory and manipulate it, use an AtomsList.

Module contents for quippy.io:

Classes

AtomsReader(source[, format, start, stop, …]) An AtomsReader reads a series of Atoms objects
AtomsList([source, format, start, stop, …]) An AtomsList is just like an AtomsReader except that all frames are read in on initialisation and then stored in memory.

Functions

atoms_reader(source) Decorator to mark a function as a reader for a particular file extension
AtomsWriter(dest[, format]) Returns a file-like object for writing Atoms to dest which should be either a filename or an initialised output object.
read_dataset(dirs, pattern, **kwargs) Read atomic configurations matching glob pattern from each of the directories in dirs in turn.
time_ordered_series(source[, dt]) Given a source of Atoms configurations, return a time ordered list of filename and frame references
read(filename, **readargs) Read Atoms from file filename
write(filename, atoms, **writeargs) Write atoms to the file filename

Attributes

Name Value
AtomsReaders
AtomsWriters

class quippy.io.AtomsReader(source, format=None, start=None, stop=None, step=None, cache_mem_limit=-1, rename=None, **kwargs)[source]

An AtomsReader reads a series of Atoms objects from the trajectory source which should be one of the following:

  • a filename - in this case format is inferred from the file extension – see Supported File Formats
  • a shell-style glob pattern e.g. “*.xyz”
  • a list of filenames or glob patterns e.g. [“foo*.xyz”, “bar*.xyz”]
  • an open file or file-like object (e.g. a CInOutput object)
  • any Python iterator which yields a sequence of Atoms objects

start, stop and step can be used to restrict the range of frames read from source. The first frame in the file has index zero.
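The frame selection can be pictured with ordinary Python tools. The following is only a sketch, not quippy code: plain strings stand in for Atoms objects, select_frames is a hypothetical helper, and it assumes the usual Python convention that stop is exclusive, which the text above does not state explicitly.

```python
from itertools import islice


def select_frames(frames, start=None, stop=None, step=None):
    """Lazily yield frames[start:stop:step] from any iterable.

    Hypothetical helper: nothing is loaded beyond the frames consumed.
    """
    return islice(frames, start, stop, step)


# A generator standing in for a trajectory of ten frames
frames = ("frame%d" % i for i in range(10))
selected = list(select_frames(frames, start=2, stop=8, step=2))
```

Here selected holds frames 2, 4 and 6; frame 0 is the first frame in the source.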

cache_limit determines how many configurations will be stored in memory. If more than cache_limit configurations are read in, the least recently accessed configurations are thrown away. To store everything, use an AtomsList instead.
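To illustrate what discarding the least recently accessed configurations means, here is a minimal sketch of such a cache. It is an assumption-laden illustration rather than the AtomsReader implementation: FrameCache, load_frame and the string frames are all hypothetical stand-ins.

```python
from collections import OrderedDict


class FrameCache(object):
    """Keep at most `limit` frames, dropping the least recently accessed."""

    def __init__(self, limit):
        self.limit = limit
        self._frames = OrderedDict()

    def get(self, index, load):
        if index in self._frames:
            # Cache hit: remove so it can be re-inserted as most recent
            frame = self._frames.pop(index)
        else:
            frame = load(index)
        self._frames[index] = frame
        while len(self._frames) > self.limit:
            self._frames.popitem(last=False)  # evict least recently accessed
        return frame

    def cached_indices(self):
        return list(self._frames)


loads = []


def load_frame(i):
    """Hypothetical loader that records which frames were really read."""
    loads.append(i)
    return "frame%d" % i


cache = FrameCache(limit=2)
cache.get(0, load_frame)
cache.get(1, load_frame)
cache.get(0, load_frame)  # hit: frame 0 becomes the most recent
cache.get(2, load_frame)  # miss: frame 1 is evicted
```

After these calls only frames 0 and 2 remain cached, and each frame was loaded once.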

Some sources understand additional keyword arguments from **kwargs. For example the CASTEP file reader can take an atoms_ref argument which is a reference Atoms object which is used to fill in information which is missing from the input file.

All AtomsReaders support iteration, so you can loop over the contents using a for loop:

from quippy.io import AtomsReader

al = AtomsReader('input-file.xyz')
for at in al:
   # process Atoms object `at`
   print(at.energy)

or using list comprehension:

print([at.energy for at in al])

In addition to iteration, some sources allow random access. To find out if an AtomsReader supports random access, either try to get its length with len(), or check if the random_access property is true. If cache_limit is large enough to store all the frames in the file, all AtomsReaders will allow random access once the entire trajectory has been loaded.

If random_access is true, you can access individual frames by indexing and slicing, e.g. al[i] is the ith Atoms object within al and al[i:j] returns objects from i up to but not including j. Like ordinary Python lists, indices start from 0 and run up to len(al)-1.

Attributes

random_access Read only property: True if this source supports random access, False if it does not

Methods

close() Close any open files associated with this AtomsReader
filter(at) Apply read-time filters to at
iterframes([reverse]) Return an iterator over all the frames in this trajectory.

close()[source]

Close any open files associated with this AtomsReader

filter(at)[source]

Apply read-time filters to at

iterframes(reverse=False)[source]

Return an iterator over all the frames in this trajectory. This is the default iterator for an AtomsReader instance al, and can be accessed with iter(al).

If reverse=True then the iteration starts with the last frame and goes backwards through the file. This is only possible if random_access is true.

random_access

Read only property: True if this source supports random access, False if it does not

class quippy.io.AtomsList(source=[], format=None, start=None, stop=None, step=None, rename=None, **kwargs)[source]

An AtomsList is just like an AtomsReader except that all frames are read in on initialisation and then stored in memory. This is equivalent to an AtomsReader with a cache_limit of None so an AtomsList always supports random access.

The AtomsList allows configurations to be added, removed or reordered using the standard Python methods for mutable sequence types (e.g. append(), extend(), index(), etc).

The attributes of the component Atoms can be accessed as a single array, using the frame number as the first array index. Note that the first index runs from 0 to len(al)-1, unlike the other indices which are one-based since the Atoms attributes are stored in a FortranArray.

For example the following statements are all true:

al.energy      ==  [at.energy for at in al] # energies of all frames
al.energy[0]   ==  al[0].energy             # energy of first frame
all(al.velo[0] ==  al[0].velo)              # velocities of all atoms in first frame
al.velo[0,-1]  ==  al[0].velo[-1]           # velocity of last atom in first frame

In addition to the standard Python list methods and those of AtomsReader, AtomsList defines a couple of extra methods.

sort(cmp=None, key=None, reverse=False, attr=None)[source]

Sort the AtomsList in place. This is the same as the standard list.sort() method, except for the additional attr argument. If this is present then the sorted list will be ordered by the Atoms attribute attr, e.g.:

al.sort(attr='energy')

will order the configurations by their energy (assuming that Atoms.params contains an entry named energy for each configuration; otherwise an AttributeError will be raised).
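The effect of the attr argument can be mimicked with the standard library. This is a sketch under stated assumptions, not the AtomsList code: Frame is a hypothetical stand-in for an Atoms object and sort_by_attr is a hypothetical helper built on list.sort() and operator.attrgetter.

```python
from operator import attrgetter


class Frame(object):
    """Hypothetical stand-in for an Atoms object with an energy."""

    def __init__(self, name, energy):
        self.name = name
        self.energy = energy


def sort_by_attr(frames, attr, reverse=False):
    """Sort `frames` in place by the named attribute.

    Raises AttributeError if any frame lacks the attribute.
    """
    frames.sort(key=attrgetter(attr), reverse=reverse)


frames = [Frame("a", -1.0), Frame("b", -3.5), Frame("c", -2.0)]
sort_by_attr(frames, "energy")
names = [f.name for f in frames]  # lowest energy first
```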

quippy.io.atoms_reader(source)[source]

Decorator to mark a function as a reader for a particular file extension

quippy.io.AtomsWriter(dest, format=None, **kwargs)[source]

Returns a file-like object for writing Atoms to dest which should be either a filename or an initialised output object. If format is not given it is inferred from the file extension of dest. Example usage:

out = AtomsWriter('out_file.xyz')
for at in seq:
   out.write(at)
out.close()
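
Any object with a close() method can be wrapped in contextlib.closing so that it is closed even if writing raises an exception. The sketch below uses a hypothetical stand-in class, ListWriter, with plain strings as frames; whether AtomsWriter can be used directly in a with statement is not stated here, so this only shows the general pattern.

```python
from contextlib import closing


class ListWriter(object):
    """Hypothetical writer with the same write()/close() shape."""

    def __init__(self):
        self.frames = []
        self.closed = False

    def write(self, at):
        self.frames.append(at)

    def close(self):
        self.closed = True


out = ListWriter()
with closing(out) as writer:
    for at in ["frame0", "frame1"]:
        writer.write(at)
# close() has been called on leaving the with block
```
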

quippy.io.read_dataset(dirs, pattern, **kwargs)[source]

Read atomic configurations matching glob pattern from each of the directories in dirs in turn. All kwargs are passed along to the AtomsList constructor.

Returns a dictionary mapping directories to AtomsList instances.
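
The shape of the returned dictionary can be sketched with glob alone. This is an illustration only: find_dataset_files is a hypothetical helper that maps each directory to a list of matching filenames, where read_dataset would instead build an AtomsList from them. The temporary directories and file names are made up for the demonstration.

```python
import glob
import os
import shutil
import tempfile


def find_dataset_files(dirs, pattern):
    """Map each directory to the sorted files in it matching `pattern`."""
    return dict((d, sorted(glob.glob(os.path.join(d, pattern))))
                for d in dirs)


# Build two throwaway directories, each with two .xyz files and one other
root = tempfile.mkdtemp()
try:
    dirs = []
    for name in ("run1", "run2"):
        d = os.path.join(root, name)
        os.mkdir(d)
        dirs.append(d)
        for fname in ("a.xyz", "b.xyz", "notes.txt"):
            open(os.path.join(d, fname), "w").close()
    dataset = find_dataset_files(dirs, "*.xyz")
    counts = [len(dataset[d]) for d in dirs]
finally:
    shutil.rmtree(root)
```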

quippy.io.time_ordered_series(source, dt=None)[source]

Given a source of Atoms configurations, return a time ordered list of filename and frame references

quippy.io.read(filename, **readargs)[source]

Read Atoms from file filename

File format is inferred from file extension, see Supported File Formats.

quippy.io.write(filename, atoms, **writeargs)[source]

Write atoms to the file filename

File format is inferred from file extension, see Supported File Formats.

Supported File Formats

The AtomsReaders and AtomsWriters dictionaries are used by the Atoms constructor, Atoms.write(), the AtomsList constructor and AtomsList.write() to work out how to read or write a particular file type, based on the filename extension.

The quippy native formats are Extended XYZ and NetCDF.

The standard implementation of both these formats is in C in the files xyz.c and netcdf.c in the libAtoms package. It is this version which is wrapped by the CInOutput class, which is used by AtomsReader and AtomsList when reading from or writing to XYZ or NetCDF files.

quippy.io.AtomsReaders

Supported file formats for reading Atoms objects from files.

File extension Description
castep or castep_log CASTEP output
cell CASTEP cell files
chkpt IMD checkpoint
cp2k_output CP2K I/O and driver
cube Gaussian CUBE
geom CASTEP geometry
md CASTEP MD file
nc NetCDF
pos ASAP file format
POSCAR or CONTCAR VASP coordinates
OUTCAR VASP output
stdin Read from stdin in Extended XYZ format
string Read from string in Extended XYZ format
xyz Extended XYZ

quippy.io.AtomsWriters

Supported file formats for writing Atoms objects to files.

File extension Description
cell CASTEP cell file
cube Gaussian CUBE
dan DAN visualisation code
eps, jpg, png Images (via the AtomEye Image Writer)
nc NetCDF
pos ASAP file format
pov POV-ray
POSCAR VASP coordinates
-, stdout Write to stdout in Extended XYZ format
string Write to string in Extended XYZ format
xyz Extended XYZ

Extended XYZ

Extended XYZ format is an enhanced version of the basic XYZ format that allows extra columns to be present in the file for additional per-atom properties as well as standardising the format of the comment line to include the cell lattice and other per-frame parameters.

It’s easiest to describe the format with an example. Here is a standard XYZ file containing a bulk cubic 8 atom silicon cell

8
Cubic bulk silicon cell
Si        0.00000000      0.00000000      0.00000000
Si        1.36000000      1.36000000      1.36000000
Si        2.72000000      2.72000000      0.00000000
Si        4.08000000      4.08000000      1.36000000
Si        2.72000000      0.00000000      2.72000000
Si        4.08000000      1.36000000      4.08000000
Si        0.00000000      2.72000000      2.72000000
Si        1.36000000      4.08000000      4.08000000

The first line is the number of atoms, followed by a comment and then one line per atom, giving the element symbol and Cartesian x, y and z coordinates in Angstroms.

Here’s the same configuration in extended XYZ format

8
Lattice="5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44" Properties=species:S:1:pos:R:3 Time=0.0
Si        0.00000000      0.00000000      0.00000000
Si        1.36000000      1.36000000      1.36000000
Si        2.72000000      2.72000000      0.00000000
Si        4.08000000      4.08000000      1.36000000
Si        2.72000000      0.00000000      2.72000000
Si        4.08000000      1.36000000      4.08000000
Si        0.00000000      2.72000000      2.72000000
Si        1.36000000      4.08000000      4.08000000

In extended XYZ format, the comment line is replaced by a series of key/value pairs. The keys should be strings and values can be integers, reals, logicals (denoted by T and F for true and false) or strings. Quotes are required if a value contains any spaces (like Lattice above). There are two mandatory parameters that any extended XYZ file must contain: Lattice and Properties. Other parameters (e.g. Time in the example above) can be added to the parameter line as needed.

Lattice is a Cartesian 3x3 matrix representation of the cell lattice vectors, with each vector stored as a column and the 9 values listed in Fortran column-major order, i.e. in the form

Lattice="R1x R1y R1z R2x R2y R2z R3x R3y R3z"

where R1x R1y R1z are the Cartesian x-, y- and z-components of the first lattice vector (\(\mathbf{a}\)), R2x R2y R2z those of the second lattice vector (\(\mathbf{b}\)) and R3x R3y R3z those of the third lattice vector (\(\mathbf{c}\)).
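
The key/value layout and the ordering of the Lattice values can be made concrete with a few lines of standard-library Python. This is a minimal sketch, not the parser used by quippy: parse_comment_line and parse_lattice are hypothetical helpers, and all values are left as raw strings rather than being converted to integers, reals or logicals.

```python
import shlex


def parse_comment_line(line):
    """Split an extended XYZ comment line into a dict of raw strings.

    shlex handles the quoting, so Lattice="..." stays one token.
    """
    params = {}
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        params[key] = value
    return params


def parse_lattice(value):
    """Return the three lattice vectors a, b, c as lists of floats."""
    numbers = [float(x) for x in value.split()]
    if len(numbers) != 9:
        raise ValueError("Lattice needs 9 values, got %d" % len(numbers))
    # Values are listed vector by vector: R1x R1y R1z R2x ... R3z
    return [numbers[0:3], numbers[3:6], numbers[6:9]]


line = ('Lattice="5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44" '
        'Properties=species:S:1:pos:R:3 Time=0.0')
params = parse_comment_line(line)
a, b, c = parse_lattice(params["Lattice"])
```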

The list of properties in the file is described by the Properties parameter, which should take the form of a series of colon separated triplets giving the name, format (R for real, I for integer, S for string, L for logical) and number of columns of each property. For example:

Properties="species:S:1:pos:R:3:vel:R:3:select:I:1"

indicates the first column represents atomic species, the next three columns represent atomic positions, the next three velocities, and the last is a single integer called select. With this property definition, the line

Si        4.08000000      4.08000000      1.36000000   0.00000000      0.00000000      0.00000000       1

would describe a silicon atom at position (4.08,4.08,1.36) with zero velocity and the select property set to 1.
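
The way a Properties string drives the interpretation of each atom line can be sketched as follows. The helpers parse_properties and parse_atom_line are hypothetical and cover only the R, I and S formats used in the example; this is an illustration, not the reader shipped with quippy.

```python
def parse_properties(spec):
    """Turn 'name:format:ncols:...' into a list of (name, format, ncols)."""
    fields = spec.split(":")
    if len(fields) % 3 != 0:
        raise ValueError("Properties must be a series of triplets")
    return [(fields[i], fields[i + 1], int(fields[i + 2]))
            for i in range(0, len(fields), 3)]


# Converters for the formats handled in this sketch
CONVERTERS = {"R": float, "I": int, "S": str}


def parse_atom_line(line, properties):
    """Map the whitespace-separated columns of one line onto properties."""
    columns = line.split()
    atom = {}
    pos = 0
    for name, fmt, ncols in properties:
        values = [CONVERTERS[fmt](x) for x in columns[pos:pos + ncols]]
        # Single-column properties are stored as scalars
        atom[name] = values[0] if ncols == 1 else values
        pos += ncols
    return atom


props = parse_properties("species:S:1:pos:R:3:vel:R:3:select:I:1")
atom = parse_atom_line(
    "Si  4.08000000  4.08000000  1.36000000  "
    "0.00000000  0.00000000  0.00000000  1", props)
```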

The extended XYZ format is now also supported by the ase.io.read() and ase.io.write() functions in the Atomic Simulation Environment (ASE) toolkit, and by the Ovito visualisation tool (from v2.4 beta onwards).

NetCDF

We use the NetCDF file format, a flexible binary file format designed for scientific array data.

We use a superset of the AMBER conventions, so that our trajectory file can be read directly by VMD. An important distinction from the Extended XYZ format is the names of some properties:

Extended XYZ name NetCDF name
pos coordinates
velo velocities

This mapping is handled automatically, but if you access the data directly you’ll need to be aware of it.
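
When reading the NetCDF variables directly, the renaming in the table can be kept in a small lookup. The names XYZ_TO_NETCDF, NETCDF_TO_XYZ and netcdf_name below are hypothetical and are not part of quippy; the sketch only encodes the two rows listed above and passes any other name through unchanged.

```python
# The two renamed properties from the table above
XYZ_TO_NETCDF = {"pos": "coordinates", "velo": "velocities"}
NETCDF_TO_XYZ = dict((v, k) for k, v in XYZ_TO_NETCDF.items())


def netcdf_name(xyz_name):
    """Return the NetCDF variable name for an Extended XYZ property."""
    return XYZ_TO_NETCDF.get(xyz_name, xyz_name)
```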

NetCDF versions 3 and 4 are supported. If version 4 is used then it’s possible to use zlib compression, which greatly reduces the file size.

NetCDF Convention

All data is either per-atom (referred to as a property) or per-frame (referred to as a parameter).

Dimensions

  • frame - number of frames (this is the unlimited dimension)
  • spatial - number of spatial dimensions (i.e. 3)
  • atom - number of atoms
  • cell_spatial - number of cell lengths (i.e. 3)
  • cell_angular - number of cell angles (i.e. 3)
  • label - length of string properties (per-atom character data, e.g. species, value is 10)
  • string - length of string parameters (per-frame string data, value is 1024)

Variables

Global variables

  • spatial (spatial) - character, equal to (‘x’,’y’,’z’)
  • cell_spatial (cell_spatial) - character, equal to (‘a’,’b’,’c’)
  • cell_angular (cell_angular, label) - character, equal to (‘alpha’, ‘beta’, ‘gamma’)

Parameters (per-frame variables)

  • cell_lengths (frame, cell_spatial) - double, cell lengths in Angstrom
  • cell_angles (frame, cell_angular) - double, cell angles in degrees
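
Since the NetCDF files store the cell as lengths and angles while Extended XYZ stores lattice vectors, it can help to see how the two are related. The sketch below is a hypothetical helper, not quippy code, and assumes the conventional crystallographic definitions: alpha is the angle between b and c, beta between a and c, and gamma between a and b.

```python
import math


def cell_lengths_and_angles(a, b, c):
    """Return ((|a|, |b|, |c|), (alpha, beta, gamma)) with angles in degrees."""

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def angle(u, v):
        cos = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        # Clamp to guard against rounding just outside [-1, 1]
        return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

    lengths = (norm(a), norm(b), norm(c))
    angles = (angle(b, c), angle(a, c), angle(a, b))
    return lengths, angles


# The cubic silicon cell from the Extended XYZ example
lengths, angles = cell_lengths_and_angles(
    [5.44, 0.0, 0.0], [0.0, 5.44, 0.0], [0.0, 0.0, 5.44])
```

For the cubic cell this gives three lengths of 5.44 Angstrom and three angles of 90 degrees.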

Other parameters can be of type double, integer, logical or string. Integer, logical and double types can be vectors or scalars (i.e. dimension (frame) or (frame,spatial)), but string parameters must be scalar (i.e. dimension (frame,string)). Additionally, real and integer 3x3 matrices with dimension (frame,spatial,spatial) are supported (e.g. for the virial tensor). In order to distinguish between integer and logical variables, a special type attribute should be added to the variable, set to one of the following values:

T_INTEGER = 1
T_REAL = 2
T_LOGICAL = 4
T_INTEGER_A = 5
T_REAL_A = 6
T_LOGICAL_A = 8
T_CHAR = 9
T_INTEGER_A2 = 12
T_REAL_A2 = 13

Properties (per-atom variables)

Properties can be of type integer, real, string or logical. As for parameters, integer, real and logical properties can be scalar (dimension (frame,atom)) or vector (dimension (frame,atom,spatial)), but string properties must be of dimension (frame,atom,label). Again a type attribute must be added to the NetCDF variable, with one of the following values:

PROPERTY_INT     = 1
PROPERTY_REAL    = 2
PROPERTY_STR     = 3
PROPERTY_LOGICAL = 4

See the quippy.netcdf module for a reference implementation of the NetCDF reading and writing routines, in pure Python.

CInOutput objects

Module contents for quippy.cinoutput:

Classes

CInOutputReader(source[, frame, range, …]) Class to read atoms from a CInOutput.
CInOutputWriter(dest[, append, netcdf4, …]) Class to write atoms sequentially to a CInOutput stream

Functions

quip_getcwd() Returns: ret_quip_getcwd (Extendable_str object)
quip_dirname(*args, **kwargs) Routine is a wrapper around the Fortran interface quip_dirname, which contains multiple routines
quip_chdir(*args, **kwargs) Routine is a wrapper around the Fortran interface quip_chdir, which contains multiple routines
quip_basename(*args, **kwargs) Routine is a wrapper around the Fortran interface quip_basename, which contains multiple routines

Attributes

Name Value
NETCDF_FORMAT 2
LATTICE_TOL 1e-08
XYZ_FORMAT 1

class quippy.cinoutput.CInOutputReader(source, frame=None, range=None, start=0, stop=None, step=1, no_compute_index=False, zero=False, one_frame_per_file=False, indices=None, string=False, format=None)[source]

Bases: object

Class to read atoms from a CInOutput. Supports generator and random access via indexing.

class quippy.cinoutput.CInOutputWriter(dest, append=False, netcdf4=False, one_frame_per_file=False, string=False, **write_kwargs)[source]

Bases: object

Class to write atoms sequentially to a CInOutput stream

quippy.cinoutput.quip_getcwd()
Returns: ret_quip_getcwd (Extendable_str object)

References

Routine is a wrapper around the Fortran routine quip_getcwd defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_dirname(*args, **kwargs)

Routine is a wrapper around the Fortran interface quip_dirname, which contains multiple routines:

quippy.cinoutput.quip_dirname(path)
Parameters: path (Extendable_str object)
Returns: ret_quip_dirname_extendable_str (Extendable_str object)

Routine is a wrapper around the Fortran routine quip_dirname_extendable_str defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_dirname(path)
Parameters: path (input string(len=-1))
Returns: ret_quip_dirname_char (Extendable_str object)

Routine is a wrapper around the Fortran routine quip_dirname_char defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_chdir(*args, **kwargs)

Routine is a wrapper around the Fortran interface quip_chdir, which contains multiple routines:

quippy.cinoutput.quip_chdir(path)
Parameters: path (Extendable_str object)

Routine is a wrapper around the Fortran routine quip_chdir_extendable_str defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_chdir(path)
Parameters: path (input string(len=-1))

Routine is a wrapper around the Fortran routine quip_chdir_char defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_basename(*args, **kwargs)

Routine is a wrapper around the Fortran interface quip_basename, which contains multiple routines:

quippy.cinoutput.quip_basename(path)
Parameters: path (Extendable_str object)
Returns: ret_quip_basename_extendable_str (Extendable_str object)

Routine is a wrapper around the Fortran routine quip_basename_extendable_str defined in file src/libAtoms/CInOutput.f95.

quippy.cinoutput.quip_basename(path)
Parameters: path (input string(len=-1))
Returns: ret_quip_basename_char (Extendable_str object)

Routine is a wrapper around the Fortran routine quip_basename_char defined in file src/libAtoms/CInOutput.f95.

ASAP file format

This module contains utility functions for use with the ASAP code, which is developed by Paul Tangney and Sandro Scandolo.

Module contents for quippy.asap:

AtomEye Image Writer

Module contents for quippy.atomeyewriter:

Classes

AtomEyeWriter(image[, width, height, …]) Write atoms to image file (png/eps/jpg) using AtomEye

class quippy.atomeyewriter.AtomEyeWriter(image, width=None, height=None, aspect=0.75, shift=True, commands=None, script=None, nowindow=True, number_frames=False)[source]

Write atoms to image file (png/eps/jpg) using AtomEye

CASTEP

Module contents for quippy.castep:

Classes

CastepCell([cellfile, xml, atoms]) Class to wrap a CASTEP cell (.cell) file
CastepParam([paramfile, xml, atoms]) Class to wrap a CASTEP parameter (.param) file

Functions

check_pspots(cluster, cell, param, orig_dir) Check pseudopotential files are present, and that we have one for each element present in cluster.
run_castep(cell, param, stem, castep[, …]) Invoke castep and return True if it completed successfully
read_formatted_potential(filename) Load a potential written by CASTEP pot_write_formatted() routine, and convert to a 3-dimensional FortranArray suitable for writing to a .cube file.
read_formatted_density(filename) Load a potential written by CASTEP pot_write_formatted() routine, and convert to a 3-dimensional FortranArray suitable for writing to a .cube file.

class quippy.castep.CastepCell(cellfile=None, xml=None, atoms=None)[source]

Class to wrap a CASTEP cell (.cell) file

Methods

read(cellfile) Read a CASTEP .cell file.
write([cellfile]) Write CASTEP .cell file.

read(cellfile)[source]

Read a CASTEP .cell file. cellfile can be a mapping type, filename or an open file

write(cellfile=<open file ‘<stdout>’, mode ‘w’>)[source]

Write CASTEP .cell file. cellfile can be a filename or an open file

class quippy.castep.CastepParam(paramfile=None, xml=None, atoms=None)[source]

Class to wrap a CASTEP parameter (.param) file

Methods

read(paramfile) Read a CASTEP .param file.
read_from_castep_output(castep_output) Read user parameters from .castep output.
write([paramfile]) Write CASTEP .param file

read(paramfile)[source]

Read a CASTEP .param file. paramfile can be a filename or an open file

read_from_castep_output(castep_output)[source]

Read user parameters from .castep output. Input should be filename, file-like object or list of lines

write(paramfile=<open file ‘<stdout>’, mode ‘w’>)[source]

Write CASTEP .param file

quippy.castep.check_pspots(cluster, cell, param, orig_dir)[source]

Check pseudopotential files are present, and that we have one for each element present in cluster. Also tries to check spin polarisation of system matches that specified in parameters.

quippy.castep.run_castep(cell, param, stem, castep, castep_log=None, save_all_check_files=False, save_all_input_files=False, test_mode=False, copy_in_files=None, subdir=None)[source]

Invoke castep and return True if it completed successfully

quippy.castep.read_formatted_potential(filename)[source]

Load a potential written by CASTEP pot_write_formatted() routine, and convert to a 3-dimensional FortranArray suitable for writing to a .cube file.

quippy.castep.read_formatted_density(filename)[source]

Load a potential written by CASTEP pot_write_formatted() routine, and convert to a 3-dimensional FortranArray suitable for writing to a .cube file.

Gaussian CUBE

Module contents for quippy.cube:

DAN visualisation code

Module contents for quippy.dan:

IMD checkpoint

Module contents for quippy.imd:

POV-ray

Module contents for quippy.povray:

VASP

Module contents for quippy.vasp:

Classes

Atoms([symbols, positions, numbers, tags, …]) Representation of an atomic configuration and its associated properties
VASPWriter(out[, species_list]) Writer for VASP POSCAR format

Functions

ASEReader(source[, format]) Helper routine to load from ASE trajectories
VASP_POSCAR_Reader(outcar[, species, format]) Read a configuration from a VASP OUTCAR file.
atoms_reader(source) Decorator to mark a function as a reader for a particular file extension
frange(min[, max, step]) Fortran equivalent of range() builtin.
fzeros(shape[, dtype]) Create an empty FortranArray with Fortran ordering.

Attributes

Name Value
AtomsReaders
AtomsWriters
class quippy.vasp.VASPWriter(out, species_list=None)[source]

Writer for VASP POSCAR format

quippy.vasp.VASP_POSCAR_Reader(outcar, species=None, format=None)[source]

Read a configuration from a VASP OUTCAR file.

ASE supported file types

Since quippy.atoms.Atoms is a subclass of the ASE ase.atoms.Atoms class, all of the ASE I/O formats can also be used with quippy: see ase.io.read() for a list of supported formats. To convert from ase.atoms.Atoms to quippy.atoms.Atoms, simply pass the ASE Atoms object to the quippy Atoms constructor, e.g.:

from quippy.atoms import Atoms as QuippyAtoms
from ase.atoms import Atoms as ASEAtoms
from ase.io import read

ase_atoms = read(filename)
quippy_atoms = QuippyAtoms(ase_atoms)

Similarly, to use one of the quippy file formats with other ASE tools:

from quippy.io import read
from ase.atoms import Atoms as ASEAtoms

quippy_atoms = read(filename)
ase_atoms = ASEAtoms(quippy_atoms)

Adding a new file type

To add support for a new file format, implement routines which read from or write to files following the templates below.

def sample_reader(filename):
    # insert code to open `filename` for reading

    while True:
       # determine if more frames are available
       if more_frames:
          # read next frame from `filename` into new Atoms object
          at = Atoms()
          yield at
       else:
          break


class sample_writer(object):
    def __init__(self, filename):
       # insert code to open `filename` for writing
       pass

    def write(self, at):
       # insert code to write `at` to `filename`
       pass

    def close(self):
       # insert code to close `filename`
       pass

sample_reader() is a generator which yields a succession of Atoms objects, raising StopIteration when there are no more available. For a file format which only permits one configuration per file, a simplified implementation template would be:

def sample_reader(filename):
   # insert code to open `filename` for reading
   # insert code to read from `filename` into new Atoms object
   yield at
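
As a concrete instance of the generator template, here is a sketch of a reader for the basic (not extended) XYZ layout shown earlier. It is illustrative only: read_plain_xyz is a hypothetical function, it takes an open file-like object rather than a filename, and it yields plain tuples instead of Atoms objects so that it runs without quippy.

```python
import io


def read_plain_xyz(fileobj):
    """Yield (comment, symbols, positions) for each frame in the stream."""
    while True:
        header = fileobj.readline()
        if not header.strip():
            break  # end of file, or a trailing blank line
        natoms = int(header)
        comment = fileobj.readline().rstrip("\n")
        symbols, positions = [], []
        for _ in range(natoms):
            fields = fileobj.readline().split()
            symbols.append(fields[0])
            positions.append([float(x) for x in fields[1:4]])
        yield comment, symbols, positions


# Two frames held in memory instead of on disk
text = (u"2\nfirst\nSi 0.0 0.0 0.0\nSi 1.36 1.36 1.36\n"
        u"1\nsecond\nSi 2.72 2.72 0.0\n")
frames = list(read_plain_xyz(io.StringIO(text)))
```

In a real reader each tuple would be replaced by a populated Atoms object before it is yielded.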

To register the new file format, you just need to set entries in AtomsReaders and AtomsWriters:

from quippy import AtomsReaders, AtomsWriters
AtomsReaders['new_format'] = sample_reader
AtomsWriters['new_format'] = sample_writer

For the case of reading, there is a decorator atoms_reader() which can be used to simplify the registration process:

@atoms_reader('new_format')
def sample_reader(filename):
   ...
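
The registration pattern itself is easy to reproduce in isolation. The sketch below is not the quippy implementation: READERS is a hypothetical dictionary standing in for AtomsReaders, register_reader stands in for atoms_reader(), and read_all is a made-up dispatcher that picks a reader from the filename extension.

```python
READERS = {}  # hypothetical registry, standing in for AtomsReaders


def register_reader(extension):
    """Decorator factory that records a reader under `extension`."""
    def decorate(func):
        READERS[extension] = func
        return func  # the function itself is left unchanged
    return decorate


@register_reader("new_format")
def sample_reader(filename):
    # A real reader would open `filename` and yield Atoms objects
    yield "frame from %s" % filename


def read_all(filename):
    """Look up the reader by file extension and collect its frames."""
    extension = filename.rsplit(".", 1)[-1]
    return list(READERS[extension](filename))


frames = read_all("traj.new_format")
```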

See the code in quippy.xyz, quippy.netcdf and quippy.castep for full examples.