chemlab.db

AbstractDB

class chemlab.db.base.AbstractDB

Interface for a generic database.

A typical database can be used to retrieve molecules by calling the get method:

water = db.get("molecule", "example.water")

A database can also provide custom functionality to store or search for entries, implemented as additional methods.

See the concrete implementations for more relevant examples.

get(feature, key, *args, **kwargs)

Get a data entry from the database.

Subclasses are required to implement this method to provide access to the database.

Parameters

  • feature: str

    An identifier that represents the kind of data that we want to extract. Examples of such identifiers are “system”, “molecule” and “data”.

  • key: str

    The key associated with the database entry. By convention you can use dotted names to provide some kind of nested structure.

  • args, kwargs:

    Custom extra arguments.
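
As a minimal sketch of what a subclass might look like, the following in-memory database resolves dotted keys against nested dictionaries. The AbstractDB class here is a local stand-in so that the snippet runs without chemlab installed; DictDB and its nested-dict layout are hypothetical and not part of chemlab.

```python
class AbstractDB(object):
    """Local stand-in for chemlab.db.base.AbstractDB (assumption: only get is required)."""

    def get(self, feature, key, *args, **kwargs):
        raise NotImplementedError()


class DictDB(AbstractDB):
    """Hypothetical in-memory database keyed by feature and dotted key."""

    def __init__(self, entries):
        # entries maps feature -> nested dicts,
        # e.g. {"molecule": {"example": {"water": ...}}}
        self.entries = entries

    def get(self, feature, key, *args, **kwargs):
        if feature not in self.entries:
            raise KeyError("Unknown feature: %r" % feature)
        node = self.entries[feature]
        # Walk the nested structure one dotted component at a time.
        for part in key.split("."):
            node = node[part]
        return node


# A plain string stands in for a real Molecule object.
db = DictDB({"molecule": {"example": {"water": "H2O"}}})
water = db.get("molecule", "example.water")
```

The dotted key is only a naming convention, so a subclass is free to map it onto directories, dictionary levels or anything else.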

ChemlabDB

class chemlab.db.ChemlabDB

Chemlab default database.

This database contains some example molecules and some atomic data.

get(self, 'molecule', key)

Retrieve a molecule from the database. The included molecule keys are:

  • example.water
  • example.norbornene
  • gromacs.spc
  • gromacs.spce
  • gromacs.na+
  • gromacs.cl-

get(self, 'data', key)

Retrieve atomic data. The available data is:

  • symbols: Atomic symbols in a list.
  • vdwdict: Dictionary with per-element van der Waals radii.
  • massdict: Dictionary of masses.
  • paulingenegdict: Dictionary with per-element Pauling electronegativity.
  • arenegdict: Dictionary with per-element Allred-Rochow electronegativity.
  • maxbonddict: Dictionary of maximum bond valences (6 if unknown).
  • ionpotdict: Dictionary of ionisation potentials in eV.
  • eaffdict: Dictionary of electron affinities in eV.

Data was taken from the OpenBabel distribution.
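
The data entries listed above can be combined with ordinary dictionary lookups. The helper below is a small sketch, not part of chemlab: element_summary is a hypothetical name, and it assumes the dictionaries are keyed by element symbol (such as "O"). It accepts any object with a compatible get method, so it can be tried without chemlab installed.

```python
def element_summary(db, symbol):
    """Collect a few per-element values from a database exposing get('data', ...).

    Assumption: massdict and vdwdict are keyed by element symbol.
    """
    masses = db.get("data", "massdict")
    vdw_radii = db.get("data", "vdwdict")
    return {
        "symbol": symbol,
        "mass": masses[symbol],
        "vdw_radius": vdw_radii[symbol],
    }


# With chemlab installed, the helper would be called as:
#     from chemlab.db import ChemlabDB
#     element_summary(ChemlabDB(), "O")
```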

ChemSpiderDB

class chemlab.db.ChemSpiderDB(token=None)

Retrieve data from the online ChemSpider database by passing a string identifier.

Parameters

token: str | None

The ChemSpider security token. When token is None, chemlab tries to read the token from the configuration file $HOME/.chemlabrc, which must contain the entry:

[chemspider]
token=YOUR-SECURITY-TOKEN
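
For illustration, a file in this format can be read with the standard library's configparser. This is a sketch of one possible lookup, not chemlab's actual implementation; read_chemspider_token is a hypothetical helper name.

```python
import configparser
import os


def read_chemspider_token(path=None):
    """Return the token from the [chemspider] section, or None if it is absent."""
    if path is None:
        # Default location described above: $HOME/.chemlabrc
        path = os.path.join(os.path.expanduser("~"), ".chemlabrc")
    # Disable interpolation so a '%' in the token is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # files that do not exist are silently skipped
    return parser.get("chemspider", "token", fallback=None)
```

Returning None for a missing file or section lets the caller decide whether to raise an error or ask the user for a token.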

The get method requires a key argument to retrieve a database entry. A valid key can be, for instance, the common name of a chemical, a SMILES string or an InChI identifier. This class is just an adapter around the chemspipy library.

get(self, 'molecule', key)

Retrieve a molecule 3D structure. Returns a Molecule instance.

get(self, 'inchi', key)

Retrieve the InChI string for the compound.

get(self, 'molecularformula', key)

Retrieve the molecular formula as a LaTeX string.

get(self, 'imageurl', key)

Retrieve the URL of a 2D image representation of the compound.

get(self, 'smiles', key)

Retrieve the SMILES string for the compound.

get(self, 'averagemass', key)

Retrieve the average mass.

get(self, 'nominalmass', key)

Retrieve the nominal mass.

get(self, 'inchikey', key)

Retrieve the InChIKey.

get(self, 'alogp', key)

Predicted LogP (partition coefficient) using the ACD LogP algorithm.

get(self, 'xlogp', key)

Predicted LogP using the XLogP algorithm.

get(self, 'image', key)

PNG image of the compound as a data string.

get(self, 'mol2d', key)

MDL MOL file containing 2D coordinates of the compound.

get(self, 'commonname', key)

Retrieve the common name of the compound.

CirDB

class chemlab.db.CirDB

Get 3D structure of arbitrary molecules given a string identifier.

get(self, 'molecule', key)

Retrieve a molecule from the online CIR database by passing an identifier.

A key can be, for instance, the common name of a chemical, a SMILES string or an InChI identifier. This class is just an adapter around the CirPy library.

Returns a Molecule instance.

LocalDB

class chemlab.db.LocalDB(directory)

Store serialized molecules and systems in a directory tree.

See “Having your own molecular database” for a usage example.

directory

Directory where the database is located.

get(self, 'molecule', key)

Get an entry from the database. Key is the filename without extension of the serialized molecule. Molecules are stored in the subdirectory.

get(self, 'system', key)

Get an entry from the database. Key is the filename without extension of the serialized system.

store(self, 'molecule', key, value)
store(self, 'system', key, value)

Store a Molecule or a System passed as value in the directory structure. The objects are serialized to JSON and then written to disk.
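
The store and get pattern described above can be sketched with the standard library alone. The class below is a toy stand-in, not chemlab's LocalDB: JsonDirDB is a hypothetical name, the layout directory/feature/key.json is an assumption, and plain dictionaries take the place of Molecule and System objects.

```python
import json
import os
import tempfile


class JsonDirDB(object):
    """Hypothetical store: <directory>/<feature>/<key>.json, values are plain dicts."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, feature, key):
        return os.path.join(self.directory, feature, key + ".json")

    def store(self, feature, key, value):
        path = self._path(feature, key)
        # Create the per-feature subdirectory on first use.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            json.dump(value, handle)

    def get(self, feature, key):
        with open(self._path(feature, key)) as handle:
            return json.load(handle)


root = tempfile.mkdtemp()
db = JsonDirDB(root)
db.store("molecule", "water", {"type_array": ["O", "H", "H"]})
loaded = db.get("molecule", "water")
```

Using the filename without extension as the key keeps the on-disk layout easy to inspect and edit by hand.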

RcsbDB

class chemlab.db.RcsbDB

Access to the RCSB database for proteins.

To download a protein, pass its PDB ID, which you can look up on the RCSB website:

from chemlab.db import RcsbDB
mol = RcsbDB().get('molecule', '3ZJE')

get(self, 'molecule', key)

The key is the four-character alphanumeric PDB ID, which you can get from the RCSB website.
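
Because the key has a fixed shape, it can be checked before any network request is made. The helper below is a sketch and not part of chemlab: normalize_pdb_id is a hypothetical name, the check covers only the "four alphanumeric characters" rule stated above, and returning the ID in upper case is an assumption.

```python
import re

# Exactly four letters or digits, per the description of the key above.
_PDB_ID = re.compile(r"[0-9A-Za-z]{4}")


def normalize_pdb_id(key):
    """Return the ID in upper case, or raise ValueError if it is malformed."""
    key = key.strip()
    if not _PDB_ID.fullmatch(key):
        raise ValueError("Not a 4-character alphanumeric PDB ID: %r" % key)
    return key.upper()
```

The result could then be passed on, for example as RcsbDB().get('molecule', normalize_pdb_id('3zje')).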