pestifer.charmmff.pdbrepository module

Defines the PDBInput class for representing PDB files used as inputs for packmol. Defines the PDBCollection class for managing the collection of said PDBs.

class pestifer.charmmff.pdbrepository.PDBCollection(path_or_tarball: str | Path = '', streamID: str = '', info: dict = <factory>, contents: PDBInputDict = <factory>, registration_place: int = 0)[source]

Bases: object

A PDBCollection object is a collection of PDBInput objects.

Collection data can be initialized from a directory resident in the filesystem or a tarball containing _one_ such directory. If it is a tarball, the stream name is extracted from the name of the tarball in the format STREAM.tar.gz or STREAM.tgz, and it must be the case that the first directory in every member’s name is the same as the stream name (i.e., the tarball must contain a single top-level directory that is the stream name). The tarball may contain subdirectories with metadata and multiple conformers, but they must not be nested more than one level deep. The tarball may also contain PDB files with the name format RESI.pdb, which will be treated as solo entries with no metadata.
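The tarball naming and layout rules above can be illustrated with a minimal standalone sketch. It does not use pestifer itself: the helper names stream_name_from_tarball, check_layout, and make_demo_tarball are hypothetical, and reading "not nested more than one level deep" as "at most STREAM/RESI/file" is an assumption about the rule, not a statement of how PDBCollection validates its input.

```python
import io
import tarfile
from pathlib import PurePosixPath


def stream_name_from_tarball(filename: str) -> str:
    """Return STREAM for names of the form STREAM.tar.gz or STREAM.tgz."""
    name = PurePosixPath(filename).name
    for suffix in (".tar.gz", ".tgz"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    raise ValueError(f"{filename!r} is not a .tar.gz or .tgz tarball")


def check_layout(tar: tarfile.TarFile, stream: str) -> list[str]:
    """Return the member names that violate the layout rules (assumed reading)."""
    problems = []
    for member in tar.getmembers():
        parts = PurePosixPath(member.name).parts
        if not parts or parts[0] != stream:
            # Every member must live under the single top-level STREAM directory.
            problems.append(member.name)
        elif member.isfile() and len(parts) > 3:
            # Assumption: files may sit at STREAM/RESI.pdb or STREAM/RESI/file,
            # but no deeper.
            problems.append(member.name)
    return problems


def make_demo_tarball(names: list[str]) -> bytes:
    """Build a small gzip-compressed tarball in memory for demonstration."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name in names:
            data = b"REMARK demo\n"
            entry = tarfile.TarInfo(name=name)
            entry.size = len(data)
            tar.addfile(entry, io.BytesIO(data))
    return buffer.getvalue()


demo_members = [
    "lipid/POPC/info.yaml",   # entry with metadata
    "lipid/CHL1.pdb",         # solo entry, no metadata
    "other/XYZ.pdb",          # wrong top-level directory
    "lipid/POPC/a/b.pdb",     # nested too deeply
]
payload = make_demo_tarball(demo_members)
stream = stream_name_from_tarball("lipid.tar.gz")
with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as tar:
    bad_members = check_layout(tar, stream)
print(stream, bad_members)
```

Building the tarball in memory keeps the sketch runnable without touching the filesystem; a real collection would be read from the path given in path_or_tarball.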

A single conceptual stream (e.g., prot) can be distributed across multiple collections in a repository. All CHARMM resnames are unique, so a given resname identifies the same residue wherever it appears, but the resnames of one stream may be spread across multiple collections. When a repository is queried for a resname, user-declared collections are searched first, in the order they were registered, and the base collection is searched last.
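The search order described above can be sketched with plain dictionaries standing in for collections. This is only an illustration of the ordering rule: the function find_resname and the (registration_place, contents) tuples are hypothetical and are not part of the pestifer API.

```python
from typing import Optional


def find_resname(
    resname: str,
    user_collections: list[tuple[int, dict[str, str]]],
    base_collection: dict[str, str],
) -> Optional[str]:
    """Search user collections in registration order, then the base collection."""
    # Each user collection is a (registration_place, contents) pair (hypothetical).
    for _, contents in sorted(user_collections, key=lambda pair: pair[0]):
        if resname in contents:
            return contents[resname]
    # The base collection is always consulted last.
    return base_collection.get(resname)


user_a = (1, {"POPC": "POPC (user A)"})
user_b = (2, {"POPC": "POPC (user B)", "CHL1": "CHL1 (user B)"})
base = {"POPC": "POPC (base)", "TIP3": "TIP3 (base)"}

# The list order does not matter; registration_place decides the search order.
for name in ("POPC", "CHL1", "TIP3", "XXXX"):
    print(name, "->", find_resname(name, [user_b, user_a], base))
```

Under this ordering, a user-declared entry shadows a base entry with the same resname, which is the behavior the paragraph above implies.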

classmethod build_from_resources(path_or_tarball: str, resnames: list[str] = [], streamID_override: str = None)[source]
checkout(resname: str) PDBInput | None[source]

Given a resname, return the PDBInput object for that resname, or None if not found.

Parameters:

resname (str) – The resname to check out from the collection.

Returns:

The PDBInput object for the specified resname if found, or None if not found.

Return type:

PDBInput | None

contents: PDBInputDict

Mapping of residue names to their corresponding PDBInput objects.

info: dict

Contents of each entry’s info.yaml file (charge, conformerIDs, synonyms, etc.)

path_or_tarball: str | Path = ''

The path to the collection directory or tarball relative to the current working directory. If it ends with .tar.gz or .tgz, it is treated as a tarball; otherwise, it is treated as a directory.

registration_place: int = 0

The registration place of the collection in the repository, indicating when it was registered.

show(fullnames: bool = False, missing_fullnames: dict = None) str[source]

Return a string representation of the PDBCollection.

Parameters:
  • fullnames (bool, optional) – If True, show the full names of the residues (synonyms) instead of just the resnames. Defaults to False.

  • missing_fullnames (dict, optional) – A dictionary mapping resnames to their full names if they are not found in the collection. This is used to provide full names for residues that do not have a synonym in the metadata. Defaults to None.

Returns:

A string representation of the PDBCollection, including the stream ID, path, and a list of resnames. If fullnames is True, it will include the full names (synonyms) of the residues; otherwise, it will just list the resnames.

Return type:

str

streamID: str = ''

Name of the CHARMMFF stream corresponding to this collection.

class pestifer.charmmff.pdbrepository.PDBCollectionDict(dict=None, /, **kwargs)[source]

Bases: UserDict[str, PDBCollection]

class pestifer.charmmff.pdbrepository.PDBInput(name: str = '', pdbcontents: dict = <factory>, info: dict = <factory>, opt_tags: dict = <factory>)[source]

Bases: object

A PDBInput object represents the data needed to use a PDB file as input for packmol.

get_charge()[source]

Get the charge of the residue from the metadata. If not found, return 0.0.

Returns:

The charge of the residue, or 0.0 if not found.

Return type:

float

get_conformer_data(conformerID: int = 0)[source]

Get the conformer data for a given conformer ID. If no conformers are found, return None. If the conformer ID is out of range, return None.

Parameters:

conformerID (int) – The conformer ID to retrieve the data for. Defaults to 0.

Returns:

The conformer data for the specified conformer ID, or None if not found.

Return type:

dict | None

get_head_tail_length(conformerID: int = 0)[source]

Get the head-tail length for a given conformer ID. If no conformers are found, return 0.0. If the conformer ID is out of range, return 0.0.

Parameters:

conformerID (int) – The conformer ID to retrieve the head-tail length for. Defaults to 0.

Returns:

The head-tail length for the specified conformer ID, or 0.0 if not found.

Return type:

float

get_max_internal_length(conformerID: int = 0)[source]

Get the maximum internal length for a given conformer ID. If no conformers are found, return 0.0. If the conformer ID is out of range, return 0.0.

Parameters:

conformerID (int) – The conformer ID to retrieve the maximum internal length for. Defaults to 0.

Returns:

The maximum internal length for the specified conformer ID, or 0.0 if not found.

Return type:

float
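The fallback behavior shared by the three conformer accessors above (None for missing conformer data, 0.0 for missing lengths) can be sketched as standalone functions over a metadata dictionary. The key names "conformers", "head-tail-length", and "max-internal-length" are assumptions made for illustration; the actual layout of info.yaml is not specified here. Treating negative IDs as out of range is also an assumption.

```python
from typing import Optional


def conformer_data(info: dict, conformerID: int = 0) -> Optional[dict]:
    """Return the metadata for one conformer, or None if it is unavailable."""
    conformers = info.get("conformers", [])  # hypothetical key name
    if not 0 <= conformerID < len(conformers):
        return None  # no conformers at all, or ID out of range
    return conformers[conformerID]


def head_tail_length(info: dict, conformerID: int = 0) -> float:
    """Return the head-tail length, or 0.0 if the conformer is unavailable."""
    data = conformer_data(info, conformerID)
    if data is None:
        return 0.0
    return float(data.get("head-tail-length", 0.0))  # hypothetical key name


def max_internal_length(info: dict, conformerID: int = 0) -> float:
    """Return the maximum internal length, or 0.0 if unavailable."""
    data = conformer_data(info, conformerID)
    if data is None:
        return 0.0
    return float(data.get("max-internal-length", 0.0))  # hypothetical key name


demo_info = {
    "conformers": [
        {"head-tail-length": 21.5, "max-internal-length": 24.0},
        {"head-tail-length": 19.8, "max-internal-length": 23.1},
    ]
}
print(head_tail_length(demo_info, 1))     # second conformer
print(max_internal_length(demo_info, 5))  # out of range, falls back
print(conformer_data({}, 0))              # no conformers recorded
```

Returning a neutral value instead of raising lets callers treat solo entries without metadata the same way as fully annotated ones, at the cost of not distinguishing "zero" from "unknown".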

get_parameters()[source]

Get the parameters for the residue from the metadata. If not found, return an empty list.

Returns:

The parameters for the residue, or an empty list if not found.

Return type:

list

get_pdb(conformerID=0, noh=False)[source]

Get the PDB contents for a given conformer ID. If noh is True, it will return the PDB contents for the noh tag if available.

Parameters:
  • conformerID (int) – The conformer ID to retrieve the PDB contents for. Defaults to 0.

  • noh (bool) – If True, it will return the PDB contents for the noh tag if available. Defaults to False.

Returns:

The PDB contents for the specified conformer ID and tag, or None if not found.

Return type:

str | None

get_ref_atoms()[source]

Get the reference atoms for the residue from the metadata. If not found, return an empty dictionary.

Returns:

The reference atoms for the residue, or an empty dictionary if not found.

Return type:

dict

info: dict

The metadata for the residue.

longname()[source]

Get the long name of the residue from the metadata. If not found, return the resname.

Returns:

The long name of the residue, or the resname if not found.

Return type:

str

name: str = ''

The name of the residue, which is also the base name of the PDB file.

opt_tags: dict

A dictionary containing optional tags for the residue.

pdbcontents: dict

A dictionary containing the PDB contents for each conformer ID. The keys are conformer IDs and the values are the PDB contents as strings.

class pestifer.charmmff.pdbrepository.PDBInputDict(dict=None, /, **kwargs)[source]

Bases: UserDict[str, PDBInput]

A dictionary mapping residue names to their corresponding PDBInput objects.

class pestifer.charmmff.pdbrepository.PDBRepository(*args, **kwargs)[source]

Bases: CacheableObject

A PDBRepository is a set of _collections_, each of which represents a CHARMMFF _stream_. The base PDBRepository is the one that comes with pestifer, and it is located in PESTIFER/resources/charmmff/pdbrepository/. The base PDBRepository contains a lipid collection and a water_ions collection (as of v1.13.1). A user may register additional collections by specifying them in the YAML config file.

add_collection(collection: PDBCollection, collection_key='generic')[source]

Add a PDBCollection to the repository. If the collection_key already exists, it will be overwritten.

Parameters:
  • collection (PDBCollection) – The PDBCollection object to add to the repository.

  • collection_key (str) – The key under which to register the collection in the repository. If it already exists, a warning will be logged and the collection will not be added again. If a collection with the same base name already exists, a numbered suffix will be added to the collection_key to avoid conflicts.

add_resource(path_or_tarball: str = '', streamID_override: str = '', resnames: list[str] = [])[source]

Add a new PDBCollection to the repository from a file path.

Parameters:
  • path_or_tarball (str) – The path to the PDB collection directory or tarball.

  • streamID_override (str) – An optional override for the stream ID.

  • resnames (list[str]) – An optional list of resnames that the collection should minimally include. This is for building custom, right-sized collections for any particular build.

build_custom(charmmff_pdbrepository_path: str = '', streamID_override: str = '', resnames: list[str] = [], **kwargs)[source]

Build a custom collection that represents a user-defined set of residues.

Parameters:
  • charmmff_pdbrepository_path (str) – The path to the charmmff/pdbrepository directory.

  • streamID_override (str) – An optional override for the stream ID.

  • resnames (list[str]) – An optional list of resnames that the collection should minimally include. This is for building custom, right-sized collections for any particular build.

checkout(name: str) PDBInput | None[source]

Given a name, return the PDBInput object for that name, or None if not found. Search is conducted over collections in the order they were registered.

Parameters:

name (str) – The name of the residue to check out from the PDBRepository.

show(out_stream: Callable = <built-in function print>, fullnames: bool = False, missing_fullnames: dict = {})[source]

Show the contents of the PDBRepository, including all registered collections and their contents.

Parameters:
  • out_stream (callable, optional) – A callable that takes a string and outputs it. Defaults to print.

  • fullnames (bool, optional) – If True, show the full names of the residues (synonyms) instead of just the resnames. Defaults to False.

  • missing_fullnames (dict, optional) – A dictionary mapping resnames to their full names if they are not found in the collection. This is used to provide full names for residues that do not have a synonym in the metadata. Defaults to an empty dictionary.
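Because out_stream is any callable that accepts a string, the same call can print, log, or capture the output. The sketch below uses a hypothetical stand-in function, show_lines, rather than PDBRepository.show itself, and it assumes the callable is invoked once per string; the real method may batch its output differently.

```python
import logging
from typing import Callable, Iterable


def show_lines(lines: Iterable[str], out_stream: Callable[[str], None] = print) -> None:
    """Stand-in for a show() method: hand each string to out_stream."""
    for line in lines:
        out_stream(line)


report = ["stream: lipid", "  POPC", "  CHL1"]

# Default: write to standard output.
show_lines(report)

# Route through a logger instead of printing.
logger = logging.getLogger("pdbrepository-demo")
show_lines(report, out_stream=logger.debug)

# Capture the text in a list, e.g. for a test or a report file.
captured: list[str] = []
show_lines(report, out_stream=captured.append)
print(len(captured), "lines captured")
```

Passing a bound method such as logger.debug or captured.append avoids any need for the method to know where its output is going.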