pestifer.molecule.residue module

Defines the ResiduePlaceholder and Residue classes for handling residues in a molecular structure.

class pestifer.molecule.residue.Residue(*args, resname: str, resid: ResID, chainID: str, resolved: bool, segtype: str, segname: str, model: int = 1, obj_id: int | None = None, auth_asym_id: str | None = None, auth_comp_id: str | None = None, auth_seq_id: int | None = None, pdb_ins_code: str | None = None, empty: bool = False, recordname: str = 'REMARK.465', asym_chainID: str | None = None, ORIGINAL_ATTRIBUTES: dict = <factory>)[source]

Bases: ResiduePlaceholder

A class for handling residues in a molecular structure. This class extends the ResiduePlaceholder class to include additional functionality for managing residues.

add_atom(a: Atom)[source]

Add an atom to this residue if it matches the residue’s resid, residue name, and chain ID. This method is used to build a residue from its constituent atoms, ensuring that all atoms in the residue share the same resid, residue name, and chain ID.

Parameters:

a (Atom) – The atom to be added to the residue.
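
The matching rule can be illustrated with a minimal sketch. MiniAtom and MiniResidue are hypothetical stand-ins, not the pestifer classes, and the boolean return value is an assumption made here so the outcome is visible; the documentation above does not state what add_atom returns.

```python
from dataclasses import dataclass, field


@dataclass
class MiniAtom:
    """Hypothetical stand-in for Atom; only the fields used in the match."""
    name: str
    resid: str      # e.g. "42" or "42A" (sequence number plus insertion code)
    resname: str
    chainID: str


@dataclass
class MiniResidue:
    """Hypothetical stand-in for Residue."""
    resid: str
    resname: str
    chainID: str
    atoms: list = field(default_factory=list)

    def add_atom(self, a: MiniAtom) -> bool:
        # Accept the atom only if resid, residue name, and chain ID all agree.
        if (a.resid, a.resname, a.chainID) == (self.resid, self.resname, self.chainID):
            self.atoms.append(a)
            return True
        return False


res = MiniResidue(resid="42", resname="ALA", chainID="A")
added = res.add_atom(MiniAtom("CA", "42", "ALA", "A"))     # matches the residue
rejected = res.add_atom(MiniAtom("CA", "43", "GLY", "A"))  # different residue
```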

atoms: AtomList
down: ResidueList
get_down_group()[source]

Get the downstream group of residues. This method traverses the downstream links of this residue and collects all residues in the downstream direction.

Link this residue to another residue in a downstream direction. This method establishes a connection between this residue and another residue, allowing for traversal of the molecular structure in a downstream direction. It also updates the upstream and downstream links for both residues to maintain the connectivity information.

Parameters:
  • other (Residue) – The other residue to link to.

  • link (Link) – A link object representing the connection between the two residues.
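
As a rough illustration of how downstream links and get_down_group could fit together, the sketch below uses a hypothetical MiniResidue class. The method name connect_down, the omission of the Link object, and the depth-first visiting order are all assumptions of this sketch, not documented pestifer behavior.

```python
class MiniResidue:
    """Hypothetical stand-in for Residue, keeping only the up/down links."""

    def __init__(self, name: str):
        self.name = name
        self.up = []    # residues linked upstream of this one
        self.down = []  # residues linked downstream of this one

    def connect_down(self, other: "MiniResidue") -> None:
        # Record the connection on both ends so either side can be traversed.
        self.down.append(other)
        other.up.append(self)

    def get_down_group(self) -> list:
        # Depth-first walk over downstream links; each residue is collected once.
        group, seen = [], set()
        stack = list(reversed(self.down))
        while stack:
            r = stack.pop()
            if id(r) in seen:
                continue
            seen.add(id(r))
            group.append(r)
            stack.extend(reversed(r.down))
        return group


# A small branched chain: ASN -> NAG1 -> (NAG2 -> MAN, FUC)
asn, nag1, nag2, man, fuc = (MiniResidue(n) for n in ("ASN", "NAG1", "NAG2", "MAN", "FUC"))
asn.connect_down(nag1)
nag1.connect_down(nag2)
nag1.connect_down(fuc)
nag2.connect_down(man)
down_names = [r.name for r in asn.get_down_group()]
```

In this sketch the starting residue is not part of its own downstream group; whether the real method includes it is not stated above.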

model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': False}

Configuration for pydantic.BaseModel.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

ri()[source]

Returns the residue sequence number and insertion code as a string.

same_resid(other)[source]

Check if this residue has the same residue sequence number and insertion code as another residue or a string representation of a residue. This is used to determine if two residues are the same in terms of their sequence position.

Parameters:

other (Residue, str) – The other residue or string to compare with.

Returns:

True if this residue has the same residue sequence number and insertion code as the other residue, False otherwise.

Return type:

bool
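
A minimal sketch of this comparison, assuming a residue ID is a sequence number plus an optional one-letter insertion code and that its string form looks like "100A". MiniResID and the free function same_resid are hypothetical; the real ResID class may parse strings differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniResID:
    """Hypothetical stand-in for ResID."""
    resseqnum: int
    insertion: str = ""

    @classmethod
    def parse(cls, text: str) -> "MiniResID":
        # Split off a trailing insertion-code letter, e.g. "100A" -> (100, "A").
        text = text.strip()
        if text and text[-1].isalpha():
            return cls(int(text[:-1]), text[-1])
        return cls(int(text))


def same_resid(a: MiniResID, other) -> bool:
    # Accept either another MiniResID or its string representation.
    if isinstance(other, str):
        other = MiniResID.parse(other)
    return a == other


match_str = same_resid(MiniResID(100, "A"), "100A")
mismatch = same_resid(MiniResID(100, "A"), "100")
match_obj = same_resid(MiniResID(100), MiniResID(100))
```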

set_chainID(chainID)[source]

Set the chain ID for this residue and all its constituent atoms. This method updates the chain ID of the residue and all atoms within it to ensure consistency across the residue’s structure.

Parameters:

chainID (str) – The new chain ID to be set for the residue and its atoms.

set_segname(segname: str)[source]

Unlink this residue from another residue in a downstream direction. This method removes the connection between this residue and another residue, effectively severing the link between them. It updates the upstream and downstream links for both residues to reflect the removal of the connection.

Parameters:
  • other (Residue) – The other residue to unlink from.

  • link (Link) – A link object representing the connection between the two residues.

up: ResidueList
class pestifer.molecule.residue.ResidueList(initlist: Iterable[T] = ())[source]

Bases: BaseObjList[Residue]

A class for handling lists of Residue objects. This class extends BaseObjList to manage collections of residues in a molecular structure. It provides methods for initializing the list from various input types, indexing residues, and performing operations such as mapping chain IDs, retrieving residues and atoms, and handling residue ranges.

apply_insertions(insertions: InsertionList)[source]

Apply a list of insertions to the residue list.

Parameters:

insertions (list of Insertion) – A list of insertions to apply. Each insertion should contain the chain ID, residue sequence numbers, insertion codes, and the sequence of residues to insert.

apply_segtypes()[source]

Apply segment types to residues based on their residue names. This method uses the segtype_of_resname mapping to assign segment types to residues based on their residue names. It updates the segtype attribute of each residue in the residue list.

atom_resids(as_type=<class 'pestifer.objs.resid.ResID'>)[source]

Get a list of residue IDs for all atoms in the residues.

Parameters:

as_type (type, optional) – The type to which the residue IDs should be converted. Default is ResID.

Returns:

A list of residue IDs.

Return type:

list

atom_resseqnums(as_type=<class 'str'>)[source]

Get a list of residue sequence numbers for all atoms in the residues.

Parameters:

as_type (type, optional) – The type to which the residue sequence numbers should be converted. Default is str.

Returns:

A list of residue sequence numbers.

Return type:

list

atom_serials(as_type=<class 'str'>)[source]

Get a list of atom serial numbers for all atoms in the residues.

Parameters:

as_type (type, optional) – The type to which the serial numbers should be converted. Default is str.

Returns:

A list of atom serial numbers.

Return type:

list

caco_str(upstream_reslist: ResidueList, seglabel: str, tmat: ndarray) → str[source]

Generate a string representation of the caco command to position the N atom of a model-built residue based on the coordinates of the previous residue. Uses the positionN() function to calculate the position of the N atom based on the upstream residue’s coordinates and the transformation matrix.

Parameters:
  • upstream_reslist (list) – A list of residues that are upstream of the current residue.

  • seglabel (str) – The segment label for the residue.

  • molid_varname (str) – The variable name for the molecule ID. Currently unused and absent from the signature above.

  • tmat (numpy.ndarray) – A transformation matrix to apply to the coordinates of the residue.

Returns:

A string representing the VMD/psfgen caco command for positioning the residue.

Return type:

str

cif_residue_map()[source]

Create a mapping of residues by chain ID and residue sequence number. This method iterates through the residue list and creates a dictionary where the keys are chain IDs and the values are dictionaries mapping residue sequence numbers to Namespace objects containing the residue information.
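
The nested layout described here can be sketched as follows. The function name residue_map, the input records, and the fields stored in each Namespace are assumptions for illustration only; the documentation does not list which residue attributes the real method keeps.

```python
from argparse import Namespace


def residue_map(residues):
    """Map chainID -> {resseqnum -> Namespace} for a list of residue records."""
    result = {}
    for r in residues:
        # Create the per-chain dictionary on first use, then file the residue
        # under its sequence number.
        result.setdefault(r.chainID, {})[r.resseqnum] = Namespace(
            resname=r.resname, resseqnum=r.resseqnum, chainID=r.chainID
        )
    return result


# Hypothetical residue records standing in for Residue objects.
residues = [
    Namespace(chainID="A", resseqnum=1, resname="MET"),
    Namespace(chainID="A", resseqnum=2, resname="ALA"),
    Namespace(chainID="B", resseqnum=1, resname="GLY"),
]
rmap = residue_map(residues)
```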

deletion(DL: DeletionList)[source]

Remove residues from the residue list based on a DeletionList. This method iterates through the DeletionList, retrieves the residues to be deleted based on their chain ID, residue sequence numbers, and insertion codes, and removes them from the residue list.

Parameters:

DL (DeletionList) – A list of deletion ranges to apply. Each deletion range should contain the chain ID, residue sequence numbers, and insertion codes for the start and end of the deletion range.

describe()[source]

Returns a string description of the ResidueList object, including the number of residues it contains.

Returns:

A string representation of the ResidueList object.

Return type:

str

do_deletions(Deletions: DeletionList)[source]

Apply a list of deletions to the residue list.

Parameters:

Deletions (list of Deletion) – A list of deletion ranges to apply. Each deletion range should contain the chain ID, residue sequence numbers, and insertion codes for the start and end of the deletion range.

classmethod from_ResiduePlaceholderlist(input_list: ResiduePlaceholderList)[source]

Create a ResidueList from a ResiduePlaceholderList.

Parameters:

input_list (ResiduePlaceholderList) – A list of ResiduePlaceholder objects to convert into a ResidueList.

Returns:

An instance of ResidueList initialized with the residues created from the placeholder residues.

Return type:

ResidueList

classmethod from_residuegrouped_atomlist(atoms: AtomList)[source]

Construct a list of residues from a list of Atoms ordered and grouped into residues.

map_chainIDs_label_to_auth()[source]

Create a mapping from chain IDs in the label (e.g., PDB format) to the author chain IDs.

puniquify(attrs: list[str], stash_attr_name='ORIGINAL_ATTRIBUTES')[source]

Systematic attribute altering to make all elements unique.

There may be a set of attributes for which no two elements may have the exact same set of respective values. This method scans the calling instance for such collisions and, if any is found, adds one to the value of the attribute named in the first element of attrs (this attribute is assumed to be numeric). Because that change can create new collisions, the method makes repeated passes through the calling instance until no collisions remain. Each such value change results in storing the original values in a new attribute.

Parameters:
  • attrs (list of str) – attribute names used to build the hash to test for uniqueness

  • stash_attr_name (str, optional) – name given to a new dict attribute used to store all original attribute name:value pairs
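
The collision-resolution loop described above can be sketched on plain dictionaries. This is an illustration under stated assumptions rather than the library's implementation: elements are dicts instead of objects, the stash is stored as a dict key, and the original values are stashed only once per element.

```python
def puniquify(items, attrs, stash_attr_name="ORIGINAL_ATTRIBUTES"):
    """Make the tuple of attrs unique across items (a list of dicts), in place."""
    changed = True
    while changed:               # repeat passes until one finishes with no collision
        changed = False
        seen = set()
        for item in items:
            key = tuple(item[a] for a in attrs)
            if key in seen:
                # Stash the original values before the first change only.
                item.setdefault(stash_attr_name, {a: item[a] for a in attrs})
                item[attrs[0]] += 1   # the first attribute is assumed numeric
                changed = True
                key = tuple(item[a] for a in attrs)
            seen.add(key)
    return items


rows = [
    {"resid": 5, "chainID": "A"},
    {"resid": 5, "chainID": "A"},   # collides with the first row
    {"resid": 6, "chainID": "A"},   # collides once the second row is bumped to 6
]
puniquify(rows, ["resid", "chainID"])
resids = [r["resid"] for r in rows]
```

Here the second row is bumped to 6, which then pushes the third row to 7; the first row is untouched and gets no stash entry.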

remap_chainIDs(the_map: dict)[source]

Remap the chain IDs of residues in the list according to a provided mapping.

Parameters:

the_map (dict) – A dictionary mapping old chain IDs to new chain IDs.
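
A minimal sketch of such a remapping, using dictionaries as hypothetical stand-ins for residues and atoms. Two assumptions are made here: chain IDs absent from the map are left unchanged, and atoms follow their residue's new chain ID (as set_chainID does for a single residue).

```python
def remap_chainIDs(residues, the_map):
    """Apply an old -> new chain ID mapping to residues and their atoms, in place."""
    for r in residues:
        # Each residue is looked up exactly once, so swaps such as
        # {"A": "B", "B": "A"} do not cascade.
        new_id = the_map.get(r["chainID"], r["chainID"])
        r["chainID"] = new_id
        for a in r["atoms"]:
            a["chainID"] = new_id


residues = [
    {"resname": "ALA", "chainID": "A", "atoms": [{"name": "CA", "chainID": "A"}]},
    {"resname": "GLY", "chainID": "B", "atoms": [{"name": "CA", "chainID": "B"}]},
    {"resname": "HOH", "chainID": "W", "atoms": []},
]
remap_chainIDs(residues, {"A": "B", "B": "A"})
chain_ids = [r["chainID"] for r in residues]
```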

renumber(links: LinkList)[source]

Empty residues that have been added may have resids that conflict with existing resids on the same chain when those resids belong to a different segtype (e.g., glycan). This method gives protein residues priority in such conflicts and renumbers the non-protein residues, updating any resid records in links to match the new resid.

Parameters:

links (LinkList) – A list of links that may contain residue sequence numbers that need to be updated.

resrange(rngrec)[source]

Yield residues from specified residue range.

Parameters:

rngrec (argparse.Namespace) –

A record defining the range of residues to retrieve. It must have the following attributes:

  • chainID: The chain ID of the residues.

  • resid1: The starting resid.

  • resid2: The ending resid.

For example, a Deletion object can be used to define the range

Yields:

Residue – Yields residues within the specified range.
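
A sketch of a range generator with this shape follows. It assumes the bounds are inclusive and that resids can be ordered as (sequence number, insertion code) tuples; both are assumptions of the sketch, and the real ResID ordering may differ.

```python
from argparse import Namespace


def resrange(residues, rngrec):
    """Yield residues on rngrec.chainID with rngrec.resid1 <= resid <= rngrec.resid2."""
    for r in residues:
        if r.chainID == rngrec.chainID and rngrec.resid1 <= r.resid <= rngrec.resid2:
            yield r


# Hypothetical residue records; resid is a (resseqnum, insertion code) tuple.
residues = [
    Namespace(chainID="A", resid=(9, "")),
    Namespace(chainID="A", resid=(10, "")),
    Namespace(chainID="A", resid=(10, "A")),
    Namespace(chainID="A", resid=(11, "")),
    Namespace(chainID="A", resid=(12, "")),
    Namespace(chainID="B", resid=(10, "")),
]
rng = Namespace(chainID="A", resid1=(10, ""), resid2=(11, ""))
selected = [r.resid for r in resrange(residues, rng)]
```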

set_chainIDs(chainID)[source]

Set the chain ID for all residues in the list to a specified value.

Parameters:

chainID (str) – The chain ID to set for all residues.

set_segname(segname: str)[source]

Set the segment name for the residue.

Parameters:

segname (str) – The segment name to set.

state_bounds(state_func: Callable)[source]

Get the state bounds for each residue in the list based on a state function.

Parameters:

state_func (function) – A function that takes a residue and returns its state bounds.

Returns:

A list of state intervals for each residue in the residue list.

Return type:

StateIntervalList

substitutions(SL: SubstitutionList)[source]

Apply a list of substitutions to the residue list. This method iterates through the SubstitutionList, retrieves the residues to be substituted based on their chain ID, residue sequence numbers, and insertion codes, and replaces their residue names with the corresponding residue names from the substitution list. It also creates a new SeqadvList for any resolved residues that are substituted. Any residues that are deleted as a result of the substitutions are also returned in a list.

Parameters:

SL (SubstitutionList) – A list of substitutions to apply. Each substitution should contain the chain ID, residue sequence numbers, insertion codes, and the new residue sequence to substitute.

Returns:

tuple – A tuple containing a SeqadvList of new sequence advancements for resolved residues and a list of residues that were deleted.

Return type:

(SeqadvList, list of Residue)

class pestifer.molecule.residue.ResiduePlaceholder(*args, resname: str, resid: ResID, chainID: str, resolved: bool, segtype: str, segname: str, model: int = 1, obj_id: int | None = None, auth_asym_id: str | None = None, auth_comp_id: str | None = None, auth_seq_id: int | None = None, pdb_ins_code: str | None = None, empty: bool = False, recordname: str = 'REMARK.465', asym_chainID: str | None = None, ORIGINAL_ATTRIBUTES: dict = <factory>)[source]

Bases: BaseObj

A class for handling missing residues in a molecular structure. This class represents residues that are not present in the structure, such as those that are missing due to low resolution or other reasons. It is used to track residues that are expected to be present but are not resolved in the coordinate file.

ORIGINAL_ATTRIBUTES: dict
asym_chainID: str | None
auth_asym_id: str | None
auth_comp_id: str | None
auth_seq_id: int | None
chainID: str
empty: bool
model: int
model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': False}

Configuration for pydantic.BaseModel.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

obj_id: int | None
pdb_ins_code: str | None
pdb_line()[source]

Returns a PDB line representation of the ResiduePlaceholder. The line is formatted according to the PDB standard for missing residues.

Returns:

A string representing the ResiduePlaceholder in PDB format.

Return type:

str
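
For orientation, the sketch below formats a missing-residue line following the REMARK 465 column template of the PDB format ("REMARK 465   M RES C SSSEQI"). The function remark465_line is hypothetical; the exact output of pdb_line, including trailing padding and how the model field is filled, may differ.

```python
def remark465_line(resname, chainID, resseqnum, insertion="", model=None):
    """Format one REMARK 465 (missing residue) line; hypothetical helper."""
    # The model number is left blank when not given.
    model_field = "" if model is None else str(model)
    line = (
        f"REMARK 465 {model_field:>3} {resname:>3} {chainID:1} "
        f"{resseqnum:>5}{insertion:1}"
    )
    return line.rstrip()  # drop the blank insertion-code column if unused


line_plain = remark465_line("MET", "A", 1)
line_icode = remark465_line("GLY", "B", 100, "A")
line_model = remark465_line("MET", "A", 1, model=1)
```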

recordname: str
resid: ResID
resname: str
resolved: bool
segname: str
segtype: str
shortcode()[source]
class pestifer.molecule.residue.ResiduePlaceholderList(initlist: Iterable[T] = ())[source]

Bases: BaseObjList[ResiduePlaceholder]

A class for handling lists of ResiduePlaceholder objects. This class is used to manage collections of residues that are not present in the molecular structure, such as missing residues in a PDB file.

This class does not add anything beyond the BaseObjList class.

apply_exclusion_logics(exclusion_logics: list[str] = []) → int[source]
apply_inclusion_logics(inclusion_logics: list[str] = [])[source]
describe()[source]

Abstract method to describe the contents of the BaseObjList. Subclasses should implement this method to provide a meaningful description.

classmethod from_cif(cif_data: DataContainer) → ResiduePlaceholderList[source]

Create a ResiduePlaceholderList from CIF data.

Parameters:

cif_data (DataContainer) – A DataContainer instance containing CIF data.

Returns:

A new instance of ResiduePlaceholderList containing the missing residues.

Return type:

ResiduePlaceholderList

classmethod from_pdb(parsed: PDBRecordDict) → ResiduePlaceholderList[source]

Create a ResiduePlaceholderList from parsed PDB data.

Parameters:

parsed (PDBRecordDict) – A dictionary containing parsed PDB records.

Returns:

A new instance of ResiduePlaceholderList containing the missing residues.

Return type:

ResiduePlaceholderList