fetch
A fetch task allows you to retrieve a structure file from a remote source, such as the PDB or AlphaFold databases.
This task is typically the first step in a pestifer run, injecting the first representation of the system state (the downloaded structure file) into the workflow.
The fetch task currently supports two sources:
RCSB: The Research Collaboratory for Structural Bioinformatics (RCSB) is a repository for 3D structural data of biological macromolecules. You can specify a PDB ID to fetch a structure from the RCSB.
AlphaFold: AlphaFold is a deep learning model for predicting protein structures. You can specify a UniProt ID to fetch a structure from the AlphaFold database.
A fetch task has three attributes that can be set in the config file:
source: either rcsb or alphafold (case-insensitive); rcsb by default.
sourceID: the ID of the structure in the source database (a PDB ID or a UniProt ID); this attribute is required.
source_format: either pdb or cif. Some structures in the RCSB are only available in the new mmCIF/PDBx format; in this case you should specify cif here. Default is pdb.
For example, to begin a run using the bovine pancreatic trypsin inhibitor (BPTI) structure PDB ID 6pti from the RCSB, you would specify the following in your config file:
tasks:
  - fetch:
      sourceID: 6pti
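
Because source and source_format fall back to their defaults when omitted, the task above should be equivalent to the fully spelled-out form below. This is a sketch based only on the defaults stated in the attribute list; the nesting of the keys under fetch is assumed from standard YAML conventions.

```yaml
tasks:
  - fetch:
      source: rcsb         # default; could be omitted
      sourceID: 6pti       # required: PDB ID of BPTI
      source_format: pdb   # default; could be omitted
```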
If the structure file has already been downloaded (i.e., Pestifer detects it in your current working directory), no download is performed.
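
The other attribute values can be sketched the same way. The first fragment below requests the same RCSB entry in mmCIF format; 6pti is used purely for illustration and does not itself require cif. The second fragment fetches a predicted structure from the AlphaFold database; the UniProt accession P00974 (bovine BPTI) is an illustrative choice, not one taken from the text above, so substitute the accession for your own protein.

```yaml
# Fetch from the RCSB in mmCIF format instead of PDB format
tasks:
  - fetch:
      source: rcsb
      sourceID: 6pti
      source_format: cif
```

```yaml
# Fetch a predicted structure from the AlphaFold database
tasks:
  - fetch:
      source: alphafold
      sourceID: P00974   # UniProt accession, assumed here for illustration
```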