Example 15: Fully Glycosylated, Closed SARS-CoV-2 Omicron BA.2 Variant Spike
This example uses Pestifer to build a fully glycosylated SARS-CoV-2 spike protein (BA.2 strain) by grafting glycans and cleaving each protomer at its furin cleavage site. The build is based on PDB entry 7xix, which contains a spike protein in the closed conformation. The PDB file contains glycans, but they are not fully resolved, so we graft complete glycans from prototypical structures.
The glycans taken from prototypical structures are the following:
PDB ID 2wah chain C is a “high-mannose” glycan with 9 mannoses; its full name is alpha-D-mannopyranose-(1-2)-alpha-D-mannopyranose-(1-6)-[alpha-D-mannopyranose-(1-3)]alpha-D-mannopyranose-(1-6)-[alpha-D-mannopyranose-(1-2)-alpha-D-mannopyranose-(1-3)]beta-D-mannopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose
PDB ID 4b7i chain C is an “intermediate” glycan with 5 mannoses and a fucose; its full name is alpha-D-mannopyranose-(1-3)-[alpha-D-mannopyranose-(1-6)]alpha-D-mannopyranose-(1-6)-[alpha-D-mannopyranose-(1-3)]beta-D-mannopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-4)-[alpha-L-fucopyranose-(1-6)]2-acetamido-2-deoxy-beta-D-glucopyranose
PDB ID 4byh chain C is a “complex” glycan; its full name is N-acetyl-alpha-neuraminic acid-(2-6)-beta-D-galactopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-2)-alpha-D-mannopyranose-(1-6)-[2-acetamido-2-deoxy-beta-D-glucopyranose-(1-2)-alpha-D-mannopyranose-(1-3)]beta-D-mannopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-4)-[alpha-L-fucopyranose-(1-6)]2-acetamido-2-deoxy-beta-D-glucopyranose
High-mannose glycan from PDB ID 2wah chain C. Green circles denote mannoses, either α or β, and blue circles denote N-acetylglucosamines.
Intermediate glycan from PDB ID 4b7i chain C. Green circles denote mannoses, either α or β, blue circles denote N-acetylglucosamines, and the red triangle denotes fucose.
Complex glycan from PDB ID 4byh chain C. Green circles denote mannoses, either α or β, blue circles denote N-acetylglucosamines, the red triangle denotes fucose, yellow circles denote galactose, and the purple diamond denotes sialic acid.
The script below uses graft modifications to include the glycans. The glycan assignments (i.e., which asparagines carry high-mannose, intermediate, and complex glycans) are taken from Watanabe et al. (2020). The integer in the trailing comment on a graft directive is the residue number in the PDB file of the asparagine to which that glycan is grafted.
The cleave task cleaves each protomer at its furin cleavage site, between residues 685 and 686.
# Author: Cameron F. Abrams, <cfa22@drexel.edu>
#
# pestifer input script
#
# BA.2 SARS-CoV-2 Spike
#
# Notes:
# - Glycans are grafted from prototypical structures
# - 2wah chain C is a poorly processed, high-mannose glycan
# - 4b7i chain C is an intermediately processed glycan
# - 4byh chain C is a complex glycan
# - Chains are cleaved at the furin cleavage sites
#
title: BA.2 SARS-CoV-2 Spike 7xix, fully glycosylated using grafts, and cleaved
tasks:
  - fetch:
      sourceID: 7xix
  - psfgen:
      source:
        biological_assembly: 1
        sequence:
          loops:
            declash:
              maxcycles: 20
          glycans:
            declash:
              maxcycles: 500
      mods:
        mutations: # undo the stabilizing proline mutations
          - A:PRO,986,LYS
          - A:PRO,987,VAL
          - B:PRO,986,LYS
          - B:PRO,987,VAL
          - C:PRO,986,LYS
          - C:PRO,987,VAL
        grafts:
          - A_1304:4b7i,C_1-8 # 61
          - B_1304:4b7i,C_1-8
          - C_1304:4b7i,C_1-8
          - A_1305:4b7i,C_1-8 # 122
          - B_1305:4b7i,C_1-8
          - C_1305:4b7i,C_1-8
          - A_1306:4b7i,C_1-8 # 165
          - B_1306:4b7i,C_1-8 # 165
          - C_1306:4b7i,C_1-8 # 165
          - A_1301:2wah,C_1-9 # 234
          - B_1301:2wah,C_1-9 # 234
          - C_1301:2wah,C_1-9 # 234
          - A_1307:4byh,C_1-10 # 282
          - B_1307:4byh,C_1-10 # 282
          - C_1307:4byh,C_1-10 # 282
          - A_1302:4byh,C_1-10 # 331
          - B_1302:4byh,C_1-10 # 331
          - C_1302:4byh,C_1-10 # 331
          - A_1303:4byh,C_1-10 # 343
          - B_1303:4byh,C_1-10 # 343
          - C_1303:4byh,C_1-10 # 343
          - A_1308:4b7i,C_1-8 # 603
          - B_1308:4b7i,C_1-8 # 603
          - C_1308:4b7i,C_1-8 # 603
          - D_1-2:4byh,C_1#2-10 # 616
          - J_1-2:4byh,C_1#2-10 # 616
          - P_1-2:4byh,C_1#2-10 # 616
          - A_1309:4b7i,C_1-8 # 657
          - B_1309:4b7i,C_1-8 # 657
          - C_1309:4b7i,C_1-8 # 657
          - E_1-2:2wah,C_1#2-9 # 709
          - K_1-2:2wah,C_1#2-9 # 709
          - Q_1-2:2wah,C_1#2-9 # 709
          - F_1-2:4b7i,C_1#2-8 # 717
          - L_1-2:4b7i,C_1#2-8 # 717
          - R_1-2:4b7i,C_1#2-8 # 717
          - G_1-2:2wah,C_1#2-9 # 801
          - M_1-2:2wah,C_1#2-9 # 801
          - S_1-2:2wah,C_1#2-9 # 801
          - A_1310:4b7i,C_1-8 # 1074
          - B_1310:4b7i,C_1-8 # 1074
          - C_1310:4b7i,C_1-8 # 1074
          - H_1-2:4byh,C_1#2-10 # 1098
          - N_1-2:4byh,C_1#2-10 # 1098
          - T_1-2:4byh,C_1#2-10 # 1098
          - I_1-2:2wah,C_1#2-9 # 1134
          - O_1-2:2wah,C_1#2-9 # 1134
          - U_1-2:2wah,C_1#2-9 # 1134
  - validate:
      tests:
        - connection_test:
            name: glycans
            selection: protein and chain A B C and resid 61 122 165 234 282 331 343 603 616 657 709 717 801 1074 1098 1134
            connection_type: glycosylation
            connection_count: 48
        - attribute_test:
            name: point mutation 986
            selection: protein and chain A B C and resid 986 and name CA
            attribute: resname
            value: LYS
            value_count: 3
        - attribute_test:
            name: point mutation 987
            selection: protein and chain A B C and resid 987 and name CA
            attribute: resname
            value: VAL
            value_count: 3
  - md:
      cpu-override: True
      ensemble: minimize
  - ligate:
      steer:
        nsteps: 4000
  - md:
      cpu-override: True
      ensemble: minimize
  - cleave:
      sites:
        - A:685-686
        - B:685-686
        - C:685-686
  - validate:
      tests:
        - connection_test:
            name: cleavages
            selection: protein and chain A B C and resid 685 686
            connection_type: interresidue
            connection_count: 0
  - md:
      cpu-override: True
      ensemble: minimize
  - md:
      cpu-override: True
      ensemble: NVT
  - solvate:
  - md:
      ensemble: minimize
  - md:
      ensemble: NVT
  - md:
      ensemble: NPT
      nsteps: 200
  - md:
      ensemble: NPT
      nsteps: 400
  - md:
      ensemble: NPT
      nsteps: 800
  - md:
      ensemble: NPT
      nsteps: 1600
  - md:
      ensemble: NPT
      nsteps: 13200
  - mdplot:
      timeseries:
        - density
      basename: solvated
      grid: True
  - terminate:
      basename: my_7xix
      artifacts: artifacts
      package:
        basename: prod_7xix
        namd:
          ensemble: NPT
          nsteps: 5000000
| Step | Task | Details |
|---|---|---|
| 1 | | |
| 2 | | mutations, 48 glycan graft(s) |
| 3 | | 3 test(s) |
| 4 | | minimize |
| 5 | | ligate chain breaks (steered MD, 4,000 steps) |
| 6 | | minimize |
| 7 | | proteolytic cleavage at 3 site(s) |
| 8 | | 1 test(s) |
| 9 | | minimize → NVT (2,000 steps) |
| 10 | | water box |
| 11 | | minimize → NVT (2,000 steps) → NPT (16,200 steps, 5 phases) |
| 12 | | equilibration time-series plots → |
| 13 | | basename: |
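The counts quoted in the script and in the summary table can be checked by hand. The following Python sketch is not part of Pestifer; it simply tallies the numbers that appear in the input script above. The variable names are illustrative, and the grouping of sites by source glycan is read off the residue-number comments on the graft directives.

```python
# Asparagine residue numbers, grouped by the prototypical glycan grafted there.
# These are transcribed from the "# <resid>" comments in the grafts list.
SITES_BY_SOURCE = {
    "4b7i": [61, 122, 165, 603, 657, 717, 1074],  # intermediate
    "2wah": [234, 709, 801, 1134],                # high-mannose
    "4byh": [282, 331, 343, 616, 1098],           # complex
}
PROTOMERS = 3  # the spike is a trimer; every site is grafted on each protomer

# nsteps of the five post-solvation NPT md tasks, in script order
NPT_PHASES = [200, 400, 800, 1600, 13200]

sites = sorted(s for group in SITES_BY_SOURCE.values() for s in group)
n_grafts = len(sites) * PROTOMERS
total_npt_steps = sum(NPT_PHASES)

print(f"{len(sites)} sites per protomer, {n_grafts} grafts in total")
print(f"{total_npt_steps} NPT steps over {len(NPT_PHASES)} phases")
```

The sorted site list matches the resid list in the `glycans` connection test, and the graft total matches its `connection_count`.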
Note the two forms of syntax used in the graft directives. For example:
grafts:
  - A_1304:4b7i,C_1-8 # 61
This indicates that the glycan from PDB ID 4b7i chain C, residues 1 to 8, is grafted onto resid 1304 of chain A of the spike. That resid is not the asparagine at position 61; it is the primary NAG attached to Asn61. Residue 1 of chain C of 4b7i is also a primary NAG, so the graft operation positions the entire glycan by aligning its primary NAG onto the primary NAG already resolved in the spike's structure. That resolved NAG is deleted, and the glycan from 4b7i is then attached directly, from the C1 atom of the primary NAG to the ND2 atom of Asn61.
grafts:
  - D_1-2:4byh,C_1#2-10 # 616
In contrast, this indicates that the glycan from PDB ID 4byh chain C is grafted onto resids 1 and 2 of chain D of the spike. Chain D consists of just the two NAGs at Asn616 on one protomer. The 1#2 notation means that resids 1 and 2 of chain C of 4byh are used together as the alignment basis before grafting.
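To make the two directive forms concrete, here is a minimal Python sketch that splits a graft directive into its parts. It is not Pestifer's own parser: the `GraftSpec` class, the `parse_graft` function, and the field names are hypothetical, and the reading of `1#2-10` as "align on resids 1 to 2, graft through resid 10" is an assumption based on the description above. The input is the directive string as YAML would deliver it, that is, with the trailing `# 616` comment already stripped.

```python
import re
from dataclasses import dataclass

# <target chain>_<resid>[-<resid>]:<pdb id>,<source chain>_<resid>[#<resid>]-<resid>
_GRAFT_RE = re.compile(
    r"^(?P<tchain>[A-Za-z0-9]+)_(?P<t1>\d+)(?:-(?P<t2>\d+))?"
    r":(?P<pdb>[A-Za-z0-9]+),"
    r"(?P<schain>[A-Za-z0-9]+)_(?P<s1>\d+)(?:#(?P<s2>\d+))?-(?P<s3>\d+)$"
)


@dataclass(frozen=True)
class GraftSpec:
    """Hypothetical container for one graft directive."""
    target_chain: str
    target_first: int   # first resid of the alignment target in the base structure
    target_last: int    # last resid of the alignment target
    source_pdb: str
    source_chain: str
    align_first: int    # first source resid used for alignment
    align_last: int     # last source resid used for alignment
    graft_last: int     # last source resid of the grafted glycan


def parse_graft(directive: str) -> GraftSpec:
    """Parse a directive such as 'A_1304:4b7i,C_1-8' or 'D_1-2:4byh,C_1#2-10'."""
    m = _GRAFT_RE.match(directive.strip())
    if m is None:
        raise ValueError(f"unrecognized graft directive: {directive!r}")
    t1 = int(m["t1"])
    s1 = int(m["s1"])
    return GraftSpec(
        target_chain=m["tchain"],
        target_first=t1,
        target_last=int(m["t2"]) if m["t2"] else t1,
        source_pdb=m["pdb"],
        source_chain=m["schain"],
        align_first=s1,
        align_last=int(m["s2"]) if m["s2"] else s1,
        graft_last=int(m["s3"]),
    )


single = parse_graft("A_1304:4b7i,C_1-8")
double = parse_graft("D_1-2:4byh,C_1#2-10")
print(single)
print(double)
```

In the first form the alignment basis is a single residue on each side; in the second it is a pair of residues on each side, which constrains the orientation of the grafted glycan more tightly.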
A future release of Pestifer will allow for more transparent specification of glycans.
Fully glycosylated BA.2 SARS-CoV-2 Spike protein (PDB ID 7xix) in closed conformation with glycans shown in white licorice. Protein chains are colored uniquely.
The prototypical glycans that are present in the same PDB structures but not used here are:
PDB ID 2wah chain D; beta-D-mannopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose
PDB ID 4byh chain D; beta-D-galactopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-2)-alpha-D-mannopyranose-(1-3)-[beta-D-galactopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-2)-alpha-D-mannopyranose-(1-6)]beta-D-mannopyranose-(1-4)-2-acetamido-2-deoxy-beta-D-glucopyranose-(1-4)-[alpha-L-fucopyranose-(1-6)]2-acetamido-2-deoxy-beta-D-glucopyranose
Prototypical glycan from PDB ID 2wah chain D. Green circles denote mannoses, either α or β, and blue circles denote N-acetylglucosamines.
Prototypical glycan from PDB ID 4byh chain D. Green circles denote mannoses, either α or β, blue circles denote N-acetylglucosamines, the red triangle denotes fucose, and yellow circles denote galactoses.