Central point for Scoring and Docking

@londonir @mc-robinson
Here is the placeholder where you can upload the starting list for all compound scoring and docking.
The list was prepared by @londonir with the following instructions:
Hi all,

I attach the two requested SMILES files:

all.smi (135.1 KB) - the converted compounds

SiH3.smi (152.6 KB) - the converted compounds with [SiH3] indicating the covalent attachment point

The IDs are based on the original unique IDs with the following new suffix:

  • Original acrylamides from the designs were retained with suffix: ACR1

  • Chloroacetamides turned to acrylamides: ACR2

  • Chloroacetamides turned to nitriles: NIT2

  • Acetyl groups turned into acrylamides: ACR3

  • Acetyl groups turned into nitriles: NIT3
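
For anyone post-processing the files, the suffix scheme above can be decoded mechanically. This is only a minimal sketch: it assumes the suffix is whatever follows the last underscore in each ID, and the names `SUFFIX_MEANING` and `split_id` are made up for illustration.

```python
# Suffix meanings as listed above; dictionary and function names are illustrative.
SUFFIX_MEANING = {
    "ACR1": "original acrylamide retained",
    "ACR2": "chloroacetamide turned to acrylamide",
    "NIT2": "chloroacetamide turned to nitrile",
    "ACR3": "acetyl group turned into acrylamide",
    "NIT3": "acetyl group turned into nitrile",
}


def split_id(compound_id):
    """Split 'BASE_SUFFIX' at the last underscore; return (base, suffix, meaning)."""
    base, sep, suffix = compound_id.rpartition("_")
    if not sep or suffix not in SUFFIX_MEANING:
        raise ValueError(f"unrecognised suffix in ID: {compound_id!r}")
    return base, suffix, SUFFIX_MEANING[suffix]
```

Splitting at the last underscore (rather than the first) keeps any underscores in the original unique ID intact.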

Here’s an update on my scoring: (3.3 MB)

For some reason, my constrained docking still results in some poses ending up with the warhead on the wrong side, and I’m not sure why—these poses should just be rejected.
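
One simple way to reject such poses after the fact is a distance check between the warhead atom and the atom it is meant to react with. The sketch below is an assumption about how that could be done, not the filter actually used here: `filter_poses` is a hypothetical helper, coordinates are taken as already extracted from the docked poses, and the 4.0 Å cutoff is an arbitrary placeholder.

```python
import math


def filter_poses(poses, nucleophile_xyz, max_dist=4.0):
    """Keep IDs of poses whose warhead atom is within max_dist of the nucleophile.

    poses: iterable of (pose_id, warhead_xyz) pairs, coordinates in angstrom.
    max_dist: placeholder cutoff in angstrom, not a validated value.
    """
    return [
        pose_id
        for pose_id, warhead_xyz in poses
        if math.dist(warhead_xyz, nucleophile_xyz) <= max_dist
    ]
```

For example, `filter_poses([("a", (0, 0, 3.5)), ("b", (0, 0, 9.0))], (0, 0, 0))` keeps only pose `"a"`.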

The MD simulations examining the covalent warhead distance will be done in the morning, since I wasted time trying to figure out why the constrained docking was giving these surprising results.

Many well-ranked noncovalent complexes from (covalent_warhead_df-docked.sdf) do place the covalent warhead in a sensible geometry. These might be easily picked out by quick manual inspection of the top docking hits.

I’ve also sorted by scored overlap with the user-identified inspiration fragments at the suggestion of @frankvondelft (covalent_warhead_df-docked-overlap.sdf). Sometimes the geometries (see covalent_warhead_df-docked-overlap/) resemble existing fragments well, but it’s hard to see good overlaps with multiple distinct fragments (as in @alphalee’s analysis) emerge organically in the top-ranked ligands. I probably need some other numerical criteria to figure that out.
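
One candidate numerical criterion would be to count how many distinct fragments a docked pose covers. The following is a rough sketch under stated assumptions, not the scoring used above: `fragments_overlapped` is a hypothetical function, a fragment counts as covered when at least `min_fraction` of its atoms lie within `cutoff` of some ligand atom, and both thresholds are arbitrary placeholders.

```python
import math


def fragments_overlapped(ligand_xyz, fragments, cutoff=1.0, min_fraction=0.5):
    """Return names of fragments that the ligand pose covers.

    ligand_xyz: list of (x, y, z) ligand atom coordinates in angstrom.
    fragments: dict mapping fragment name to a list of (x, y, z) coordinates.
    A fragment is covered if at least min_fraction of its atoms lie within
    cutoff of any ligand atom (both thresholds are illustrative).
    """
    hits = []
    for name, frag_xyz in fragments.items():
        covered = sum(
            1
            for f in frag_xyz
            if any(math.dist(f, l) <= cutoff for l in ligand_xyz)
        )
        if frag_xyz and covered / len(frag_xyz) >= min_fraction:
            hits.append(name)
    return hits
```

Ranking ligands by `len(fragments_overlapped(...))` would favour poses that span several fragments over poses that overlap very well with just one.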

Thanks @JohnChodera (also for apparently pulling another late-nighter!).

The list that John posted is the OLD list - please only look at it to give him feedback on whether the conformation generation is useful.

The sensible geometries do look useful, John, and equally I like the overlap with distinct fragments. I appreciate that all this is quite cutting edge, but I’d rather have this than just my medicinal chemistry biases. Even without a sortable score, if I can cut the list down by other means (crude size and lipophilicity will do) then I can eyeball the docked structures.

@AnthonyA @londonir There seem to be a couple of minor problems in all.smi:

(1) Duplicate molecules (line numbers prepended):

42 C=CC(=O)N1CCN(c2ccc(C3NC(=O)c4ccccc34)cc2S(=O)(=O)N)CC1 DR_ACR1
43 C=CC(=O)N1CCN(c2ccc(C3NC(=O)c4ccccc34)cc2S(=O)(=O)N)CC1 DR_ACR1

(2) Different molecules with the same name:

1507 N#CCN1CCN(CCNC(=O)c2ccccc2)CC1 STE-KU_NIT2
1508 N#CCN1CCN(CCNC(=O)c2cccc3ccccc23)CC1 STE-KU_NIT2
1509 N#CCN1CCN(CCNC(=O)c2cccnc2)CC1 STE-KU_NIT2
1510 N#CCN1CCCC(n2cc(-c3ccccc3)nn2)C1 STE-KU_NIT2
1511 N#CCN1CCCC(n2cc(-c3ccc(Cl)cc3)nn2)C1 STE-KU_NIT2
1512 N#CCN1CCCC(n2cc(COCc3ccccc3)nn2)C1 STE-KU_NIT2
1513 N#CCN1CCN(Cc2cn(Cc3ccccc3)nn2)CC1 STE-KU_NIT2
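
Both kinds of problem can be found with a quick script. This is a minimal sketch, assuming each line of the file is `SMILES NAME` separated by whitespace; `check_smi` is a hypothetical helper. It compares SMILES as plain strings, so the same molecule written two different ways would be missed unless the SMILES are first canonicalised with a cheminformatics toolkit.

```python
from collections import defaultdict


def check_smi(lines):
    """Report exact duplicate records and names shared by different SMILES.

    lines: iterable of 'SMILES NAME' strings (whitespace separated).
    Returns (duplicates, collisions):
      duplicates maps (smiles, name) to the 1-based line numbers of a repeated record;
      collisions maps a name to the set of distinct SMILES strings carrying it.
    """
    seen = defaultdict(list)
    by_name = defaultdict(set)
    for lineno, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        smiles, name = parts[0], parts[1]
        seen[(smiles, name)].append(lineno)
        by_name[name].add(smiles)
    duplicates = {key: nums for key, nums in seen.items() if len(nums) > 1}
    collisions = {name: smis for name, smis in by_name.items() if len(smis) > 1}
    return duplicates, collisions
```

Usage would be along the lines of `duplicates, collisions = check_smi(open("all.smi"))`.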


This raises another point: designers might have made interesting merges of compounds by carefully looking at the angles, but not paid attention to the physicochemical properties of the fragments they used for the merge, ending up with a greasy ball. The merge could be a smart one, but an experienced medchemist might need to tune the compound further. It feels like @JohnChodera’s overlap should be the first filter, and that a medchemist should then review the list and perhaps tune the most promising-looking merges to make sure that the compounds have good properties.

Hi @londonir, @JohnChodera, and others: @HollyFoster just sent over a nicely filtered list of the covalent compounds in "Updated list of all submissions". It might be useful for you all when considering large-scale filtering of docking results.

Hi, I had a question regarding basic amines.

I have seen that some compounds are submitted with the basic amines protonated (as they will be at physiological pH), while others (including my submissions) are not. Before docking, are they all converted to the protonated form, or can the same compound submitted twice give entirely different docking results? There are a lot of piperazines submitted, so this could have quite an impact on the docking outcomes.

I have completed the scoring of the covalent compound follow-ups, with strong constraints to a starting pose that is as faithful as possible to the fragment hits.
EDIT UPDATED LIST: fragmenstein_notes_v3.xlsx (286.8 KB)

As discussed in the meeting, to make the starting pose match the hits as faithfully as possible, a collage of atom positions was made, a “Fragmenstein monster”. More details can be found at my GitHub repo. If anyone wants the compounds as starting poses for other tools, they can be found here.
Some look really good, like this (often with fewer than 3 starting hits):

while others are less aesthetically pleasing:

The compound was not docked at first: it was topologically minimised in the protein and scored (min1), and only subsequently docked as normal and scored (docked). Models can be found here.


An addendum to my previous post: the models and the data were incomplete, but I have fixed that. However, only 870 compounds could be minimised. I will investigate why tomorrow or on Tuesday, and make a compound-only folder of mol files (with corrected bond order) in the in-protein minimised positions. Sorry about that!

Additionally, I made a shoddy/hacky table in Michelanglo (a site for sharing interactive descriptions of proteins) to display the results:

Hi @JohnChodera , are all submissions automatically protonated before docking? Thanks!

This will depend on the other groups on the piperazines. If it’s an amide, urea or sulphonamide linkage coming off a piperazine N, it won’t be an issue, but with an H or an alkyl/aryl group the N would be protonated.
If the piperazine ring is C-fluorinated, you might find it’s only weakly basic too, and not protonated.

Has anyone considered the epoxyketone warheads used in (Thr-targeting) proteasome inhibitors?

Continuous Process Improvement in the Manufacture of Carfilzomib, Part 1: Process Understanding and Improvements in the Commercial Route to Prepare the Epoxyketone Warhead

Peter K. Dornan, Travis Anthoine, Matthew G. Beaver, Guilong Charles Cheng, Dawn E. Cohen, Sheng Cui, William E. Lake, Neil F. Langille, Susan P. Lucas, Jenil Patel, William Powazinik IV, Scott W. Roberts, Chris Scardino, John L. Tucker, Simone Spada, Alicia Zeng, Shawn D. Walker

Org. Process Res. Dev. 2020, 24, 4, 481-489. Publication date: April 2, 2020. Copyright © 2020 American Chemical Society.


Hi @JHullaert! Apologies for the delay in getting back to you.
We use the OpenEye toolkit to enumerate all reasonable protonation states using oeproton.OEGetReasonableTautomers, and then assign protonation states appropriate to pH ~7 in solution. All the code to do the docking is here.

Note that this has limitations: it doesn’t account for the influence of the electrostatic environment of the protein on protonation states, and it makes no attempt to penalize the binding of different tautomeric/protonation states.

It does mean that different submissions that correspond to different protonation states or tautomers of the same compound should give identical results.

Hi John, thank you for your response.

Seems to be one of the many challenges a computational chemist has to deal with.

Our lab is working on automated Monte Carlo sampling of protonation/tautomeric states within our MD-based free energy calculations, but due to technical constraints, this is still a little ways off.