PostEra

Hydrogen bond acceptor definitions: potential bug?

You might want to take a look at how the hydrogen bond acceptors are counted. For example, the HBA count for ALP-POS-d2866bdf-1 is 6 in the database. I would count 4 for this molecular structure (N2 and N3 of the benzotriazole, the amide carbonyl O and the dimethylamino N). N1 of the benzotriazole is like an amide N, and the thiophene S would not be expected to have significant hydrogen bond basicity. Hope this feedback is helpful.

Thanks @pwkenny. As you may have guessed, this is done automatically through RDKit. It seems the definition of HBA currently used in RDKit is the following (though it looks like it has changed over time):

HAcceptorSmarts = Chem.MolFromSmarts('[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),' +
                                     '$([O,S;H0;v2]),$([O,S;-]),$([N;v3;!$(N-*=!@[O,N,P,S])]),' +
                                     '$([nH0,o,s;+0])]')

from https://github.com/rdkit/rdkit/blob/7c6d9cf4e9d95b4daa954f4f094e026093dbc13f/rdkit/Chem/Lipinski.py#L32

which does indeed include the N1 of the benzotriazole and the S of the thiophene. Good to note, though, that the number may be more like 4.

FWIW, from a comment in the code https://github.com/rdkit/rdkit/blob/7c6d9cf4e9d95b4daa954f4f094e026093dbc13f/rdkit/Chem/Lipinski.py#L26 it looks like the original Lipinski rules just used Chem.MolFromSmarts('[#7,#8]')
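To make the difference between the two definitions concrete, here is a minimal sketch that counts matches for the SMARTS quoted above and for the simple '[#7,#8]' pattern. The helper name count_matches and the two test molecules (thiophene and N,N-dimethylacetamide) are my own choices for illustration, not the compounds from the database.

```python
from rdkit import Chem

# HBA SMARTS exactly as quoted above from rdkit/Chem/Lipinski.py
CURRENT_HBA = Chem.MolFromSmarts(
    '[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),'
    '$([O,S;H0;v2]),$([O,S;-]),$([N;v3;!$(N-*=!@[O,N,P,S])]),'
    '$([nH0,o,s;+0])]'
)

# Simple "any N or O" pattern mentioned in the code comment
SIMPLE_NO = Chem.MolFromSmarts('[#7,#8]')


def count_matches(smiles, pattern):
    """Count atoms in the molecule that match a single-atom SMARTS pattern."""
    mol = Chem.MolFromSmiles(smiles)
    return len(mol.GetSubstructMatches(pattern))


# Illustrative molecules only (assumption: chosen by hand, not from the database)
examples = {
    'thiophene': 'c1ccsc1',
    'N,N-dimethylacetamide': 'CC(=O)N(C)C',
}

for name, smi in examples.items():
    print(name,
          'current SMARTS:', count_matches(smi, CURRENT_HBA),
          '[#7,#8]:', count_matches(smi, SIMPLE_NO))
```

With these inputs the quoted SMARTS should count the thiophene S (which '[#7,#8]' ignores) and should skip the amide N (which '[#7,#8]' counts), so neither definition is a strict subset of the other.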

The problem appears to be related to the term '$([nH0,o,s;+0])' in the definition quoted above, in that nH0 matches both the nitrogen in pyridine (an HB acceptor N) and the nitrogen in N-methylpyrrole (not an HB acceptor N). The ring nitrogen in DAR-DIA-eace69ff-36 appears to be classed correctly as not being an HB acceptor. Personally, I would not count any oxygen, sulfur or triply-connected nitrogen in an aromatic ring as HB acceptors.
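The nH0 behaviour is easy to check directly. The sketch below tests the aromatic term on its own against pyridine, N-methylpyrrole and thiophene, and compares it with a stricter term. The stricter SMARTS '[n;X2;H0;+0]' (neutral aromatic N with exactly two connections) and the helper n_hits are hypothetical, written only to illustrate the suggestion above; they are not anything RDKit ships.

```python
from rdkit import Chem

# Aromatic term taken from the quoted RDKit definition
AROMATIC_TERM = Chem.MolFromSmarts('[nH0,o,s;+0]')

# Hypothetical stricter term (assumption, for illustration only):
# neutral aromatic N with two connections; no aromatic O or S
STRICT_TERM = Chem.MolFromSmarts('[n;X2;H0;+0]')


def n_hits(smiles, pattern):
    """Return the number of atoms matching a single-atom SMARTS pattern."""
    mol = Chem.MolFromSmiles(smiles)
    return len(mol.GetSubstructMatches(pattern))


molecules = [
    ('pyridine', 'c1ccncc1'),
    ('N-methylpyrrole', 'Cn1cccc1'),
    ('thiophene', 'c1ccsc1'),
]

for name, smi in molecules:
    print(name,
          'quoted term:', n_hits(smi, AROMATIC_TERM),
          'stricter term:', n_hits(smi, STRICT_TERM))
```

The quoted term should hit one atom in each of the three molecules, whereas the stricter term should hit only the pyridine N. Whether X2 is the right criterion for every ring system is a judgment call that would need checking against more structures.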

The HB acceptor definitions (N or O) used to apply the rule of 5 (Ro5) are, in my view, overly permissive, although thioamide S (a perfectly respectable HB acceptor) is not included. Had more realistic HB acceptor definitions been used, it is highly probable that Ro5 would not have been Ro5.
