Topic automatically created for discussing the designs at:
https://covid.postera.ai/covid/submissions/f723e322-ea7d-4382-bbcf-496494e8aab2
Hi @mc-robinson. I looked at the github/mc-robinson covalent .csv file, and it seems that compounds 2, 3 and 6 were not recognised as covalent and were instead put in the non-covalent file. I was wondering whether there are other cases where covalent and non-covalent structures were switched, and whether this misclassification leads to many false negatives (since the SMILES are probably not checked manually).
Hi @nnarvaiss, thanks for digging into this! I was using someone else’s SMARTS strings to flag the warheads, but let me double check their correctness now.
Hi @nnarvaiss, so after looking into the code, here are the SMARTS strings:
warhead_smarts_dict = {
    'acrylamide': '[C;H2:1]=[C;H1]C(N)=O',
    'acrylamide_adduct': 'NC(C[C:1]S)=O',
    'chloroacetamide': 'Cl[C;H2:1]C(N)=O',
    'chloroacetamide_adduct': 'S[C:1]C(N)=O',
    'vinylsulfonamide': 'NS(=O)([C;H1]=[C;H2:1])=O',
    'vinylsulfonamide_adduct': 'NS(=O)(C[C:1]S)=O'
}
Therefore, 2 is not flagged as covalent because its cyano group is not one of the SMARTS patterns we flag. Also, that group looks like it might be quite reactive. 3 does not match chloroacetamide because there is no nitrogen. And 6 does not match because both ends of the double bond are substituted.
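For anyone who wants to check this behaviour themselves, here is a minimal sketch of how a dictionary like the one above could be applied. It assumes RDKit is the toolkit (the thread does not say which one is used), the helper name `flag_warheads` is hypothetical, and the test SMILES are simple illustrative molecules, not the compounds from this submission.

```python
from rdkit import Chem

# SMARTS patterns copied from the post above.
warhead_smarts_dict = {
    'acrylamide': '[C;H2:1]=[C;H1]C(N)=O',
    'acrylamide_adduct': 'NC(C[C:1]S)=O',
    'chloroacetamide': 'Cl[C;H2:1]C(N)=O',
    'chloroacetamide_adduct': 'S[C:1]C(N)=O',
    'vinylsulfonamide': 'NS(=O)([C;H1]=[C;H2:1])=O',
    'vinylsulfonamide_adduct': 'NS(=O)(C[C:1]S)=O',
}

# Compile each SMARTS once so it can be reused for every molecule.
warhead_patterns = {
    name: Chem.MolFromSmarts(smarts)
    for name, smarts in warhead_smarts_dict.items()
}


def flag_warheads(smiles):
    """Return the names of all warhead patterns found in a SMILES string.

    Hypothetical helper for illustration only; an empty list means the
    molecule would be treated as non-covalent by these patterns.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return [
        name
        for name, pattern in warhead_patterns.items()
        if mol.HasSubstructMatch(pattern)
    ]


if __name__ == "__main__":
    # Illustrative molecules (assumptions, not taken from the submission).
    examples = {
        "N-methylacrylamide": "C=CC(=O)NC",
        "N-methylchloroacetamide": "ClCC(=O)NC",
        "chloroacetone (no nitrogen on the carbonyl)": "ClCC(C)=O",
        "N-methylcrotonamide (substituted alkene end)": "CC=CC(=O)NC",
        "N-methylvinylsulfonamide": "C=CS(=O)(=O)NC",
    }
    for label, smi in examples.items():
        print(label, flag_warheads(smi))
```

The chloroacetone and crotonamide examples mirror the two failure modes described above: a chloroacetyl group without the amide nitrogen, and a double bond whose terminal carbon is not CH2. Both come back with no flags, so loosening the `N` or the `H2`/`H1` constraints would be one way to catch such cases, at the cost of more false positives.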