PostEra

Tautomer recognition

The recognition of tautomeric forms doesn’t seem to be working properly. While JAG-UCB-52b62a6f-17 and JAG-UCB-52b62a6f-6 are correctly recognized as (tautomeric) duplicates, DAR-DIA-6a508060-7 does not appear to be recognized as a tautomer of the other two structures.

Thanks for finding this, Peter. It’s an annoying one, and I don’t really have a good fix for it. The issue is that RDKit does not perceive the 5-membered ring in the SMILES of DAR-DIA-6a508060-7 as aromatic.

Likely as a result of that discrepancy, the two molecules have different InChIKeys. InChI covers many types of tautomerism, but this does not seem to be one of them. Here is an interesting read that I pulled up after this came up: https://pubs.acs.org/doi/pdf/10.1021/acs.jcim.9b01080
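For anyone who wants to reproduce the comparison locally, here is a rough sketch of how duplicates could be grouped by InChIKey and, as an alternative, by RDKit's canonical tautomer. This is not the actual duplicate check used on the site. The helper names (`inchikey`, `canonical_tautomer_smiles`, `group_by_key`) are made up for illustration, and the SMILES for the three submissions are not included here, so you would need to supply them yourself.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def inchikey(smiles: str) -> str:
    """Standard InChIKey for a SMILES string, computed with RDKit."""
    # Imported here so that group_by_key can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import inchi

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    return inchi.MolToInchiKey(mol)


def canonical_tautomer_smiles(smiles: str) -> str:
    """Canonical SMILES of the tautomer picked by RDKit's TautomerEnumerator."""
    from rdkit import Chem
    from rdkit.Chem.MolStandardize import rdMolStandardize

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    enumerator = rdMolStandardize.TautomerEnumerator()
    return Chem.MolToSmiles(enumerator.Canonicalize(mol))


def group_by_key(
    smiles_by_id: Dict[str, str], key_fn: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group compound IDs whose SMILES map to the same key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for compound_id, smiles in smiles_by_id.items():
        groups[key_fn(smiles)].append(compound_id)
    return dict(groups)
```

Calling `group_by_key(smiles_by_id, inchikey)` and then `group_by_key(smiles_by_id, canonical_tautomer_smiles)` on the same dictionary of submission IDs and SMILES shows which structures each key treats as the same compound. Any group with more than one ID is a set of duplicates under that key. There is no guarantee that the canonical tautomer merges DAR-DIA-6a508060-7 with the other two, since the enumerator only covers the transformations it has rules for.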