I previously announced the beta integration of NMRShiftDB. Since then I have received feedback, both on-blog and off-blog, about the performance of the algorithms and the need for better display of results. Wolfgang Robien, one of the major contributors to curated NMR databases and NMR prediction, commented on the performance of the initial integration. A significant bug was highlighted: double bonds were dropped when structures were passed from ChemSpider to the NMRShiftDB API. This was clearly a serious issue, but Stefan Kuhn has fixed it. NMRShiftDB was taken offline for a couple of days while the error persisted, but it is back online today with the issue resolved.

What we KNOW we need to do to enhance the integration is as follows:

1) Always display the chemical structure and associated numbering scheme

2) Display the spectrum type so it is clear what nucleus is being displayed

3) Indicate whether the spectrum displayed is a database hit or is a predicted spectrum

4) Display the details of the assignments, including the number of SHELLS used in the HOSE-code-based prediction.

Our work on the integration to NMRShiftDB will continue and we will enhance it moving forward. Thanks to all for the ongoing feedback and testing.

9 Responses to “NMRShiftDB On and Off Again”

  1. Egon Willighagen says:

    Can you elaborate on the double bond problem? Where was the double bond information lost? Or was it just that cis/trans was not taken into account?

  2. Antony Williams says:

    Based on my exchanges with Stefan Kuhn the issue was in the CDK. Stefan can probably give you details but I would expect it to be the SMILES parser.

  3. melle B.fouzia says:

    Hello,
    I am a doctoral student in agricultural sciences. Part of my work concerns the chemical identification of two extracts, a polyphenolic extract and an essential oil, from a medicinal plant (Thymus vulgaris L.). I have three spectra for each extract (IR, NMR, GC-MS), but I am unable to interpret them.
    For this reason I am looking for help in identifying the chemical composition of the extracts.
    I look forward to your reply. Yours respectfully.

  4. Chris Singleton says:

    Is this issue strictly with structures translating and being drawn correctly, or with the algorithms in NMRShiftDB?

  5. Antony Williams says:

    It was on the NMRShiftDB side with one of the intermediate parsers. It has been fixed and we are now looking for further feedback as people use the system.

  6. hko says:

    A further bug when predicting 13C NMR spectra using NMRShiftDB in ChemSpider:
    CSIDs 776, 773, 1118, 83867, 553949. Lost double bonds.

  7. Egon Willighagen says:

    Antony, I looked at CSID 553949 (thanks to hko for posting CSIDs!); it looks to me like a problem with the NMRShiftDB incorrectly accounting for the missing information in the SMILES (“c1cccc2c1c(c(\C=C)n2)CCN”): the latter does not define all bond orders; moreover, the CDK currently does not have a SINGLE_OR_DOUBLE bond order to match the SMILES for that compound (and, no, not all bonds between two ‘c’ atoms in SMILES are really aromatic; if only it were that easy).

    The NMRShiftDB is then left to guess where the double bonds are, which is tricky, and the CDK does not have a universal solution. I do not know what solution the NMRShiftDB is using, but please do consider having ChemSpider send SMILES *with* explicit bond orders.

    Say what you want; say what you mean. Semantic chemistry starts with being explicit with your data, and not with wrapping in CML or RDF :)

  8. Wolfgang Robien says:

    Dear Stefan;

    benzene (Compound #236) now gives 1.45 ppm for 1H and 27.1 ppm for 13C (Mon, May 3rd 2010, 7:26am MET; screendump available on request), obviously again lost double bonds.

    1) Please keep in mind that ‘Chemspider’ and Tony Williams have established a lot of reputation within the scientific community – this is the best way to destroy it !
    2) Open Software and Open Data doesn’t mean that every program is untested
    3) I detected this error a long time ago and wrote a private email to Tony, because I think this topic is not suitable for my webpage located at http://nmrpredict.orc.univie.ac.at/chemspider_nmrshiftdb.html; on this webpage only problems showing that NMRShiftDB is state-of-the-art of ca. 1980 are summarized.
    4) When you know that the problem is connected to some parser-error dealing with double bonds then YOUR RESPONSIBILITY IS TO TEST THAT BEFORE (!!!!!!!) you make it available to the community. I think it is NOT an obscene desire, that a program author of a prediction program has tested his/her program with benzene. Maybe a second test with a heterocycle like imidazole can be expected by the community – maybe only when your ‘Open Time’ allows that !

    Recommendation for Chemspider users: Do a prediction on Chemspider, then take a textbook and apply the ‘old’ increment rules; if you are still unsure, apply another prediction program like ACD, KnowItAll, NMRPredict, SPECINFO …….. be careful, those programs produce reliable values, but they are not free-of-charge ! Hopefully, you understand now, why those programs are NOT FREE-OF-CHARGE ! They are tested and the underlying data are curated !

    Have a nice day ! Wolfgang Robien

  9. Markus Sitzmann says:

    Egon,

    Maybe this helps for the “SMILES bond order problem” at least temporarily (setting the usearo flag to false is important):

    http://cactus.nci.nih.gov/chemical/structure/c1cccc2c1c(c(\C=C)n2)CCN/smiles?usearo=false
    NCCC1=C([NH]C2=CC=CC=C12)C=C

    http://cactus.nci.nih.gov/chemical/structure/c1ccccc1/smiles?usearo=false
    C1=CC=CC=C1
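
The bond-order problem discussed in the comments above can be illustrated with a minimal Python sketch. It is not the method used by the CDK or NMRShiftDB; `assign_double_bonds` is a hypothetical helper that treats the guess as a perfect-matching search over the atoms that must carry one double bond, and `resolver_url` simply rebuilds the resolver URL pattern quoted above. Percent-encoding the SMILES is an assumption made here for URL safety.

```python
from typing import Optional
from urllib.parse import quote

Bond = tuple[int, int]

# URL pattern taken from the comment above; everything else is illustrative.
RESOLVER = "http://cactus.nci.nih.gov/chemical/structure"


def assign_double_bonds(atoms: set[int], bonds: list[Bond]) -> Optional[set[Bond]]:
    """Pick bonds to make double so every atom in `atoms` gets exactly one.

    Plain backtracking search for a perfect matching; returns None when
    no consistent assignment exists.
    """
    if not atoms:
        return set()
    atom = min(atoms)  # always resolve the lowest-numbered open atom next
    for a, b in bonds:
        if atom not in (a, b):
            continue
        other = b if a == atom else a
        if other == atom or other not in atoms:
            continue
        rest = assign_double_bonds(atoms - {atom, other}, bonds)
        if rest is not None:
            return rest | {(a, b)}
    return None


def resolver_url(smiles: str) -> str:
    """Build a request for a SMILES with explicit bond orders (usearo=false)."""
    # Assumption: the service accepts a percent-encoded SMILES in the path.
    return f"{RESOLVER}/{quote(smiles, safe='')}/smiles?usearo=false"


# Benzene ring: six aromatic carbons, each needing one double bond.
benzene_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(assign_double_bonds(set(range(6)), benzene_bonds))

# Pyrrole-like ring with the nitrogen as atom 0: asking all five atoms to
# take a double bond fails; leaving the N out (it carries an H) succeeds.
pyrrole_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(assign_double_bonds({0, 1, 2, 3, 4}, pyrrole_bonds))
print(assign_double_bonds({1, 2, 3, 4}, pyrrole_bonds))

print(resolver_url("c1ccccc1"))
```

The pyrrole-like case shows why a receiving program has to guess: a lowercase SMILES does not say which ring atoms take part in a double bond, so sending explicit bond orders removes the guess.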