The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st, 2010.

We previously released NMR prediction on ChemSpider, as announced here. Based on community feedback we later removed that connection and never restored it, despite reported improvements. I am an NMR spectroscopist by training; if you check out my Mendeley profile you’ll see that the majority of my papers are NMR-based. Because I am an NMR jock who, despite working in cheminformatics, still keeps a hand in NMR research (NMR prediction and computer-assisted structure elucidation), I really wanted to make sure that we deliver NMR prediction via ChemSpider. I was involved with the development of the ACD/Labs NMR prediction tools for H1, C13, N15, F19 and P31 nuclei. There are a number of other NMR prediction modules on the market, including those of Bio-Rad (in the Know-It-All package) and Modgraph, and certainly the work of Wolfgang Robien, one of the founding fathers of NMR prediction. These are primarily commercial packages.

In the background we have been working on introducing NMR prediction to ChemSpider in time for the ACS meeting. We were looking for a platform to integrate that involved community deposition of data, so that a growing database would keep enhancing the prediction algorithms. We also wanted to know that the underlying data quality was good. We wanted to integrate with an Open system that had support from both an active community of participants and at least one developer who could provide support if we needed it. All of these criteria point to only one resource: NMRShiftDB. There have been some heated discussions, including on this blog, regarding data quality, especially in NMRShiftDB. However, I co-authored a paper with Chris Steinbeck and colleagues from ACD/Labs validating the dataset as well as ACD/Labs’ NMR prediction approaches.

NMRShiftDB is a high-quality data set and certainly contains enough data to provide a training set for NMR prediction algorithms. The NMR predictions provided by NMRShiftDB are used by many people and overall feedback seems to be very positive. Based on our previous knowledge of the data in NMRShiftDB, and the availability of a well-defined programming interface to connect to ChemSpider, we have worked with Stefan Kuhn at the EBI to produce a first-level integration.

As a result, at the ACS meeting in San Francisco next week we will roll out NMR prediction integration. In keeping with the new tabbed layout we have adopted for displaying data on ChemSpider, we have bundled together all predictions. The first tab, ACD/Labs, provides access to ACD/Labs PhysChem properties, the EPI Summary tab provides access to the EPISuite predictions, and the NMRShiftDB tab provides access to the predicted NMR spectra. The left spectrum shows the proton NMR spectrum and the right spectrum shows the C13 NMR spectrum.

NMRshiftDB

When the system is fully integrated the process will work as follows. Since NMRShiftDB already contains many thousands of assigned spectra, we will first try to retrieve the experimentally assigned spectra and display them. When no experimental spectra can be retrieved, we will predict the NMR spectra and display those instead.
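
The lookup-then-predict flow can be sketched in a few lines of Python. This is only an illustration of the logic, not the actual ChemSpider or NMRShiftDB code: the names `get_spectrum`, `lookup_experimental` and `predict`, the compound identifiers and the shift values are all made up for the example.

```python
from typing import Callable, List, Optional, Tuple

Shifts = List[float]  # chemical shifts in ppm

def get_spectrum(
    compound_id: str,
    lookup_experimental: Callable[[str], Optional[Shifts]],
    predict: Callable[[str], Shifts],
) -> Tuple[str, Shifts]:
    """Return experimental shifts when available, otherwise predicted ones.

    The first element of the result labels the source so the display
    can tell the user which kind of spectrum is being shown.
    """
    experimental = lookup_experimental(compound_id)
    if experimental is not None:
        return ("experimental", experimental)
    return ("predicted", predict(compound_id))

# Hypothetical stand-ins for the real database lookup and predictor.
EXPERIMENTAL = {"compound-A": [128.5, 21.4]}  # made-up values

def lookup_experimental(compound_id: str) -> Optional[Shifts]:
    return EXPERIMENTAL.get(compound_id)

def predict(compound_id: str) -> Shifts:
    return [130.0, 20.0]  # made-up values

print(get_spectrum("compound-A", lookup_experimental, predict))
print(get_spectrum("compound-B", lookup_experimental, predict))
```

Labelling each result as experimental or predicted matters because a retrieved spectrum is a measurement while a predicted one is only an estimate, and users should be able to tell the two apart.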

In the future we might pre-predict and store the NMR spectra for all structures in the NMR database. I am a little leery of doing this at present, as we need to gather some basic feedback from the ChemSpider users regarding the performance of the NMR prediction algorithms and our existing implementation. When predicting NMR spectra across a database of this size, a lot of consideration has to be given to domain applicability, i.e., what subset of structures should be excluded from having NMR predictions performed? Examples include organometallic complexes and free radicals. CAS likely had to take this type of issue into account when they applied NMR predictions to their CAS registry.
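
To make the domain-applicability question concrete, here is a deliberately crude sketch of an exclusion filter in Python. It is not how ChemSpider, NMRShiftDB or CAS decide this. The `Structure` record, the `in_prediction_domain` function and the short metal list are assumptions for illustration, and "contains a transition metal" is only a rough proxy for "organometallic complex".

```python
from dataclasses import dataclass
from typing import List

# Illustrative and incomplete list of transition metals (assumption).
TRANSITION_METALS = frozenset(
    {"Ti", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn", "Ru", "Rh", "Pd", "Pt"}
)

@dataclass
class Structure:
    """Hypothetical minimal record for a database entry."""
    name: str
    elements: List[str]          # element symbols present in the structure
    unpaired_electrons: int = 0  # greater than 0 marks a free radical

def in_prediction_domain(structure: Structure) -> bool:
    """Return False for structures we would skip when batch-predicting."""
    if structure.unpaired_electrons > 0:
        return False  # free radical
    if any(symbol in TRANSITION_METALS for symbol in structure.elements):
        return False  # crude proxy for an organometallic complex
    return True

candidates = [
    Structure("ethanol", ["C", "H", "O"]),
    Structure("ferrocene", ["C", "H", "Fe"]),
    Structure("TEMPO", ["C", "H", "N", "O"], unpaired_electrons=1),
]
to_predict = [s.name for s in candidates if in_prediction_domain(s)]
print(to_predict)
```

A real filter would need far more care (salts, mixtures, isotopically labelled compounds, polymers), which is exactly why user feedback should come before any bulk pre-prediction.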

If there are other NMR prediction algorithms or databases that you would be interested in integrating into ChemSpider please contact me. If you are a cheminformatics vendor selling NMR predictions/databases we would be VERY interested in receiving JUST the structures from your NMR databases. We will deposit them and link directly to your product page as an indicator that you have NMR data available.


4 Responses to “New ChemSpider Functionality at ACS Spring 2010 Part 2 NMR Prediction”

  1. Wolfgang Robien says:

    At http://nmrpredict.orc.univie.ac.at/case/propose.php you find a structure dereplication engine based on approx. 20,000,000 compounds from PUBCHEM. Spectra have been predicted using the CSEARCH NN-approach leading to approx. 3,000,000,000 structure-spectra pairs ( for details see http://nmrpredict.orc.univie.ac.at/csearch_summary/strpro.html ) – best search time is ca. 0.4 sec / average search time tested on 500,000 examples is below 1 second. That isn’t bad for 250 GB of data ;-) Result is linked to the INCHIKEY-pages ( see http://nmrpredict.orc.univie.ac.at/csearchlite/inchikey.htm ) – you immediately know whether experimental data are available in either NMRPREDICT, SPECINFO, KNOWITALL or CSEARCH ( including upcoming data not yet released ).

    If you prefer a system able to rank the hitlist, then you are right at http://nmrpredict.orc.univie.ac.at/identify/ ; this system is based on some 16 million structures, again from the PUBCHEM-collection. Search time here is about 5-8 seconds.

    Access is free, if you like those systems drop me a line ! Wolfgang Robien

  2. Wolfgang Robien says:

    It’s great to have NMR-Prediction online – there are 2 questions: 1) What’s the price 2) What’s the quality

    ad 1) It’s free – GREAT !

    ad 2) That’s easy to answer, let’s do a few tests:

    a) Use a simple chlorinated compound e.g. 6-chloro-2-methyl-hex-2-ene
    see here: http://www.chemspider.com/Chemical-Structure.14696229.html
    Call this entry, display the C-NMR Spectrum ( I got 131.7/127.8/43.9/2×29.6/2×20.3 ppm) – the two -CH2- are very similar, but the cis- and trans- methyls usually differ by about 8 ppm, yet they are given as 2×20.3 ppm

    Let’s check how other systems perform – let’s test CSEARCH:

    http://nmrpredict.orc.univie.ac.at/c13robot/robot.php

    draw the structure, fill out the data, fill into the first box holding the lines a value of 399.0 – this will cause the program to do a spectrum prediction – that’s it, now submit it ….. you will get the results back soon to your mailbox. (requirement is that your email has been registered IN ADVANCE !)

    b) you can repeat this experiment with any terminal di-methyl-’ene’ —> you will learn that NMRShiftDB can’t handle the most simple type of cis/trans-isomerism. (except when the identical compound is in its database, but then it is no longer a prediction !)

    In order to remove the decision between FREE-OF-CHARGE and scientifically POOR ( state-of-the-art of ca. 1978 ! ) versus NOT FREE-OF-CHARGE, but STATE-OF-THE-ART 2010 the service at http://nmrpredict.orc.univie.ac.at/c13robot/robot.php has been installed – 3 predictions per day are available. I reserve the right to remove this service whenever I want. Email-registration IN ADVANCE necessary !

    Stay tuned, more examples will come ! wolfgang.robien(at)univie.ac.at

  3. hko says:

    To Wolfgang: Both cited applications are dereplication tools which need known (!) NMR shifts. However, in my opinion, the intention is to deliver known or predicted NMR shifts. Therefore it would at least be useful for ChemSpider users to learn, for the selected structure, whether known NMR data are available in one of your databases.

  4. wolfgang robien says:

    To HKO: You are right – I think your statement is with respect to my first comment and I assume you have posted it before you have seen my second comment ! The second comment deals exclusively with spectrum prediction.

    Your proposal to get knowledge about existing electronic spectral data:

    see http://nmrpredict.orc.univie.ac.at/csearchlite/inchikey.htm
    This was announced on November 19th, 2007 (nearly 2.5 years ago !!!!!!) – more to come – stay tuned!

    Wolfgang
