New ChemSpider Functionality at ACS Spring 2010 Part 2 NMR Prediction
Posted by: Antony Williams in Community Building, How ChemSpider Runs, Quality and Content
The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st 2010.
We previously released NMR prediction on ChemSpider, as announced here. Based on community feedback we later removed that connection and never reconnected it, despite reported improvements. I am an NMR spectroscopist by training; if you check out my Mendeley profile you’ll see that the majority of my papers are NMR-based. Because I am an NMR jock, and because, despite working in cheminformatics, I keep my hands in NMR research (NMR prediction and computer-assisted structure elucidation), I really wanted to make sure that we deliver NMR prediction via ChemSpider. I was involved with the development of the ACD/Labs NMR prediction tools for the H1, C13, N15, F19 and P31 nuclei. There are a number of other NMR prediction modules on the market, including those of Bio-Rad (in the Know-It-All package) and Modgraph, and certainly the work of Wolfgang Robien, one of the founding fathers of NMR prediction. These are primarily commercial packages.
In the background we have been working on introducing NMR prediction to ChemSpider in time for the ACS. We were looking for a platform we could integrate that involved community deposition of data, so that a growing database would enhance the prediction algorithms. We also wanted to know that the underlying data quality was good. We wanted to integrate with an Open system that had support from both an active community of participants and at least one developer who could provide support if we needed it. All of these criteria pointed to only one resource, NMRShiftDB. There have been some heated discussions, including on this blog, regarding data quality, especially in NMRShiftDB. However, I co-authored a paper with Chris Steinbeck and colleagues from ACD/Labs validating the dataset as well as ACD/Labs’ NMR prediction approaches.
NMRShiftDB is a high-quality data set and certainly contains enough data to provide a training set for NMR prediction algorithms. The NMR predictions provided by NMRShiftDB are used by many people and overall feedback seems to be very positive. Based on our previous knowledge of the data in NMRShiftDB, and the availability of a well-defined programming interface to connect to ChemSpider, we have worked with Stefan Kuhn at the EBI to produce a first-level integration.
As a result, at the ACS meeting in San Francisco next week we will roll out the NMR prediction integration. In keeping with the new layout model we have adopted for ChemSpider, which uses tabs to display data, we have bundled together all predictions. The first tab, ACD/Labs, provides access to ACD/Labs PhysChem properties, the EPI Summary tab provides access to the EPISuite and the NMRShiftDB tab provides access to the predicted NMR spectra. The left spectrum shows the proton NMR spectrum and the right spectrum shows the C13 NMR spectrum.

When the system is fully integrated, the process will work as follows. Since NMRShiftDB already contains many thousands of assigned spectra, we will retrieve the experimentally assigned spectra directly and display them. When we cannot retrieve an experimental spectrum, we will predict the NMR spectrum and display that instead.
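As a rough illustration of that retrieve-then-predict flow, here is a minimal Python sketch. It is not the actual ChemSpider or NMRShiftDB code: the `Spectrum` type, the `get_spectrum` function, and the two stand-in callables are all hypothetical names, and the shift values are made-up placeholders rather than real data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types and names; none of these come from ChemSpider or NMRShiftDB.
Key = Tuple[str, str]  # (structure identifier, nucleus such as "1H" or "13C")


@dataclass
class Spectrum:
    nucleus: str
    shifts_ppm: List[float]
    source: str  # "experimental" or "predicted"


def get_spectrum(
    structure_id: str,
    nucleus: str,
    lookup_experimental: Callable[[str, str], Optional[List[float]]],
    predict: Callable[[str, str], List[float]],
) -> Spectrum:
    """Prefer an assigned experimental spectrum; fall back to a prediction."""
    shifts = lookup_experimental(structure_id, nucleus)
    if shifts is not None:
        return Spectrum(nucleus, shifts, "experimental")
    return Spectrum(nucleus, predict(structure_id, nucleus), "predicted")


# Stand-ins for the real services: an in-memory table and a dummy predictor.
# The numbers are placeholders, not measured or predicted chemical shifts.
_ASSIGNED: Dict[Key, List[float]] = {("demo-1", "1H"): [1.0, 2.0]}


def lookup_demo(structure_id: str, nucleus: str) -> Optional[List[float]]:
    return _ASSIGNED.get((structure_id, nucleus))


def predict_demo(structure_id: str, nucleus: str) -> List[float]:
    return [0.0]  # a real predictor would compute shifts from the structure


if __name__ == "__main__":
    print(get_spectrum("demo-1", "1H", lookup_demo, predict_demo).source)
    print(get_spectrum("demo-2", "1H", lookup_demo, predict_demo).source)
```

Tagging each result with its `source` is a design choice in this sketch, so that a display layer could tell the user whether they are looking at an assigned or a predicted spectrum.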
In the future we might pre-predict and store the NMR spectra for all structures on the NMR database. I am a little leery of doing this at present, as we first need to gather some basic feedback from ChemSpider users regarding the performance of the NMR prediction algorithms and our existing implementation. Predicting NMR spectra across a database of this size also requires a lot of consideration of domain applicability, i.e., what subset of structures should be excluded from having NMR predictions performed? Examples include organometallic complexes and free radicals. CAS likely had to take this type of issue into account when they applied NMR predictions to their CAS registry.
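To make the domain applicability question concrete, a pre-prediction pass could screen structures before any spectra are computed. The Python sketch below is only one crude way this might look, under stated assumptions: the `StructureRecord` fields, the function names, and the short `METALS` list are hypothetical and deliberately incomplete, and a real filter would work from a proper cheminformatics representation of each structure.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Illustrative and deliberately incomplete list of metal symbols (an assumption).
METALS: FrozenSet[str] = frozenset(
    {"Li", "Na", "K", "Mg", "Ca", "Fe", "Co", "Ni", "Cu", "Zn", "Ru", "Rh", "Pd", "Pt"}
)


@dataclass(frozen=True)
class StructureRecord:
    """Hypothetical summary of a database structure."""
    identifier: str
    elements: FrozenSet[str]   # element symbols present in the structure
    unpaired_electrons: int    # greater than zero for free radicals


def in_prediction_domain(record: StructureRecord) -> bool:
    """Return False for structures this sketch excludes from NMR prediction."""
    if record.unpaired_electrons > 0:
        return False  # free radical
    if record.elements & METALS:
        return False  # metal-containing; a crude proxy for organometallics
    return True


def select_for_prediction(records: List[StructureRecord]) -> List[StructureRecord]:
    """Keep only the records that pass the applicability check."""
    return [r for r in records if in_prediction_domain(r)]


if __name__ == "__main__":
    records = [
        StructureRecord("ethanol", frozenset({"C", "H", "O"}), 0),
        StructureRecord("ferrocene", frozenset({"C", "H", "Fe"}), 0),
        StructureRecord("methyl radical", frozenset({"C", "H"}), 1),
    ]
    print([r.identifier for r in select_for_prediction(records)])
```

Excluding every metal-containing structure is intentionally conservative here; which classes to exclude is exactly the kind of decision that user feedback should inform.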
If there are other NMR prediction algorithms or databases that you would be interested in integrating into ChemSpider please contact me. If you are a cheminformatics vendor selling NMR predictions/databases we would be VERY interested in receiving JUST the structures from your NMR databases. We will deposit them and link directly to your product page as an indicator that you have NMR data available.

