Archive for October 27th, 2009

JC has given a great overview of how students might want to use ChemSpider for the purpose of chemical information retrieval on the internet. JC’s course lecture thoroughly exercises ChemSpider, in real time, to do searches across the internet. He posted his seminar to Scivee here and I have embedded the lecture below. It’s a good talk for students and I encourage you to share it and review how ChemSpider can be used in your classwork and in your laboratories.

Following on from my earlier post about our interest in aggregating physicochemical data for other groups to use in building their models and algorithms, we are now depositing the data from QSAR World into ChemSpider and pointing back to the original sources on QSAR World. We harvest the SDF files, deposit them onto ChemSpider, and provide direct links to the original SDF files, with the appropriate titles, so that our users can gather the data for re-analysis if they find it of interest. An example record is here for Atovaquone, where we list the links to data residing on QSAR World for download. The links appear under the supplemental information section, as shown below, where you can see links to seven different types of data. We have chosen, for the time being, not to deposit the values associated with these data onto ChemSpider, because the data are very heterogeneous in representation even though they are all delivered as SDF files.

[Image: supplemental information section]
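To illustrate what "heterogeneous in representation" can mean for data delivered as SDF files, here is a minimal sketch that reads the data items out of each SDF record so the field names can be compared across files. It is not the code used for the ChemSpider deposition. The function name `iter_sdf_records`, the sample records, and their field names and values are all invented for illustration, and the parser handles only the common `> <FieldName>` data-item layout.

```python
import re
from typing import Dict, Iterator

# A data header line starts with ">" and carries the field name in <...>.
TAG_RE = re.compile(r"^>[^<]*<([^>]+)>")


def iter_sdf_records(text: str) -> Iterator[Dict[str, str]]:
    """Yield one {field name: value} dict per SDF record (hypothetical helper)."""
    fields: Dict[str, str] = {}
    current_tag = None
    in_data = False  # becomes True once the molfile block ("M  END") is passed
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record delimiter
            yield fields
            fields, current_tag, in_data = {}, None, False
            continue
        if not in_data:
            in_data = line.startswith("M  END")
            continue
        match = TAG_RE.match(line)
        if match:
            current_tag = match.group(1)
            fields[current_tag] = ""
        elif current_tag is not None:
            if line.strip() == "":
                current_tag = None  # a blank line ends the current data item
            else:
                fields[current_tag] = (fields[current_tag] + "\n" + line).lstrip("\n")


# Invented sample: two records naming the "same" property differently.
SAMPLE_SDF = """\
compound-1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <logP>
1.0

> <Source>
example set A

$$$$
compound-2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <LogP (octanol/water)>
2.0

$$$$
"""

for record in iter_sdf_records(SAMPLE_SDF):
    print(sorted(record))
```

Listing the field names per record like this makes the mismatch visible: the same property can arrive under different names and with different unit conventions, which is why linking to the source files is simpler than merging the values.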

What’s your favorite flavor of mercury acetate: on Wikipedia here, on CAS Common Chemistry here, or on ChemSpider here?

How would you represent this structure if you were to draw it as a 2D diagram?

[Image: mercury acetate]

As an active member of the Wikipedia Chemistry team, I continue to be impressed with the dedication and commitment that the members have to improving the quality AND quantity of information available on Wikipedia for chemists. The number of lost hours of sleep freely given to the benefit of Wikipedia, and in this specific case to the chemistry community, is immense. The number of “Compound Pages” on Wikipedia dedicated to drugs/chemicals has continued to grow and, despite a sincere effort on our part to keep everything linked up from ChemSpider to Wikipedia, it’s a little like chasing the Road Runner: we’re always behind!

We have been working with the WikiChem team of late to embed links from Wikipedia back to ChemSpider. I am humbled to know that our hard work to establish ChemSpider as a source of quality information has reached a level of trust such that Wikipedia now links from the ChemBoxes out to ChemSpider. The links are currently being updated on an ongoing basis, with hundreds of new links already established and more being generated. Wikipedia User: Beetstra has written a bot that is inserting ChemSpiderIDs across the database (see below), and we ARE rigorously checking all of the links. The bot worked from a file that we generated on our side showing links to Wikipedia from ChemSpider.

[Image: beetstra]

We will then be able to generate a list of all ChemBoxes/DrugBoxes without links from Wikipedia to ChemSpider. We will make those links on our side, manually curating the structures, and then hand back a file to finish all linking. At that point we will have the backfile under control and can perform ongoing updates as new compound pages are created on ChemSpider. If we curate and find errors on Wikipedia or ChemSpider, making a few manual edits is easy.
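The backfill described above comes down to two steps: find the pages that have no link, then merge in the curated links. Here is a minimal sketch of that bookkeeping, not the actual bot or the file format exchanged with Wikipedia. The function names, the page titles, and the ID numbers are all hypothetical placeholders.

```python
from typing import Dict, List, Optional


def pages_missing_links(chembox_pages: Dict[str, Optional[int]]) -> List[str]:
    """Return titles of pages whose ChemBox has no ChemSpider ID yet."""
    return sorted(title for title, csid in chembox_pages.items() if csid is None)


def merge_curated(
    chembox_pages: Dict[str, Optional[int]], curated: Dict[str, int]
) -> Dict[str, Optional[int]]:
    """Fill in manually curated IDs without overwriting existing links."""
    merged = dict(chembox_pages)
    for title, csid in curated.items():
        if title in merged and merged[title] is None:
            merged[title] = csid
    return merged


# Placeholder data: titles and IDs are invented for illustration only.
pages = {
    "Example compound A": 101,
    "Example compound B": None,
    "Example compound C": None,
}

todo = pages_missing_links(pages)  # the list handed over for curation
updated = merge_curated(pages, {"Example compound B": 202})
print(todo)
print(pages_missing_links(updated))
```

Keeping existing links untouched during the merge means a curated file can be applied repeatedly as new pages appear, and only the remaining gaps need manual attention.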

There are very dedicated teams on Wikipedia and ChemSpider carefully poring over data with their robots and eyeballs to create a linked data set of quality chemistry. It’s long, tedious AND important work. When it’s done, we will have an expanded set of data to semantically link from RSC articles when we do markup.