Are students at risk using ChemSpider? It seems so, based on recent commentary by Peter Murray-Rust. Peter has done us the service of test-driving ChemSpider from the point of view of someone interested in inorganic and organometallic complexes. The majority of users perform either text-based or structure/substructure searches on organic molecules, and their feedback is mostly congratulatory. It is excellent to receive feedback on the area of chemistry we suspected would be very challenging: inorganics and organometallics. I believe that we all struggle with these types of compounds, and I have therefore compared ChemSpider with the two other databases of note with over 5 million compounds: PubChem and eMolecules.

The power of curation is clear on this blog where Peter has again identified some issues with ChemSpider’s treatment of certain compounds. I have addressed two other situations previously (1,2).

Peter identified an issue with the display of sodium hydride. We have NOT manually examined all 10.6 million records, so we were not aware of this bug. The sodium hydride record is now curated with Peter’s comments and the display bug is fixed. THIS is the power of community feedback. It will take some time to repopulate the images across 10 million records, though. By comparison, a search of eMolecules produces no hits. A search of PubChem produces a number of hits, one of which contains a sodium ion and a hydride ion joined by a dative bond.

Peter also identified issues with Prussian Blue, as excerpted here: “… the chemical formula has been represented as separated iron ions and cyanide ions.” The Prussian Blue record is now also curated with Peter’s comments. These complexes are challenging for all of us, so warn your students! The record in question for ChemSpider is here, for PubChem here, and for eMolecules here. Look at the display for PubChem 182606 as an example of the challenge.

Prussian Blue on PubChem

Also, check the eMolecules display. If you search eMolecules for Prussian Blue you will find 3 results. Check each of them. Here’s an example. Notice any issues?

Prussian Blue on eMolecules


The conversion of search structures via SDF files as well as the display of such compounds is challenging for all of us! The work has already been done this evening to deal with the dative bonds and coordination bonds in such complexes and these structures will be updated in the near future.

While searching millions of organic molecules is not easy, the truth is that it is more challenging for organometallics, and we were conscious that there would be issues here. I judge there to be two organizations with the ability to handle these complex molecules appropriately. One is CAS and the other is the Cambridge Crystallographic Data Centre. Certainly it remains a challenge for us, as well as for others. In theory this will be addressed well in CrystalEye, and when these data are made available we will work with the group to determine a path to migrate such complex structures via SDF if possible. This will likely be done if they are to be deposited in PubChem. InChIs are not the solution since, as noted in the InChI FAQ, InChI does not support complex organometallics.

Are students at risk using ChemSpider? There have been recent reports about errors on Wikipedia and questions about whether Wikipedia should be trusted. I know people working hard on populating Wikipedia, and they are passionate individuals attempting to give back to the community. ChemSpider has already challenged the statement about calcium carbonate solubility on this blog: Wikipedia states that calcium carbonate is insoluble, yet the same page discusses its solubility (this might be because there is a Wikipedia-accepted definition of insoluble). The ChemSpider team is also working hard and is passionate about what we are doing. What we need is continuing feedback. The best warning we can give at present is that ChemSpider is beta. But it is here to stay, and we are working on all reported bugs in an appropriate order. As with all other large database resources, students should exercise caution. We are all imperfect.

We are very grateful to Peter for his ongoing feedback regarding ChemSpider. So much so that we have voted Peter our “Tester of the Month”. The feedback is welcome. We’ve already fixed all the bugs, but publishing the update to more than 10 million structures will take time.


2 Responses to “Is ChemSpider Dangerous for Students?”

  1. David Bradley says:

    Wikipedia is an invaluable resource for general use and for confirming facts one may already know. Date of the Battle of Hastings. Names of the Pilgrim Fathers. That kind of thing. 99.999% of editors and journalists will tell you it is not a valid journalistic source and several editors with whom I work will not accept Wikipedia citations as valid references.

    It can be unreliable. In a system where anyone can edit any document or change an image, even if there is a paper trail to follow, one simply cannot rely on the “facts” presented without having at least two other sources to confirm said facts. And, if you have two other sources, why cite Wikipedia at all?

    Wikipedia is obviously very useful, but it, like the structure and solubility of calcium carbonate, can be fluxional, and can only act as a first stop in information gathering. It does not represent the definitive facts, unless you already know them to be true.


  2. ChemSpider Blog » Blog Archive » Zen and the Art of Chemical Structure Databases. What is the definition of “Quality”? says:

    [...] poorly educating the students is an issue and certainly this concern has been raised recently in regards to the ChemSpider system. The question this leads to is about Quality. The Quality of [...]
