Copyright©2007 Antony Williams
I love Wikipedia. I use it at least half a dozen times a week…probably more of late. That said, I have previously questioned the level of curation of the data on Wikipedia. (2,3) I DO believe that contributors to Wikipedia are making valiant efforts to ensure the quality of the data, but I also believe that tools or processes must be developed soon to check that quality systematically. Here’s why…
This is the chemical structure of Mupirocin on Wikipedia. Now, if you bothered to redraw that chemical structure in a drawing package that shows the molecular mass (like I did), you would see that the mass is NOT what is listed in the DrugBox.
The structure, molecular formula and molecular mass are shown below, taken directly from Free ChemSketch, but of course all the drawing packages can do this!
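For anyone without a drawing package to hand, the same sanity check can be roughed out in a few lines of Python. This is only a minimal sketch, not how ChemSketch does it: the function names, the small atomic-weight table and the tolerance are my own assumptions, it only handles simple formulas with no brackets or charges, and the formula C26H44O9 is the commonly published one for mupirocin rather than something read off the figures in this post.

```python
import re

# Rounded standard atomic weights in g/mol. Only a handful of elements are
# included here; a real tool would carry the full periodic table.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_mass(formula: str) -> float:
    """Average molecular mass for a simple formula such as 'C26H44O9'.

    Hypothetical helper: no brackets, hydrates, isotopes or charges.
    """
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


def matches_listed(formula: str, listed_mass: float, tolerance: float = 0.05) -> bool:
    """True if the listed mass agrees with the mass computed from the formula."""
    return abs(molecular_mass(formula) - listed_mass) <= tolerance


# Commonly published formula for mupirocin (an assumption, see the note above).
mupirocin_mass = molecular_mass("C26H44O9")
print(f"Computed mass: {mupirocin_mass:.2f} g/mol")
```

Calling `matches_listed("C26H44O9", some_listed_value)` with whatever mass an infobox shows then gives a quick yes or no on whether the formula and the mass agree with each other.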
Looking on ChemSpider I found three structures (two are identical but not yet deduplicated; this is presently going on in the background). Two are shown below…
Structure 16739332, the top structure, is the correct one, while the bottom one is in error. That structure comes from one data source only: DrugBank. Previously, for Taxol, DrugBank contained the correct version of the structure. The problem is that ALL of our systems, including ChemSpider, have issues like this…we all have errors and they need curation. Wikipedia is great…the changes were made by me tonight…see here. I added an IUPAC name, removed the link to DrugBank and updated the molecular mass.
I am committed to assisting in the curation of Wikipedia…many of us are. However, I think there must be a better way, and I will continue my discussions with the Wikipedia Chemistry Team to get access to all of the chemical compounds on Wikipedia, if possible, and validate the data in a batch using ChemSpider and associated tools.
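To give a feel for what batch validation could look like, here is a minimal sketch in Python. It is not ChemSpider's actual process: the `Record` type, the `flag_mismatches` function, the tolerance and the two placeholder records are all hypothetical, and the "computed" masses are assumed to come from some structure-based tool run beforehand.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One compound entry: the mass shown on the page and the mass
    recomputed from its structure (both in g/mol). Hypothetical type."""
    name: str
    listed_mass: float
    computed_mass: float


def flag_mismatches(records, tolerance=0.05):
    """Return the names of records whose listed and computed masses disagree."""
    return [
        r.name
        for r in records
        if abs(r.listed_mass - r.computed_mass) > tolerance
    ]


# Placeholder data for illustration only, not real compounds or real values.
records = [
    Record("compound-A", 180.16, 180.16),
    Record("compound-B", 500.62, 484.62),
]

flagged = flag_mismatches(records)
print("Needs a curator's eye:", flagged)
```

The point of a pass like this is only to shortlist entries for a human to look at; deciding which of the two values is actually wrong still takes a chemist.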