While working on the Wikipedia curation project to validate chemical structures, we had to consider how best to identify the articles that we could ultimately connect to from ChemSpider. There were a few ways to do it, and some are listed below.

1) Identify only those articles with ChemBoxes or DrugBoxes and integrate with them alone

2) Identify articles with structures already in place and connect to them

3) Identify articles with “relevant information” associated with one or more chemical compounds and integrate with them

When we chose to integrate with Wikipedia by depositing the header of each article onto ChemSpider with a link back to Wikipedia, we did so from a “structure-centric” view. On the first pass we therefore steered away from articles whose chemical structures we could not appropriately represent with our present structure-handling capabilities: polymers and organometallics in particular were excluded, and the same is true of many inorganics.

We actually chose a combination of options 1-3. Because we were not limited to ChemBoxes/DrugBoxes, we also picked up articles such as this one on Spinosad (no ChemBox) and even things as obscure as these TWO definitions for Valium, one of which is connected to CANA, as outlined below.

“CANA is a United States military acronym for “Convulsive Antidote, Nerve Agent”, the drug diazepam in injectable form. One CANA kit is typically issued to U.S. service members along with three Mark I NAAK kits when operating in circumstances where chemical weapons in the form of nerve agents are considered a potential hazard. (Both of these kits deliver drugs using auto-injectors. They are intended for use in “buddy aid” or “self aid” administration of the drugs in the field prior to decontamination and delivery of the patient to definitive medical care.) Read more… or Edit at Wikipedia…”
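
The selection rule described above (any of options 1-3, restricted to structures we can represent) could be sketched roughly as follows. This is only an illustration and not our actual pipeline: the `Article` fields, the `structure_class` labels and the `is_candidate` function are all hypothetical names, and treating whole classes as excluded is a simplification, since only many (not all) inorganics were left out.

```python
from dataclasses import dataclass

# Hypothetical labels for structure classes skipped on the first pass.
# Many inorganics were also excluded, but not all, so they are not listed here.
UNSUPPORTED_CLASSES = {"polymer", "organometallic"}


@dataclass
class Article:
    title: str
    has_infobox: bool = False         # option 1: ChemBox or DrugBox present
    has_structure: bool = False       # option 2: structure already in place
    has_relevant_info: bool = False   # option 3: relevant compound information
    structure_class: str = "organic"  # hypothetical classification label


def is_candidate(article: Article) -> bool:
    """Return True if the article meets any of options 1-3 and its
    structure class is one we assume can be represented."""
    meets_any_option = (
        article.has_infobox
        or article.has_structure
        or article.has_relevant_info
    )
    representable = article.structure_class not in UNSUPPORTED_CLASSES
    return meets_any_option and representable


articles = [
    Article("Spinosad", has_relevant_info=True),  # no ChemBox, still picked up
    Article("Ferrocene", has_infobox=True, structure_class="organometallic"),
]
selected = [a.title for a in articles if is_candidate(a)]
```

Under these assumptions an article without a ChemBox, such as the Spinosad one, is still selected, while an organometallic is skipped even when it has a ChemBox.
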

Our integration with Wikipedia is not complete. What’s happening at present is:

1) Finish curation of the Wikipedia organic structures and associated CAS Numbers in collaboration with CAS

2) Add the inorganics and elements that our present structure handling can support

3) Enhance the system to support organometallics and “images” of compounds/materials rather than just structures

We will roll out a new piece of functionality shortly to make crowdsourced integration with Wikipedia even easier for you, our users. Watch this space.

