Copyright©2008 Antony Williams
While working on the Wikipedia curation project to validate chemical structures, we had to decide how best to identify the articles that we could ultimately link to from ChemSpider. There were several ways to do it, and some are listed below.
1) Identify only those articles with ChemBoxes or DrugBoxes and integrate with those alone
2) Identify articles with structures already in place and connect to them
3) Identify articles with “relevant information” associated with one or more chemical compounds and integrate with them
When we chose to integrate with Wikipedia by depositing the header of each article onto ChemSpider, with a link back to Wikipedia, we did so from a “structure-centric” view. So, on the first pass, we steered away from articles whose chemical structures we could not appropriately represent with our present structure-handling capabilities: polymers and organometallics especially were excluded, and the same is true for many of the inorganics.
We actually chose a combination of options 1-3. By not limiting ourselves to ChemBoxes/DrugBoxes, we also picked up articles such as this one on Spinosad (no ChemBox) and even things as obscure as these TWO definitions for Valium, one connected to CANA outlined below.