Archive for the WiChempedia Category

As an active member of the Wikipedia Chemistry team I continue to be impressed with the dedication and commitment that the members have to improving the quality AND quantity of information available on Wikipedia for chemists. The number of lost hours of sleep freely given to the benefit of Wikipedia, and in this specific case to the chemistry community, is immense. The number of “Compound Pages” on Wikipedia dedicated to drugs/chemicals has continued to grow and, despite a sincere effort on our part to keep everything linked up from ChemSpider to Wikipedia, it’s a little like chasing the Road Runner…we’re always behind!

We have been working with the WikiChem team of late to embed links from Wikipedia back to ChemSpider. I am humbled to know that our hard work to establish ChemSpider as a source of quality information has reached a level of trust such that Wikipedia now links from the ChemBoxes out to ChemSpider. The links are being updated on an ongoing basis, with hundreds of new links already established and more being generated. Wikipedia User:Beetstra has written a bot that is inserting ChemSpider IDs across the database (see below) and we ARE doing rigorous checking of all of the links. The bot used a file that we generated on our side showing links to Wikipedia from ChemSpider.

beetstra

We will then be able to generate a list of all ChemBoxes/DrugBoxes without links from Wikipedia to ChemSpider. We will make the links on our side, manually curating the structures, and then hand back a file to finish all linking. At this point we will have the backfile under control and we can perform ongoing updates as new compound pages are created on ChemSpider. If we curate and find errors on Wikipedia or ChemSpider, making a few manual edits is easy.
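The "list of boxes without links" step is essentially a set difference. Here is a minimal sketch of the idea in Python. It is not the actual ChemSpider or bot code, and the function name, page titles and ID values are made-up placeholders for illustration only.

```python
def find_unlinked_pages(wikipedia_titles, chemspider_links):
    """Return, sorted, the page titles that do not yet have a ChemSpider ID.

    wikipedia_titles: iterable of compound page titles (hypothetical input).
    chemspider_links: dict mapping a page title to a ChemSpider ID, or to
    None when the page is known but not yet linked (hypothetical format).
    """
    linked = {title for title, csid in chemspider_links.items() if csid is not None}
    return sorted(set(wikipedia_titles) - linked)


# Placeholder data: the IDs below are invented, not real ChemSpider IDs.
wikipedia_titles = ["Quinacrine", "Tramadol", "Verapamil", "Cocaine"]
chemspider_links = {"Quinacrine": 101, "Verapamil": 102, "Tramadol": None}

unlinked = find_unlinked_pages(wikipedia_titles, chemspider_links)
print(unlinked)
```

The pages in the resulting list are the ones that would still need manual curation before a link is made.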

There are very dedicated teams on Wikipedia and ChemSpider carefully poring over data with their robots and eyeballs to create a linked data set of quality chemistry. It’s long, tedious AND important work. When it’s done we will have an expanded set of data to semantically link from RSC articles when we do markup.


When is a “Free Medical Encyclopedia” not what’s advertised? When it’s Wikipedia…

I am seeing a lot of ChemSpider links showing up in various places, but today I happened across links from a so-called “Free Medical Encyclopedia”. A close examination shows that it’s nothing more, as far as I can tell, than Wikipedia articles. Look at this article on Tramadol on the encyclopedia and compare with the article on Wikipedia here. There is no recognition of Wikipedia that I can see…rather inappropriate.


I’ve caught wind of some growing confusion in the world of “Chemical ‘Pedias” relative to Chemistry on Wikipedia. I think we might have added to the confusion so I want to clear it up here.

We originally released WiChempedia in April of this year as announced here. What is it? It is a subset of the ChemSpider database made up of structure-based records on Wikipedia. So, when you visit www.wichempedia.org what you will see is a redirect to wikipedia.chemspider.com, the Wikipedia subset. Any of the records under this subset are linked to Wikipedia articles. For example, for this record you will see:

Wikipedia Article(s)

Quinacrine (trade name: Atabrine) is a drug with a number of different medical applications being initially used in the 1930s as an antimalarial drug. It has also been used as an antibiotic in the treatment of Giardiasis (an intestinal parasite), and in research as an inhibitor of phospholipase A2. It has also been proposed for use in systemic lupus erythematosus. Read more… or Edit at Wikipedia…
Notice that this is linked out to Wikipedia for you to read the entire article and that it is even possible to edit the article at Wikipedia. We do not grab the entire article for a compound. We grab only the beginning of the article and display this with a link to the original article. This dramatically reduces the work we would have to do if we hosted all of the Wikipedia Chemistry articles, since we would need to stay updated with changes to all of the articles. Too much work.
What does WiChempedia offer to chemists and to Wikipedians interested in Chemistry? It offers structure and substructure searching of Wikipedia and access to a LOT more supporting information. For example, for this record you can see publications, spectra, safety/tox information etc. We are expanding on information for Wikipedians, not just showing Wikipedia records again.
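The excerpt approach described above (show only the beginning of the article plus a link back) can be sketched in a few lines of Python. This only illustrates the idea and is not how ChemSpider actually does it. The function name is hypothetical, and it assumes the article text is already available as plain text with paragraphs separated by blank lines.

```python
def make_excerpt(article_text, article_url):
    """Keep only the first non-empty paragraph and a link to the full article.

    Assumes plain text with paragraphs separated by blank lines; the real
    article markup would need cleaning first.
    """
    paragraphs = [p.strip() for p in article_text.split("\n\n") if p.strip()]
    first_paragraph = paragraphs[0] if paragraphs else ""
    return {"excerpt": first_paragraph, "read_more": article_url}


# Illustrative input only; the text is shortened for the example.
sample_text = (
    "Quinacrine is a drug with a number of different medical applications.\n\n"
    "A later paragraph that is not copied.\n\n"
    "Another later paragraph."
)
record = make_excerpt(sample_text, "https://en.wikipedia.org/wiki/Quinacrine")
print(record["excerpt"])
print("Read more:", record["read_more"])
```

Because only the opening paragraph is stored, most edits to the rest of the article never require an update on our side.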
Another online resource tapping into Wikipedia Chemistry is Chempedia from Rich Apodaca. This is not to be confused with the OTHER ChemPedia. (People ask why we use weird names like ChemSpider and ChemMantis – try finding something NOT claimed on the web already!) Rich has taken a similar approach to accessing the Wikipedia monographs for display as detailed here. Using Chempedia is simple…a Google-like entry page where you enter a name. Entering Quinacrine provides the same Wikipedia text as on ChemSpider (it’s a short article), an image of the structure, the InChI and the MW. A comparison between the ChemBox on Wikipedia and Chempedia is shown below.
Rich has done a lot more work on Chempedia to integrate information from Wikipedia than we have done with ChemSpider. For Quinacrine, for example, Rich includes this information about the latest edits and who made them.
7 edits since May 19, 2008. Last edited Jul 10, 2008 by Lightbot (31).
Rich is working on structure and substructure searching at present, I believe.
There is confusion, I believe, about WiChempedia (our own approach), Chempedia and Wikipedia chemistry. I saw this this week, which confirmed my belief. The way it reads, over 3000 people have contributed to Chempedia as of Nov 2008. I was interested to know whether this was true.
Chempedia
Chempedia is different from most other chemical databases in that its textual content is created and updated in real-time by a large and diverse community of volunteers worldwide through Wikipedia. This means every one of Chempedia’s compound monographs can be changed and adapted by you. And if you find a Compound Monograph is missing from Chempedia, you can create it and make it available for others to use. More then 3200 have contributed to this Wiki site as of Nov. 2008.
http://chempedia.com/
I clicked on the contributors link for Chempedia and saw that there were indeed 3207 contributors. However, just looking at page 1 we see that for MOST of the people listed there are no listed monographs and no contributions. In fact, for the first 10 people listed there were 4 contributions. Maybe there is some historical issue here? Maybe current contributions only include those made this year. Not sure. There are pages where there is only one structure.
I was interested to see whether I was listed as a contributor since I have contributed to Wikipedia but not to Chempedia directly. As I clean data on ChemSpider I’ll make edits to Wikipedia. There is benefit to moving information from both Chempedia and ChemSpider back to Wikipedia, and Rich Apodaca and I both contribute to Wikipedia. It says on the contributors list that I have contributed 15 times. I’m not sure what that means, maybe contributions to ChemBoxes, but I have left a lot of comments on Wikipedia. I think contributions must be edits…not sure.
Relative to the comments “This means every one of Chempedia’s compound monographs can be changed and adapted by you. And if you find a Compound Monograph is missing from Chempedia, you can create it and make it available for others to use.”, ChemPedia directs the user to Wikipedia to write an article and then links to it. The process is simple. When a search is done and an article doesn’t exist, you get the response:

Suggestions:

  • Re-check your CAS number, monograph title, PubChem CID, or structure.
  • Remove keywords. Chempedia does not yet perform keyword searches.
  • If your article doesn’t exist on Wikipedia, create it. You can then add it to Chempedia.
If you click on create you go to Wikipedia to write the article. Then you link it back to Chempedia here.
This is great…users are directed to help Wikipedia and everyone wins. When the article is written ChemSpider will pick up the link too and we’ll all be integrated. We haven’t introduced that onto ChemSpider…it’s a good idea though. Should we?
All is actually made clear here on the About ChemPedia page…“Chempedia is different from most other chemical databases in that its textual content is created and updated in real-time by a large and diverse community of volunteers worldwide through Wikipedia. This means every one of Chempedia’s compound monographs can be changed and adapted by you. And if you find a Compound Monograph is missing from Chempedia, you can create it and make it available for others to use.”
There are some examples of structures on ChemPedia not on Wikipedia yet (see below…is that the correct structure? It came from PubChem but I don’t know) and the same situation is true for ChemSpider. Eventually we will have systems in place to exchange such information on the fly.
Chempedia and WiChempedia are serving a valuable purpose. We are both dependent on the contributors to Wikipedia and are indebted to them!

There has been a conversation going on over on Wikipedia about supporting ChemSpider IDs in the ChemBox and DrugBox. ChemSpider IDs have been added to ChemBoxes over the past few weeks by a number of contributors and, based on the blessing of members of the Wikipedia community, they will now be displayed in Drugboxes also. The conclusion of the conversation today stated:

 Done Thanks everyone – that seems clarification that people would find this helpful and, in particular, thanks for addressing ChemSpiderMan own reservation. I’ve added to {{drugbox}}, eg see Verapamil. David Ruben Talk 13:10, 22 September 2008 (UTC)

A Drugbox, for Xanax, is shown below. Note the number of outlinks to PubChem, Drugbank and now ChemSpider. 

 

What I am most proud of is some of the statements made in the discussion that validate our efforts to create a high-quality curated source of information. For example:

“I’d just like to add my voice to those that find value in linking Wikipedia articles to ChemSpider. I find this database to be reliable and information-rich in comparison to the other databases we link to already. I support adding a link from drugboxes and chemboxes. – Ed (Edgar181) 11:36, 22 September 2008 (UTC)”

“I think the effect of linking to ChemSpider would be to marry a well curated database (ChemSpider) with monographs (WP). To elaborate, the database contain various intrinsic properties (MW, isotopic composition, structure, stereo), experimentally-determined properties (bp/mp/appearance), experimentally-determined spectra (1H/13C NMR, IR, etc., e.g. [1]), apart from predicted data. Monographs: our articles discussing the synth, applications, chemistry, etc. of various compounds, drug-, drug-like, or otherwise. Seems like everything to gain and not much to lose, except for another entry in the drugbox and perhaps concerns of table creep. –Rifleman 82 (talk) 03:46, 22 September 2008 (UTC)”

“I personally support the addition of ChemSpider not because of the predicted properties—which are included in PubChem—, but because, so far, ChemSpider appears to be highly curated (and transparently so). PubChem has some serious if relatively infrequent reliability issues, which are well known to the WP chemistry/pharm community, and MeSH (to which CAS numbers in the Drugbox link) appears to lack information on many compounds. Fvasconcellos (t·c) 01:53, 22 September 2008 (UTC)”

It is validating to be embraced by the Wikipedia community in this way. What we commit to in return is to continue our efforts to expand the services and quality on ChemSpider. And presently we are working on a “little gift” to help Wikipedia. Watch this space.

I recently started a discussion with the users of ChemSpider about how they use our system. There have already been two responses and I am hoping for more. Having sat in on an IUPAC InChI meeting in Washington last week, I can honestly say that it was one of the most functional and on-task meetings I have sat in on in a long time. Decisions were made about how to move forward with the next release of the InChIKey and “standard versions” of both the InChIString and InChIKey.

The meeting has prompted the question: how do you use InChI? For what purpose do you use InChI and do you use only the string? Do you use it for communication purposes and structure exchange? Do you use it in your internal databases? Is it a primary path to deduplication? What settings do you use for the InChIString?

I’m interested in how you are using InChI and how important it has become for you. Comments welcomed.
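To make the deduplication question concrete, here is a minimal sketch in Python of using the InChI string as a grouping key. It assumes every string was generated with the same InChI settings (otherwise identical structures may not match), and the record IDs and function names are hypothetical. The strings themselves are treated as opaque text, so no InChI software is called.

```python
from collections import defaultdict


def group_by_inchi(records):
    """Group record IDs by their InChI string.

    records: iterable of (record_id, inchi_string) pairs (hypothetical format).
    """
    groups = defaultdict(list)
    for record_id, inchi in records:
        groups[inchi].append(record_id)
    return dict(groups)


def find_duplicates(records):
    """Return only the InChI strings shared by more than one record."""
    return {inchi: ids for inchi, ids in group_by_inchi(records).items() if len(ids) > 1}


# Record IDs are invented for the example.
records = [
    ("rec-1", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),  # ethanol
    ("rec-2", "InChI=1S/CH4O/c1-2/h2H,1H3"),         # methanol
    ("rec-3", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),  # ethanol again
]

duplicates = find_duplicates(records)
print(duplicates)
```

Whether an exact string match is the right test depends on the settings used to generate the strings, which is why the settings question matters.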

We have put in place a simple way to associate a chemical compound in a single record view with an external data source. We made this a general solution but did it specifically to enable connections to be made quickly between new Wikipedia records and records on ChemSpider. We have become very experienced with the validation of data on both Wikipedia and ChemSpider over the past few months, so when we find new records on Wikipedia that are not already connected to ChemSpider we clean and validate structures on ChemSpider while validating the compounds on Wikipedia. Then, when we are convinced of the validity of the compounds, we connect them. While it may take a long time to validate the data, associating the Wikipedia and ChemSpider records takes just a few seconds.

We have now established “Wikipedia on ChemSpider” so that Wikipedia can be searched by structure and substructure. We believe that people may be more likely to use this over WiChempedia but we will see.

The process for linking Data Sources directly to a record view is described in this Technical Note. We welcome feedback on the document in case it is difficult to follow.

I’ve blogged previously about us adding safety and toxicity data to ChemSpider. We are busily sourcing new information from other data sources to add information and in the past couple of days we have added NIOSH data as it is a rich source of additional safety information. For example, the record for 1,2,3-trichloropropane shows:

  • First Aid: Eye: Irrigate immediately Skin: Soap wash Breathing: Respiratory support Swallow: Medical attention immediately

  • Exposure Routes: inhalation, skin absorption, ingestion, skin and/or eye contact

  • Symptoms: Irritation eyes, nose, throat; central nervous system depression; in animals: liver, kidney injury; [potential occupational carcinogen]

  • Target Organs: Eyes, skin, respiratory system, central nervous system, liver, kidneys Cancer Site [in animals: forestomach, liver & mammary gland cancer]

  • Incompatibilities and Reactivities: Chemically-active metals, strong caustics & oxidizers

  • Personal protection and Sanitation: Skin: Prevent skin contact Eyes: Prevent eye contact Wash skin: When contaminated Remove: When wet or contaminated Change: No recommendation Provide: Eyewash, Quick drench

Some additional examples are here: Temefos, Warfarin and Allyl Alcohol. Note that each of these also has a coincident extract from Wikipedia. We are therefore integrating Wikipedia articles, safety, toxicity, experimental and predicted properties. Our plan for semanticising and integrating the chemistry web is clearly well underway.

I blogged previously about our intention to build a structure/substructure searchable version of Wikipedia. We declared we would call it WiChempedia. Since rolling out the new website we have had the ability to provide access to subsets of data (see Molecule of the Day and Molbank as two examples). With this newfound ability it became easier to roll out WiChempedia and the first version is now available at www.wichempedia.org.

The difference between ChemSpider and WiChempedia, for now, is the presence of the first paragraph of the Wikipedia text on the WiChempedia site and a link out to the original article on Wikipedia. An example is shown below. Notice the link to the GNU Free Documentation License.
wichempedia1.png

Hopefully we will receive feedback on the site quite quickly and get it out of beta at speed, so please do let your colleagues know about it. We will design a new logo header shortly, and we are aware that some minor errors in the data resulting from the scraping process have slipped in, so we will resolve those too. An example of how much information is starting to be populated can be seen by looking at the record for Cocaine here. Here you will see the Wiki first paragraph content, a link out to a GC run on the Phenomenex website, a series of validated identifiers and an IR spectrum. The content continues to expand as we source more information.

I also point you to another implementation of a Wikipedia chemistry system, chempedia.net, that you might be interested in reviewing.