I blogged previously about our intention to build a structure/substructure searchable version of Wikipedia. We declared we would call it WiChempedia. Since rolling out the new website we have had the ability to provide access to subsets of data (see Molecule of the Day and Molbank as two examples). With this newfound ability it became easier to roll out WiChempedia, and the first version is now available at www.wichempedia.org.

The difference between ChemSpider and WiChempedia, for now, is the presence of the first paragraph of the Wikipedia text on the WiChempedia site and a link out to the original article on Wikipedia. An example is shown below. Notice the link to the GNU Free Documentation License.

Hopefully we will receive feedback on the site quite quickly and get it out of beta at speed, so please do let your colleagues know about it. We will design a new logo header shortly, and we are aware that some minor typos in the data resulting from the scraping process have slipped in, so we will resolve those too. An example of how much information is starting to be populated can be seen by looking at the record for Cocaine here. Here you will see the Wiki first paragraph content, a link out to a GC run on the Phenomenex website, a series of validated identifiers and an IR spectrum. The content continues to expand as we source more information.

I also point you to another implementation of a Wikipedia chemistry system, chempedia.net, that you might be interested in reviewing.

  1. Egon Willighagen says:

    Thanx for the GFDL in the section bar… would be nice to have such license indication (maybe in smaller form) on all sections…


  2. Antony Williams says:

    I believe that the appropriate Creative Commons license works for all fields except for the Predicted PhysChem properties: http://creativecommons.org/licenses/by-sa/3.0/us/

    What do you think? I will approach the providers of the software tools regarding their PhysChem properties and think that their answer will likely allow me to use this same license also. Comments?

  3. Egon Willighagen says:

    Antony, it’s not so much about having open data, though I’d appreciate such a step, of course.

    But, what I particularly liked about having the license in the bar, is that it is now clear what I am allowed to do with the website content. As a reader of Peter’s blog, you know that such unclarity makes progress in the field difficult; that is, being unsure if you may or may not use, is causing more trouble than not being allowed to use certain bits of data.

  4. Antony Williams says:

    Yes, I can only agree with your comments about the unclarity. I think you watched the many exchanges about the CrystalEye data (http://www.chemspider.com/blog/struggling-to-scrape-crystaleye.html). My opinion on that is there’s less value in the Open Data label if you can’t actually grab the data very easily. There were a lot of exchanges about how an archive or available dump would be available but we consumed many hours having to scrape it then. Then there’s the issue of ACS not yet answering the question about whether the data Peter has on CrystalEye can be declared open or not (http://www.chemspider.com/blog/intention-to-scrape-crystaleye-content-and-staying-in-relationship-with-publishers.html#comment-8942).

    As you may recall I was asked to meet with ACS Pubs in New Orleans with a 5 month delay (http://www.chemspider.com/blog/why-we-cant-publish-scraped-crystaleye-data-yetand-science-commons-declare-a-protocol-for-implementing-open-access-data.html). I did so. I sat with them for about half an hour and answered a series of questions about ChemSpider. If anyone would like details, contact me offline. When I asked the question about scraping CIF data and whether it was open, it was a “We don’t have an answer for you yet” scenario. So, I have to wonder whether the Open Data label is valid…so yes, that is VERY unclear to me!

    I’ve looked at the Metware schema (http://metware.wiki.sourceforge.net/). Looks good. I’m hoping we can connect to you even in the early stages. Please include us in your plans. Will the data be Open or how will it be licensed? Thanks

