The first part of the first build of the Open Chemistry Web project is now available for viewing and testing here.

It lacks the advanced and substructure search capabilities at the moment, but these are well on the way. Currently, it more closely resembles the old text search, and we have actually been asked to preserve that metadata format, although there are some changes already implemented.

These include an effort to clarify metadata by standardising citation data to (or as closely as possible to) Journal name, Year, Volume, Issue, and page (all explicitly stated).
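As a rough illustration of what such a standardised record could look like, here is a minimal sketch. The `Citation` class, its field names, and the sample values are hypothetical and are not taken from the actual index; the only thing carried over from the text above is the explicit Journal, Year, Volume, Issue, Pages layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical container for the five explicitly stated citation fields."""
    journal: str
    year: int
    volume: str
    issue: str
    pages: str

    def display(self) -> str:
        # Label every field explicitly so the reader never has to guess
        # which number is the volume and which is the issue.
        return (
            f"Journal: {self.journal}, Year: {self.year}, "
            f"Volume: {self.volume}, Issue: {self.issue}, Pages: {self.pages}"
        )


# Made-up example values, for illustration only.
example = Citation(
    journal="Example Journal of Chemistry",
    year=2007,
    volume="12",
    issue="3",
    pages="45-67",
)
print(example.display())
```

Labelling each field costs a few extra characters per result but keeps the citation unambiguous, which matters more here than matching any one journal's house style.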

The idea behind this is that people will take citations from the primary source (not ChemRefer), so citations in search results only need to be as clear and readable to the user as possible.

Soon to be implemented (for some publishers) is Digital Object Identifier linking, so far only at the request of the publisher. Search engines periodically refresh all their links anyway, so the link permanency issues that apply to databases (which DOI solves) do not apply here, and so there is no policy on this at the moment.
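For readers unfamiliar with DOI linking, a DOI can be turned into a persistent link by appending it to the public resolver at `https://doi.org/`. The sketch below shows the idea; the helper name `doi_to_url` and the sample DOI are made up for illustration and say nothing about how the links are actually generated on this site.

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Build a resolver link from a DOI string (hypothetical helper)."""
    cleaned = doi.strip()
    # Accept the common "doi:" prefix and drop it.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Percent-encode unusual characters but keep the "/" that separates
    # the DOI prefix from its suffix.
    return "https://doi.org/" + quote(cleaned, safe="/")


# Made-up DOI, for illustration only.
link = doi_to_url("doi:10.1234/example.5678")
print(link)
```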

There will be a SIMPLE user interface. One text box, one applet on one page (preferably with very little else). We want to be addictively usable and deliver useful search results quickly. We do not want to build some all-singing-all-dancing and yet overly complex system that no-one without a Masters in cheminformatics will ever be able to decipher.

There are around 150,000 articles in the new index, compared with ~50,000 in ChemRefer’s index of 12 months ago. Around half are open access (meaning you can download the full work in its entirety for free), and the full text of every article has been indexed to maximise the depth of the search (so even if you cannot access the full text for free, you are still searching the full text).
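To show why indexing the full text deepens the search, here is a toy inverted index: every word in an article body points back to the articles containing it, so a query can match terms that never appear in the title or citation. This is a simplified sketch and not the indexer actually used; the function names, article IDs, and sample texts are all invented.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(articles: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of article IDs whose full text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for article_id, text in articles.items():
        for word in tokenize(text):
            index[word].add(article_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the IDs of articles containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Invented sample data, for illustration only.
articles = {
    "a1": "Synthesis of caffeine derivatives",
    "a2": "Caffeine analysis by HPLC",
    "a3": "Protein folding kinetics",
}
index = build_index(articles)
print(search(index, "caffeine hplc"))
```

A real search engine would add ranking, stemming, and phrase matching on top of this, but the lookup structure is the part that makes full-text search fast.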

There is an enormous analytical and life sciences bias at the moment but these are often the most searched for chemical topics on the web due to their scope and importance.

For general interest, ChemRefer differs in structure from ChemSpider in that it is a search engine, not a database. That means:

- ChemSpider exists as a website: you can link to it, bookmark it, etc. Its purpose is to refer you to useful and curated resources but also to provide information on the web resource.

- ChemRefer is just a searchable index. You cannot link to ChemRefer (unless you want to link to constantly changing search result pages). Its purpose is to get you off the website and to the useful primary source. Articles and metadata are spidered but this is dynamic so can hardly be described as curation. Systems have been set up to allow the curation of chemical structures from this raw full text index into ChemSpider in an accurate way but also quickly (luckily Tony Williams is a human Xerox). ChemRefer also now serves not just as a full text indexer, but also to mass harvest chemical data from selected web resources and deliver it to ChemSpider.

So the robot is often used to deliver data for curation so that it can be processed, and not necessarily (as I initially assumed) just to be fed into the name-to-structure conversion software.

Any and all feedback welcome.


One Response to “Work in Progress”

  1. ChemSpider Blog » Blog Archive » ChemSpider Enables Direct Integration to IUCR Articles says:

    [...] recent rollout of the first level of integration and Will Griffiths has started a discussion at the Open Chemistry Web blogpage. Will has played a key role in facilitating our relationships with publishers in the Open Access [...]
