…. and it did. But not quite in the way that Cambridge had imagined. Over the last few weeks around 200,000 articles from contributing publishers have been added to ChemSpider’s literature search (as ChemRefer is now styled), though even this is not in the final form which we imagine.

Another 40,000 articles or so are following next week as this resource grows. The indexer is running hot 24 hours a day, seven days a week. Tens of thousands more articles will follow after that. On top of that, we now have the capability to index text from image PDFs (many journal articles are still in this form), which also opens up the possibility of users sending in scanned images of their data-rich documents as a form of submission of chemical information to ChemSpider.

The main issue now is not having the time/resources to index everything we have permission for. We have still barely scratched the surface of Highwire, for instance, and adding updates from the resources we already index is not yet implemented properly. But these are nice problems to have.

When we do have the critical mass of text journal articles indexed, the “cited in” feature can be implemented and we can open up the chemical names from the indexed content for downloading and curation by the ChemSpider community… and that’s when things get really interesting.

We are still on track, with just scant resources, to create a community-curated cheminformatics-text search that we hope will eventually gain unstoppable momentum thanks to our community backing. Mozilla Firefox competes with Microsoft’s Internet Explorer because it has user and developer community backing, and that is worth considering as a role model for ChemSpider and the chemistry world as a whole.

The turnaround that has occurred in terms of the interest in having published materials text-indexed is highly significant in the long run, since thousands of references will pour into ChemSpider structure records to enhance the usefulness of the database.

These, of course, will be free for anyone to download, so they will make a material contribution to the openness of chemical data (which is what I want Open Chemistry Web to be all about), as opposed to talking about definitions/licenses/copyrights and other such distractions (as I see them) surrounding open access and open data.

3 Responses to ““Chemrefer could disappear tomorrow””

  1. ChemSpider Blog » Blog Archive » Will ChemSpider go the same way as ChemRefer says:

    [...] Will Griffiths has posted at Open Chemistry Web a post entitled “Chemrefer could disappear tomorrow“. [...]

  2. Joerg Kurt Wegenr says:

    What is the status of ‘in citations’? I found some information via
    http://www.chemspider.com/open-chemistry-web/citations.html
    but I guess that some things have happened since then.

    I tried looking for ‘tmc114’ and was astonished that the search field in the literature search has no autocompletion, but the structure search has. So, the text field can almost connect already ;-)

    BTW, is there any way of adding tags or keywords to articles? Potentially this might cause spamming, but the same argument could be made for any social service, and it can be curated anyway. In the long term this might add many use cases, e.g. keyword matching against structures.

    Finally, if a structure entry contains a DOI or PMID, is the system linking this to the already indexed articles or double-checking against PubMed, CiteULike, or Connotea?

  3. will says:

    Arg, sorry for the late reply/moderation of your comment.

    This was not implemented at the time because we did not index widely enough for it to be at all useful. However, recent developments have pushed us over the quarter-million articles indexed mark and I think/hope to deploy this soon.

    Text searching has no autocomplete, and maybe this should be deployed too. But, unlike with structures, it is not as clear what a searcher may be referring to when typing in terms (as the vocabulary/context is so much more uncertain). We should look at this.

    Tagging articles is a curious idea. ChemSpider has developed spam filters to deal with this for structure record curation, so why not for articles?

    Finally, the ultimate goal is to have joint structure-text searching, which would solve the issue of associating articles with structures, be it by DOI, PMID or URL.

    A feature to add articles to services like CiteULike/Connotea is also a great idea.
