Archive for April, 2008

I have blogged on this before, but I think it is important to emphasise that you CAN spider PubMed Central. They even provide their own utilities designed specifically for the mass downloading of articles, in the form of an OAI feed. What you cannot do is spider the article URLs directly (you must use the XML feed): this is forbidden in robots.txt, and you will be blocked for doing so.

PubMed Central is one of the most innovative and open chemistry resources on the web with fantastic metadata and article retrieval tool sets designed to facilitate (not prevent) the spread of chemical information at no cost.
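For anyone wanting to try the OAI route rather than crawling article pages, here is a minimal sketch of how harvesting over OAI-PMH generally works: you request a page of records with the `ListRecords` verb, then keep following the `resumptionToken` until none is returned. The endpoint URL, the `oai_dc` metadata prefix and the helper names below are assumptions for illustration only, so check PubMed Central's own OAI documentation for the current address and supported formats. The sample XML is made up so the parsing can be tried offline.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 standard.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Assumption: illustrative endpoint only; confirm the current URL with PMC.
BASE_URL = "https://www.ncbi.nlm.nih.gov/pmc/oai/oai.cgi"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH a resumptionToken is sent on its own with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response used only to exercise the parser without network access.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_page(SAMPLE)
    print(ids, token)
    print(list_records_url(BASE_URL, resumption_token=token))
```

A real harvester would fetch each URL, pause politely between requests, and stop when `parse_page` returns no token.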

ChemSpider will index HighWire-hosted journal texts and link back to them, and structure records linking to their content will be deposited here as well. HighWire will be indexed in accordance with their robots.txt file (the conventional web publishing standard for stating indexing permissions).

From the website:

“HighWire-hosted publishers have collectively made 1,873,044 articles free” [and with their partner publishers] “produce 71 of the 200 most-frequently-cited journals.”

We would like to thank them for one of the most phenomenal academic publishing indexing/structure deposition permissions we have received. We expect it will greatly enhance the discoverability of their partner publishers’ works through our free cheminformatics and text search.
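Indexing "in accordance with robots.txt" simply means checking each URL against the site's published rules before fetching it. The sketch below shows one way to do that with Python's standard `urllib.robotparser`. The rules, the user-agent string and the example.org URLs are hypothetical; they are not HighWire's actual robots.txt, and this is not a description of how ChemSpider's own indexer is built.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; a real crawler would download
# the site's own /robots.txt instead.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /articles/
Allow: /oai/
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, user_agent, url):
    """True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS)
    for url in ("https://example.org/articles/123",
                "https://example.org/oai/feed"):
        print(url, allowed(rules, "ExampleIndexer", url))
```

Against a live site you would call `set_url(...)` and `read()` on the parser to load the real file, then skip any URL for which `can_fetch` returns False.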