Archive for the copyright Category

One of the surprises when indexing the huge array of literature available on the web is that many major names, the ones associated with the traditional closed model, turn out to be far and away the biggest contributors of open access works (defined here as works that can be downloaded in their entirety free of charge and without other barriers, such as a login that requires substantial personal information).

American Society for Biochemistry and Molecular Biology (100,000+ free articles)

Royal Society of Chemistry (70,000+ free articles) – trawled, but not yet added to lit search.

National Academy of Sciences of the USA (50,000+ free articles)

The observation is that around 99% of the open access works in chemistry indexed by ChemSpider are supported financially by the subscription model, and we can suppose that the open access works in turn support subscriptions by attracting readers who do not yet subscribe.

As we see above, this is not theory: it has been happening for years. It is a real-world, material contribution to openness in chemistry that, crazily, has not attracted any attention in the blogosphere as far as I can tell.

There is a continued focus on relabelling data produced by others as “open data” – but this data has already been labelled and licensed by the original producer, so relabelling it could be misleading. I’ve always thought that building searchable indices that link back to the source, as the major search engines do, is the best way to build a resource through which users can discover works and where data producers are not undercut.

Some of the richest sources of chemical information are research group websites. Some time ago, I indexed primary literature PDFs from many such websites into the legacy (now non-existent) ChemRefer index.

I then received this correspondence from a major publisher. I submitted it to Chilling Effects to see what the various legal ins and outs of all of this meant.

Please read the letter and then the rest of this post.

It is worth pointing out here that the publisher may well have been right, but there is no way to confirm this since I am not (and should not be) able to access author-publisher contracts.

In any case, the result was that I stopped linking to research group website PDFs (the “just in case” approach). Was that the best course of action? Comments welcome.

Having blogged on this before, I think it is important to emphasise that you CAN spider PubMed Central. They even provide their own utilities, in the form of an OAI feed, designed specifically for the mass downloading of articles. What you cannot do is spider the article URLs directly (you must use the XML), because this is forbidden in robots.txt and you will be blocked on this basis.
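As a rough illustration of the distinction, the sketch below builds a harvesting request from the standard OAI-PMH arguments (`verb`, `metadataPrefix`, `set`) and checks URLs against robots.txt rules using Python's `urllib.robotparser`. The base URL, set name, sample rules and crawler name are all placeholders made up for the example, not PubMed Central's actual values; their documentation gives the real endpoint.

```python
from urllib.parse import urlencode
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration only; not PubMed Central's real endpoint or rules.
OAI_BASE_URL = "https://repository.example.org/oai"
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /articles/
"""


def build_oai_request(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL from standard protocol arguments."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


harvest_url = build_oai_request(OAI_BASE_URL, set_spec="chemistry")
article_url = "https://repository.example.org/articles/12345/"

print(harvest_url)
print(is_allowed(SAMPLE_ROBOTS_TXT, "example-indexer", harvest_url))
print(is_allowed(SAMPLE_ROBOTS_TXT, "example-indexer", article_url))
```

With these sample rules the feed URL is permitted and the direct article URL is not, which is the situation described above.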

PubMed Central is one of the most innovative and open chemistry resources on the web with fantastic metadata and article retrieval tool sets designed to facilitate (not prevent) the spread of chemical information at no cost.

I read this post on whether DOI is a good identifier or not. My feeling is that it has the following weaknesses:

It cannot (normally) be generated from citation information (a big disadvantage for an identifier) – you have to look it up at e.g. CrossRef. This kills it as a way to communicate articles effectively.

If you want to resolve lots of them, you have to pay (there is no real value in this, except that they have the identifiers and you do not).

It does not replace the URL; it is simply a redirect. This makes it hard to bookmark: those unfamiliar with the system who think they have bookmarked the DOI have in fact bookmarked the URL it redirected to.

Also, publishers have to pay for it too (though it's possible they may receive money from CrossRef too). Essentially, all they are paying for is an unintuitive link that does not break provided they keep the redirect up to date.
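To make the redirect point concrete, here is a minimal sketch of how a DOI link is formed: the resolver address followed by the identifier. The DOI shown is made up for the example. Note that nothing in the resulting string comes from the citation (journal, year, volume, page), so you need to know the identifier already.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_link(doi):
    """Return the resolver link for a DOI; the resolver redirects to the publisher's current URL."""
    # Percent-encode characters that are unsafe in a URL path, keeping "/" as is.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical DOI, made up for the example.
doi_link = doi_to_link("10.1234/example.5678")
print(doi_link)
```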

Hence OpenURL.

It creates a persistent link as DOI does, except that it actually exists as a webpage (it is not a redirect) and can therefore be bookmarked easily, and it CAN be generated from citation information without permissions. Here is a useful implementation.

A note on the CrossRef website caught my eye. It states that OpenURL is not competitive with DOI. This, of course, is nonsense (since it addresses link permanency). Apparently:

“An OpenURL link that contains a DOI is similarly persistent.” [as a link]

Why would an OpenURL pointing to a publisher website not be persistent without a DOI? OpenURL can be created with citation data so it is TOTALLY persistent. With DOI, you need to fill in a form at CrossRef or doi.org, which you do not need to do with OpenURL.

It is DOIs that need third party ‘resolving’, not URLs and especially not OpenURLs which require no link up to a database (a restricted one in the case of CrossRef) for generation.
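As a minimal sketch of generating a link purely from citation data, the function below assembles an OpenURL using the key/value format of the OpenURL 1.0 standard (Z39.88-2004). The resolver address and the citation are made up for the example, and individual publishers' OpenURL implementations may expect different or simpler parameters, so treat the key names as an assumption to check against the resolver you are targeting.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute the base URL of a real OpenURL resolver.
RESOLVER_BASE = "https://resolver.example.org/openurl"


def build_openurl(journal, volume, start_page, year, author_last=None, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 (Z39.88-2004) key/value link for a journal article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.jtitle", journal),
        ("rft.volume", str(volume)),
        ("rft.spage", str(start_page)),
        ("rft.date", str(year)),
    ]
    if author_last:
        params.append(("rft.aulast", author_last))
    # urlencode percent-encodes the values, so spaces and colons are handled for us.
    return base + "?" + urlencode(params)


# Made-up citation: no lookup in any database is needed to produce the link.
link = build_openurl("Journal of Examples", 12, 345, 2007, author_last="Smith")
print(link)
```

Everything in the link is derived from the journal title, volume, page and year, which is exactly the information a reader already has in a reference list.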

So, it is a shame that only a few publishers have taken it up. Surely, it is a competitive advantage to use a totally freely available URL structure that anyone can generate? After all, the worst that could happen is that someone might find your articles more easily.

We have been requested to remove all RSC articles from the ChemRefer Index.
The articles in question, from 1997-2004, are marked as ‘Free access’ and, being indexable according to the robots.txt file, formed the basis of the current indexing. The RSC are unhappy at the way their articles have been presented and linked to in our search results, and consider that the additional intended reuse of the indexed information in ChemSpider without permission violates the terms of use.

RSC will reconsider the indexing policy for ChemRefer if requested changes are made to the search results, and we are presently in discussions with the RSC to identify and carry out these modifications. All RSC articles will be de-indexed from ChemRefer during the next indexing cycle.