ChemSpider has taken some thrashing over the past year. We’ve been hit on science (and proven our point many times), on Open Access versus Free Access statements, and on whether or not we have Open Data. There has been encouragement to define the data on our site in terms of Open Data. We’ve adopted Open Data tags on data deposited by users after pressure there. When I’ve asked more about Open Data I have heard that it is not ratified at the same level as Creative Commons licenses, and that those would be better to use. A week ago we put up Creative Commons Licenses in what I hoped was a GOOD move for the ChemSpider site, one that would relax the criticism of our site and potentially earn our critics’ blessing and support.

We received a blessing for all of 72 hours. In his blog post Peter Murray-Rust was DELIGHTED with our decision to do this. I quote: “I am DELIGHTED to report that Chemspider has adopted a CC-SA licence for its data.” and espoused “PMR: This is wonderful. As far as I know Chemspider is the only commercial chemical information company offering data under this licence, which is completely compatible with the Open Knowledge Definition. (It is also BBB-compliant, though data and publications are different animals).”

I assumed therefore we’d done a good thing. There was no indication to me that our position was anything other than positive.

There has been a conversation going on in the blogosphere for a couple of weeks now about Strong and Weak Open Access. I’ve read, watched and simply let others share their opinions because they’ve been in Open Access discussions for a number of years and have more context, background and passion to stay engaged in these discussions. They ARE important discussions and will come to a conclusion.

It appears that “I” am confused by Creative Commons licenses. This is based on the fact that for 72 hours we had done a good thing and got a blessing, but 3 days later I read yet another post, this time with a comment from John Wilbanks stating “I’d like to see a meaningful discussion of the risks of Share Alike and Attribution on data integration. Chemspider’s move to CC BY SA fits into this discussion nicely – it’s a total violation of the open data protocol we laid out at SC, which says “Don’t Use CC Licenses on Data” – but it does conform inside the broader OKD.”

Uh-oh. ChemSpider is in Total Violation of Creative Commons Licenses. As we say in Wales in times of distress … “Hell’s Bells” (My dad was a builder..if you believe he taught me to curse like that well….)

Peter followed it up with a comment “PMR: I agree with John. Licences are not appropriate for data (and when I applauded Chemspider it was for the motivation rather than the actual mechanism – CC-SA is conformant to the OK definition, but difficult to operate for re-use). That’s why we use the OKF’s OpenData sticker on CrystalEye.” Hmmm…

Again, when I’ve asked about the OpenData sticker I’ve been informed that this is not yet ratified.

There have been many discussions about Openness I’ve been involved with..just one example here. It has been difficult. Openness and licensing remains confusing…see here an example and this is just about a blogsite!

So the question is what now? Do we remove Creative Commons Licenses? Do we adopt Open Data licenses, or do we just get ourselves out of the middle of this entire confusing discussion until all is resolved and settled? And IF we remove CC licenses and don’t post other licenses I know we’ll get criticized for that too. But let’s be honest…we’ve been highlighted for NOT having licenses up to this point. Now we are highlighted FOR having them. Maybe we can hope that no press is bad press. I’ll await feedback on this post and make a decision about what to do in the next 48 hours. Blog away…


9 Responses to “It Appears ChemSpider Does BAD by Using Creative Commons Licenses”

  1. Egon Willighagen says:

    Antony, I have written down my take on this discussion in my blog:

    Bottom line: I applaud your intention, but am also happy with the choice of license. We picked the MIT license for the BODR, which does not have the ‘viral’ aspect of the CC, but I am happy if someone releases something as GPL too. I do not see a fundamental reason why we should treat data differently from other kinds of knowledge, viz. algorithms.

  2. Unilever Centre for Molecular Informatics, Cambridge - petermr’s blog » Blog Archive » I am still DELIGHTED with Chemspider says:

    [...] is a comment I posted on the ChemSpider blog, one of two I tried to post. I’m cross posting here to make sure it’s public. Make sure to [...]

  3. Rich Apodaca says:

    A lot of Open Source projects get sidetracked in license issues and never recover. But the license rarely matters in the end.

    Good or bad only matter as applied to the quality of the service being created – Open Source has many examples. Linux is a great product with a crappy license.

    A crappy product with a great license is still crappy. A great product with a crappy license is still great.

    Ultimately, the license has little to do with why people really care about great products.

    You’ll find few complaining about Google’s terms of use, despite the fact that they’re ‘just’ re-processing the hard work of others.

    The only issue to keep in mind is that facts are not copyrightable and therefore not licensable. So discussions about the license applied to facts are totally pointless. Discussions about licensing the _expression_ of facts might be more productive.

    And discussions about what separates a fact from a copyrightable piece of work in science might be even more useful still.

  4. David Bradley says:

    I’m not chipping in on the actual discussion; you’re definitely stuck between a material high on the Mohs scale and a geological mineral deposit, it seems! But “Hell’s Bells”, I’d not heard that phrase since before I left home, which is well over two decades past, and it brought back a flood of memories. My dad worked in catering (and subsequently civil engineering) so had plenty of opportunities to exercise his right to explete ;-)


  5. David Bradley says:

    One thing I will say is that if every freelance writer/artist/whatever had to spend as much time mulling over contracts/licenses/waivers/indemnity clauses as the lawyers would like us to do, very few freelance efforts would ever see the light of day…


  6. will says:

    I think this licence issue is a bit of a distraction wrt removing access barriers to science.

    Many publishers have been providing free access to tens/hundreds of thousands of articles for years e.g. Natl Acad Sci USA.

    Yet, this seems to go unnoticed because they concentrate on publishing and don’t want to weaken their copyright (an issue which, to me, has nothing to do with access anyway).

    It may not be the case that weakened copyright strengthens open access because it may remove the motivation to publish the work in the first place.

  7. The Open Data licensing issue : business|bytes|genes|molecules says:

    [...] Williams announcing CC support for data on ChemSpider. That was followed by a chain of events and a ton of confusion. Let me add my voice to this debate, since Open Data is near and dear to my [...]

  8. Chris Singleton says:

    So will this affect data depositions at all, since a lot of us usually deposit as open data? I prefer that anything I deposit be available for anyone to use with few restrictions. Will this still be the case?

    And as a side note, my mother used to use the phrase Hell’s Bells. I’m from the southern U.S., so no idea where that came from in my case!

  9. Egon Willighagen says:

    Chris, yes, anyone can use the data for screening etc. It gets more complicated, however, if and only if you start mixing ChemSpider data (or CC-SA-BY data) with data licensed differently. That’s something quite common in software, though, and people have got used to that.

    The SC suggests solving those data aggregation problems by waiving all rights, such as attribution and share-alike. The upside of that is that querying a database becomes much easier, because one does not have to worry about finding all the licenses involved, or about license incompatibilities when combining things into a query result.

    So, as long as data users keep up with the share-alike and attribution, one has every right to use, redistribute and change the data.
