Copyright©2008 Antony Williams
Over the past few months there have been multiple exchanges regarding our wish to integrate CrystalEye into ChemSpider, and the reader is referred to those earlier conversations (1,2,3). One of those discussions resulted in us trying to get approval, permission or simply an understanding from the ACS as to whether the Supplementary Data from their articles are Open Data. Despite a number of emails to the ACS copyright team, two phone conversations and a face-to-face discussion with members of ACS Publications at the Spring ACS meeting, there is not yet any clarity.
We had already started to scrape the CrystalEye content but had a few struggles, detailed here. However, a few weeks ago Jim Downing from the Unilever School of Informatics at the University of Cambridge dropped me a note telling me that the CrystalEye collection was available for us to download, and we welcomed the opportunity to take a look. After a few days of intense work the data are now available on ChemSpider and, as we had always promised, linked back to the CrystalEye pages. We are NOT linking back to the original articles but, rather, taking advantage of the fact that the data are declared as “Open Data” and are therefore free of any copyright issues from the publishers. All we are doing is indexing structures and URLs from CrystalEye.
We had to make some enhancements to the system to support the CrystalEye dataset. Specifically, since MANY of the structures are organometallics and the mol/SDF file format commonly struggles with such molecules, we decided to download the CML and, when a normal 2D structure depiction could not be rendered, to display the CML file inside the Jmol applet. This takes advantage of our new feature for displaying the molecule in the tabbed structure view, described here. Next we have to deal with RSS depositions from CrystalEye and some known teething problems with the deposition. (See if you can spot them and let us know. We are presently awaiting feedback from the CrystalEye team about whether they see any issues with our deposition.) Our stats tell us that the CrystalEye deposition added over 40,000 new chemical compounds to the ChemSpider database. This is a nice addition to the database and an additional connection to the ChemSpider web, allowing people to find crystal structures connected to chemical compounds in the database.
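For readers curious about the display fallback described above, here is a rough sketch of the decision only. It is not ChemSpider's actual code: the `StructureRecord` type, its field names and the `choose_view` function are hypothetical, invented purely for illustration, and the sketch assumes a record carries a molfile only when the 2D conversion succeeded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StructureRecord:
    """Hypothetical record for one deposited CrystalEye entry."""
    crystaleye_url: str              # link back to the CrystalEye page
    cml: str                         # CML text downloaded from CrystalEye
    molfile: Optional[str] = None    # mol/SDF text, present only if 2D conversion worked


def choose_view(record: StructureRecord) -> dict:
    """Prefer a 2D depiction; fall back to showing the CML in a 3D viewer."""
    if record.molfile and record.molfile.strip():
        # A usable molfile exists, so a normal 2D depiction can be drawn.
        return {"viewer": "2d", "data": record.molfile, "source": record.crystaleye_url}
    # No usable molfile (common for organometallics): hand the CML to the Jmol tab.
    return {"viewer": "jmol", "data": record.cml, "source": record.crystaleye_url}
```

Either way the record keeps its CrystalEye URL, so the link back to the source page is preserved whichever viewer is used.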
We thank the CrystalEye team of Nick Day, Jim Downing and Peter Murray-Rust (and any others involved whom we don’t know) for providing access to the data. We hope that this deposition drives attention and traffic to the work of CrystalEye.