October 2nd, 2007
Is There Interest in Seeing ChemSpider Link to CrystalEye?
Posted by: Antony Williams in Uncategorized
Copyright ©2007 Antony Williams
CrystalEye is a project running at the University of Cambridge/Unilever School of Informatics and the screenshot below is self-explanatory.
Based on what I have seen, Nick Day is doing a fine job. As of September 2007 there are over 100,000 CIF files aggregated already. The data are all labeled as Open Data, so we could index the related 2D structures and link out to CrystalEye. Are our users interested in us linking up?

October 3rd, 2007 at 3:33 pm
Yes, there is very much interest. Chemoinformatics is not able to cover 99% of chemistry, likely more in the range of
October 5th, 2007 at 8:04 am
Egon’s comment was truncated for some reason…he summarized in an email…
1. Yes, make the link. 3D coordinates are enormously important chem props. Maybe even copy/paste into ChemSpider.
2. Are you already using the CrystalEye CMLRSS feed to enrich ChemSpider with new molecules from crystal structures? If not, please do.
October 12th, 2007 at 8:57 pm
I agree that 3D coordinates are pretty vital, and if there is another information-rich chemical database, I’m all for linking and consolidating knowledge.
October 14th, 2007 at 11:02 pm
I have asked Peter tonight where to find the CrystalEye data to download:
http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=696#comment-58286
October 15th, 2007 at 2:44 am
[...] if they are Open Data, as marked at the CrystalEye website, and seeing as though people would like to access the data via ChemSpider, we should just be able to download. But, we don’t want all the data..we just want the [...]
October 26th, 2007 at 11:44 am
This email was sent to the editors of JACS on 10/26/2007:
Colleagues,
Please excuse addressing to colleagues but as seen at http://pubs.acs.org/journals/jacsat/editors.html this email goes to many parties.
I am the host of ChemSpider, an online resource for chemists. http://www.chemspider.com. For an overview of what we are doing please visit: http://www.chemspider.com/docs/ChemSpider_Overview_SLides_August_2007.pdf
I am presently considering utilizing the data from the CrystalEye online database as I have outlined here: http://www.chemspider.com/blog/?p=191
The CrystalEye database is run from the University of Cambridge by Professor Murray-Rust. I have looked at the sources of data populated on the database and see that there are a number of ACS journals represented there, including JACS. Please see http://wwmm.ch.cam.ac.uk/crystaleye/summary/index.html
I am seeking confirmation that if we scrape the data from the CrystalEye database and load it onto ChemSpider, we will not be breaking any copyrights. I have put the question to Peter here: http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=737#comment-62799 and he has answered. I am now seeking your confirmation that it is appropriate for me to access the data, since it is marked as Open Data at Peter’s site. I welcome your comments. Thank you
October 26th, 2007 at 9:45 pm
In order to keep this effort documented for all, here is an exchange from PMR’s blog:
ChemSpiderMan Says:
October 26th, 2007 at 12:01 am
Peter, I asked previously about how to obtain an SDF file of the structures on CrystalEye so that we could link to CrystalEye records via ChemSpider. This was based on my question to the community at
http://www.chemspider.com/blog/?p=191
Your comment was that the data was Open but that an SDF was not available and we should scrape the data. I was looking at this possibility today. I was pleasantly surprised to see a number of the journals listed included ACS journals and Elsevier journals (http://wwmm.ch.cam.ac.uk/crystaleye/summary/index.html). There has been a lot of traffic of late about their Open Access policies but now I see that they are supporting your Open Data efforts. This is excellent. I would like confirmation that they are aware of the Open Data posted from their journals before we scrape them. Are they aware? I want to make sure I am respecting all parties. Thanks
pm286 Says:
October 26th, 2007 at 7:54 am
(1) All data come from Free sources – i.e. visible without a subscription. Some journals (Acta Crystallographica and RSC for example) do not copyright the data. Others like ACS add copyright notices. It is our contention, and Elsevier has agreed for its own material, that facts are not copyrightable. We have therefore extracted and transformed facts and mounted these. Where the original material (CIF) does not carry copyright we mount it on our pages – where it does we do not, but we have the transformed data. In those cases it would be possible to recreate the original CIF data in semantic form, but not the exact typographical layout, which contains meaningless whitespace.
I am not aware that ACS or Elsevier have ever made statements of any kind about our Open Data efforts.
You may scrape anything, but you must honour the source and the metadata and you should add the Open Data sticker. If you scrape the link (simplest) you may simply point to our site. If you scrape more data you should ensure that the integrity of the data is maintained and that, if it is re-used, the re-used data still clearly show our metadata.