ChemSpider IS polluted with interesting identifiers associated with chemical structures, and I have blogged many times about our efforts to clean it up. I’ve also suggested that systems such as ChemSpider, and there are many, need an easy way for users to provide feedback, and we have done this as discussed here. All of us hosting such large data collections deal with these issues. Today I found a classic though. A search on a CAS Number brought me to this page:

estrone1.png

The information seems fair enough but the list of names is quite amusing:

estrone2.png

These might be a new form of “International Name”. We have had disasters just like this on our own site. Over the weekend a user told me that one of our structures had over 70,000 identifiers! We looked at it. It was the ONLY structure in the database with more than 300 identifiers, and this one user found it. We’ve cleaned it out now. Hosting services like this is a lot of fun :-)
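For anyone curious how such an outlier could be spotted before a user stumbles on it, here is a minimal Python sketch. It is not our actual code: the function name, the layout of the input as (structure ID, identifier) pairs, and the default threshold of 300 are all assumptions made for illustration.

```python
from collections import Counter


def flag_overloaded_structures(pairs, threshold=300):
    """Return {structure_id: count} for structures with more than
    `threshold` distinct identifiers.

    `pairs` is a hypothetical iterable of (structure_id, identifier)
    tuples, e.g. rows exported from a names table.
    """
    # Drop duplicate rows so a repeated name is counted only once.
    unique_pairs = set(pairs)
    # Count distinct identifiers per structure.
    counts = Counter(structure_id for structure_id, _ in unique_pairs)
    # Keep only the structures above the threshold.
    return {sid: n for sid, n in counts.items() if n > threshold}
```

Anything the function returns is only a candidate for manual review; a high count says nothing by itself about which names are wrong.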

2 Responses to “What Non-Curation of Names Gets You”

  1. Joerg Kurt Wegner says:

    Hahaha *ROFL*, well, looks like a Babel-Molecule to me, maybe it's a genetic variant of the Babel-Fish?
    http://en.wikipedia.org/wiki/Babel_fish

  2. Antony Williams says:

    Well, I did a search on the number 42 on our database and came up with a long list of registry numbers containing 42. I was expecting some compound with the name Zaphod, or Beeblebrox, maybe Dolphin? The closest I could get was Trillin (remember Trillian… http://www.bbc.co.uk/cult/hitchhikers/guide/trillian.shtml )