A few months ago I met with Adam Azman in Chapel Hill to discuss how the names in our ChemSpider database could be used to expand his Chemical Dictionary. It seemed that we were sitting on a treasure trove of name fragments that could help him in his efforts. So we supplied Adam with 1.3 million identifiers, and he has worked for the last few months to generate his Chemical Dictionary. He extracted over 100,000 name fragments from our collection, as he has described in his blog post here.

Extracted from Adam’s blog is his so-called Administrivia: “The dictionary is licensed under the Creative Commons Attribution 3.0 License. … The dictionary is compatible for Microsoft Office (Windows or Mac), and Open Office (Windows or Linux). The install file includes instructions for upgrading old versions and installing it for the first time. The dictionary should be useful for all chemists. However, I am an organic chemist. Thus, the dictionary was created from an organic chemist’s mindset. It will probably be most useful for organic chemists.”

Adam has explained in detail how he did the work. I encourage you to read his post to fully understand the nature of the work and how much heavy lifting he actually did. It’s been a pleasure to help Adam and the community by supplying our own form of a “dictionary” to him for his particular treatment. It took a few hours of work on our side and months of hard work on his. I encourage you to take advantage of his efforts. If you are a chemist, this is a real gift for the season. The dictionary can be downloaded from our site here.

Now I want you to consider timing. We are working hard on our ChemMantis project, a system for entity extraction and document markup. Part of this work is the generation of dictionaries for finding chemical names. We’ve already expanded our chemical dictionary using the database of identifiers from ChemSpider, but those of you working with other systems, such as OSCAR3 or the commercial markup systems that depend on chemical dictionaries, will likely find Adam’s contribution significant. Enjoy.

5 Responses to “A Chemical Dictionary from Adam Azman with Help from ChemSpider”

  1. Chemistry Blog » Blog Archive » Chemistry Dictionary for Word Processors - Version 2.0 says:

    [...] Through David, I was introduced to Antony Williams from chemspider.com. I met with him one afternoon in February, and he agreed to release his database of 1.3 million identifiers for me to integrate into the next upgrade. (Update: read Tony’s writeup here) [...]

  2. azmanam says:

    Thanks for the link and the kind words. And for all your help.

  3. Joerg Kurt Wegner says:

    Is it possible to also provide a ChemSpider identifier list for the very same chemical name list? This could be a fantastic starting point for Word, OpenOffice, Google Docs, or any other kind of mash-up with chemical content.

  4. Antony Williams says:

    The majority of the dictionary is made up of chemical name tokens, not full names, but some WOULD have associated CSIDs. I’ll discuss with Adam.

  5. Joerg Kurt Wegner says:

    I guess in this case a substructure search identifier would be better ;-)
