Archive for November 5th, 2008

If you perform curation activities on ChemSpider, you have likely noticed the ability to mark a new type of identifier: a shorthand formula. We enabled this because it has become clear that shorthand formulae could be a useful part of document markup in our ChemMantis system. For example, consider the article excerpt shown below.

In the excerpt you can see a number of highlighted terms. All of them are shorthand formulae, and they are recognized through a lookup dictionary rather than through name-to-structure conversion algorithms. Each of these terms is linked to ChemSpider for direct lookup of the information associated with the chemical.

The list of shorthand formulae extracted from a couple of hundred articles is actually only a couple of hundred formulae at present. It includes the most obvious compounds: CH3OH, MeOH, CH3CN, MeCN, CH3COOH, NaCl, NaF, NaCN, KBr, KCl and so on. All of these are immediately interpretable by chemists. There are likely a few more to be found over the coming months, but in the past week of reviewing articles from various sources we have added only a couple of new formulae.

We have also seen value in linking up ions and elements as appropriate. We are likely to add filters to display or hide elements and ions, since we're of the opinion that displaying every occurrence of an element in an article is of little value. Just imagine how many times you might see the word carbon or hydrogen in an article: carbon-carbon bonds, hydrogen bonding and so on. So, we're switching them off by default.

We'll keep reporting on how we are improving ChemMantis. Based on the review of a stack of articles, the system has improved dramatically. We are asking for your articles now. Combining shorthand formulae and chemical name markup will highlight a document as shown below.
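As an aside for readers curious about the mechanics, the dictionary lookup described in this post can be sketched in a few lines of Python. This is only an illustration and not the ChemMantis code: the `SHORTHAND` table, the `ELEMENTS` set, the `find_terms` function and its `show_elements` flag are all hypothetical names made up for this example, and a real system would link each hit to a ChemSpider record rather than to a plain name.

```python
import re

# Hypothetical lookup dictionary: shorthand formula -> compound name.
# A real dictionary would hold a couple of hundred entries.
SHORTHAND = {
    "CH3OH": "methanol",
    "MeOH": "methanol",
    "CH3CN": "acetonitrile",
    "MeCN": "acetonitrile",
    "CH3COOH": "acetic acid",
    "NaCl": "sodium chloride",
    "NaF": "sodium fluoride",
    "NaCN": "sodium cyanide",
    "KBr": "potassium bromide",
    "KCl": "potassium chloride",
}

# Hypothetical, deliberately tiny list of element names.
ELEMENTS = {"carbon", "hydrogen"}

# Runs of letters and digits; punctuation and hyphens act as separators.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def find_terms(text, show_elements=False):
    """Return (offset, matched text, name) for every dictionary hit.

    Shorthand formulae are matched case-sensitively. Element names are
    reported only when show_elements is True, so they are off by default.
    """
    hits = []
    for match in TOKEN.finditer(text):
        word = match.group()
        if word in SHORTHAND:
            hits.append((match.start(), word, SHORTHAND[word]))
        elif show_elements and word.lower() in ELEMENTS:
            hits.append((match.start(), word, word.lower()))
    return hits


sample = "The salt NaCl was dissolved in MeOH; carbon-carbon bonds were unchanged."
print([word for _, word, _ in find_terms(sample)])
# ['NaCl', 'MeOH']
print([word for _, word, _ in find_terms(sample, show_elements=True)])
# ['NaCl', 'MeOH', 'carbon', 'carbon']
```

With the element filter left at its default, the sample sentence yields only the two shorthand formulae. Turning the filter on adds both occurrences of "carbon", which shows how quickly element hits can crowd out the more useful ones.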
