I have given a number of talks about ChemSpider over the past few months and generally comment that “ChemSpider hosts almost 21.5 million unique chemical entities from over 200 data sources.” As of today it is over 21.5 million chemical entities. We have deposited data from a number of new contributors of late, many of them smaller chemical vendors such as Bridge Organics and ExtraSynthese. However, we recently crossed the 21.5 million mark because we have started to take advantage of the eMolecules dataset, which is made available as a download. There are over 5 million structures in the dataset.

Many of these, but not all, deduplicate onto records already in the ChemSpider database. The 21.5 millionth structure links to this record on eMolecules as shown below.


When the data are added to ChemSpider we automatically add SMILES, InChIs, molecular weight (MW), molecular formula (MF) and a series of predicted physicochemical properties for the structures that are new from eMolecules. In many cases, however, eMolecules is simply one more data source among many, and information such as spectra, Wikipedia links and experimental data is all integrated. In this case, though, eMolecules can help you source a vendor for the material, as is their strength.
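To make the idea of deduplication concrete, here is a minimal sketch in Python. It assumes structures are matched on their InChI string, which this post does not confirm as ChemSpider's actual procedure, and the names `Record`, `deposit` and the source label `"ExistingSource"` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field

# Standard InChI strings for two simple compounds, used as sample data.
CYCLOHEXANE = "InChI=1S/C6H12/c1-2-4-6-5-3-1/h1-6H2"
BENZENE = "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"


@dataclass
class Record:
    """One unique chemical entity and the data sources that list it."""
    inchi: str  # assumed deduplication key
    sources: set = field(default_factory=set)


def deposit(database: dict, inchi: str, source: str) -> bool:
    """Add one structure; return True only if it is new to the database."""
    record = database.get(inchi)
    if record is None:
        # Unseen structure: create a new record for it.
        database[inchi] = Record(inchi, {source})
        return True
    # Known structure: it deduplicates onto the existing record,
    # which simply gains one more data source.
    record.sources.add(source)
    return False


# A database that already holds cyclohexane from some earlier source.
database = {}
deposit(database, CYCLOHEXANE, "ExistingSource")

# A small incoming batch standing in for the eMolecules download.
batch = [(CYCLOHEXANE, "eMolecules"), (BENZENE, "eMolecules")]
new_count = sum(deposit(database, inchi, src) for inchi, src in batch)

print(f"{new_count} new structure(s), {len(database)} unique entities in total")
```

In this toy batch only benzene adds to the count of unique entities; cyclohexane is merged into its existing record, which now lists eMolecules as an additional source.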

2 Responses to “21.5 Million unique chemical entities and growing”

  1. Craig James says:

    We’re pleased that your users now have access to our structures.

    One suggestion: The “cgi-bin/more” URL that you are using does not work correctly with the 2D data set that you downloaded. The eMolecules.com IDs are for the “parent” structures (the structures without salts or solvates), but all of the data are associated with specific “version” structures (the specific compounds from the original SD file).

    We’d like to suggest that you change the URL in your link to use “cgi-bin/search?id=nnnnnn”. For example, http://www.emolecules.com/cgi-bin/search?id=483009 will find all of the structures associated with the parent cyclohexane.

    Best regards,

  2. Antony Williams says:

    Craig, thanks for the pointer. The IDs have been changed and the links provide much richer information now. Thanks for the guidance, much appreciated!
