Archive for the Uncategorized Category

A peek at who we are, how we run the site, and how we manage data quality.

What is ChemSpider and who runs the service?

ChemSpider is one of the largest chemical databases in the world, containing data on over 65 million chemical structures. This data is freely available to the public on the ChemSpider website, which is published by the Royal Society of Chemistry.

How does the Royal Society of Chemistry support ChemSpider?

ChemSpider is an independent service that does not rely on direct or research grant funding. The Royal Society of Chemistry supports the website using the surplus generated by our publishing activities, allowing us to provide a sustainable and reliable service. We also generate revenue from advertising and by providing paid-for web services, such as our APIs, for non-academic users. These activities help keep ChemSpider financially sustainable and help support our server costs, staff hours and development.

These services enable us to make the site available free to anyone in the world, and we reached over six million unique users in 2017. These users range from school students looking for help with their homework, to researchers working in academia and industry, to general users who want to keep their chemical knowledge up to date. They come from every continent except Antarctica, and just about every country on Earth.

What goes into ChemSpider?

ChemSpider data comes from the chemical sciences community itself – submitted by researchers, databases, publishers, chemical vendors and many more.

We have two main inclusion criteria for ChemSpider data:

  1. Machine readability – Depositors must provide structures in a machine-readable format, typically a .mol file that is interpretable by InChI – the open-source chemical structure representation algorithm.

The .mol format describes how a compound is arranged, atom-by-atom and bond-by-bond. This means that it can only accurately depict small molecules with defined structures. For ChemSpider, “small” means structures up to 4000 daltons, including short peptides, oligonucleotides, and other structures. Large proteins, extended crystal lattices or long nucleotides are too big to describe sensibly in ChemSpider, but are available from other databases suited for larger molecules.

We also only accept ‘defined structures’ – compounds with exact chain lengths, fully expressed functional groups, and integer bond orders – due to the requirement to describe every heavy atom in a molecule. This means we can only accept structures for which we can generate a valid InChI.

Most ChemSpider structures are organic molecules. However, we do accept some inorganic and organometallic compounds, with specific methods for curating these.

  2. Real compounds – We do not accept virtual or prophetic compounds.

As far as possible, we only accept compounds that have been synthesised or isolated in physical form. This means we do not accept transition states, theoretically predicted compounds, virtual compounds from vendors or prophetic compounds from patents.

Who are our data sources?

We have received data from almost 250 unique data sources, including data from chemical vendors, specialist databases, individuals, research groups and publishers. These sources cross the breadth of the chemical sciences – including biochemistry, pharmacology and toxicology, natural products, spectroscopy and crystallography. Each ChemSpider record includes links to all of the data sources for the compound, enabling users to find and to check the provenance of the data.

Our data source list is continually changing, as we find new sources of data to add and remove outdated or low-quality data sources.

We no longer accept data from other data aggregators. We have taken this step to match our quality requirements with other databases and reduce the propagation of algorithmically generated errors that can arise from prophetic sources. One example of this is Chessboardane, which originated from an optical structure recognition program interpreting a data table contained within a patent as a chemical structure. The result was an 81-carbon grid structure, erroneously identified as a complex cyclic alkane, which was deposited in a public repository and shared between multiple aggregators.

Because of this, we only seek data directly from the original sources, where we have greater certainty about the data’s provenance and accuracy, and are working to curate legacy data still within ChemSpider.

Because of examples like Chessboardane, we are cautious about accepting data from text-and-data-mined sources that depositors have programmatically extracted from text or encoded images in patents or scientific literature. After review, we have added some of the highest-quality data-mined sources. We will continue to review potential new data-mined sources on a case-by-case basis to ensure that their data meet our quality standards.

Automated filters

A manual check of every one of the 65 million records in ChemSpider would take an individual more than 600 years to complete working round the clock – even if we only invested five minutes of curation time per record.
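The 600-year figure can be checked with a quick back-of-the-envelope calculation. The sketch below uses only the two numbers quoted above (65 million records, five minutes each) and assumes a 365-day year with no breaks; the variable names are illustrative.

```python
# Back-of-the-envelope check of the manual curation estimate.
RECORDS = 65_000_000        # records in ChemSpider, as quoted above
MINUTES_PER_RECORD = 5      # curation time per record, as quoted above

# Assumption: working round the clock, 365-day years, leap years ignored.
MINUTES_PER_YEAR = 60 * 24 * 365

total_minutes = RECORDS * MINUTES_PER_RECORD
years = total_minutes / MINUTES_PER_YEAR

print(f"{years:.0f} years")  # prints "618 years"
```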

Instead, we run each deposition through a series of automated filters to pick out unsuitable structures, such as those with incorrect valences, unbalanced charges, or missing stereochemistry. In addition to structure filters, we also apply basic name and synonym filtering and regularly review the processed files so that we can improve our filters.

We have provided a simplified overview of this process below, and will provide a more detailed description of our filters in a separate blog post:


Curation by ChemSpider staff

ChemSpider is run by a small team of full-time curators, who work to add new compounds, remove errors, and respond to user feedback. Our staff have extensive experience of both chemical data and practical chemistry, with backgrounds in fields such as organic synthesis and art conservation, and a wealth of experience working on other Royal Society of Chemistry databases, such as The Merck Index* Online and Analytical Abstracts.

Community curation

Because we cannot review every record ourselves, we really appreciate comments or corrections from our users. The easiest way to help us improve ChemSpider is to leave feedback or email us when you spot an error. We try to act on user feedback within a few days – sooner for simpler queries. Please let us know if you find an error by leaving a comment on the relevant ChemSpider record, or by emailing us.

Users wishing to get more involved can directly deposit structures and curate synonyms related to their research or work, without having to email the ChemSpider team.

We are extremely grateful for all the contributions our community curators have made over the years.

Keep using and contributing to ChemSpider

To access information on over 65 million chemical structures, go to the ChemSpider website, which is fully searchable by structure, name, or advanced query, from any device, anywhere, for free.

To deposit data, tell us about an error, become a curator, or for any other query, please do not hesitate to email us.

*The name THE MERCK INDEX is owned by Merck Sharp & Dohme Corp., a subsidiary of Merck & Co., Inc., Whitehouse Station, N.J., U.S.A., and is licensed to The Royal Society of Chemistry for use in the U.S.A. and Canada.

Following on from our previous blog post about extracting chemical structures (as mol files) from their crystal structures (CIF files) in the RSC archive using OpenBabel, it transpired that the Crystallography Open Database (COD) was conducting a similar project to extract the chemical connectivity (in SMILES format) from its large collection of openly accessible CIF files using OpenBabel. This opened the possibility of linking ChemSpider to COD (and vice versa) by comparing these SMILES with ChemSpider structures, and has resulted in 34,768 new links being made, each with a corresponding CIF in ChemSpider.
ChemSpider-COD linking example
At the beginning of February there were 262,817 CIFs in COD, of which 78,473 had been converted into SMILES (numbers which have been increasing daily since then). We downloaded these SMILES and performed webservice structure searches of ChemSpider on them all using the StructureSearch operation of the ChemSpider Search webservice. Those SMILES which were not currently in ChemSpider were converted into mol files using OpenEye and reviewed by a ChemSpider curator with a view to depositing the suitable structures into ChemSpider as new compounds. The curation meant that we have been able to provide feedback to COD about SMILES that look suspicious and as if there may have been a problem with the conversion process – for example charge and radical issues, undefined stereochemistry for sugars, missing stereochemistry and the duplication of molecules or fragments within the same CIF. Since ChemSpider is primarily a collection of small organic molecules, many of the large number of metallorganic complexes were omitted simply because they weren’t within our scope.
After the deposition of the suitable new compounds, we identified 34,768 ChemSpider compounds which corresponded to COD crystal structures. These links have been added in the “Datasources” infobox under the “Spectral Data” tab, and the corresponding CIF added to ChemSpider so that it will show in the “CIFs” infobox with a link to the relevant COD webpage.
An example compound that has been linked to COD is Ibuprofen (ChemSpider ID 3544). The reciprocal links are due to be added to COD shortly.
We would like to thank Miguel Quirós Olozábal (COD) for his help and cooperation with this project.

We previously reported an initial proof of concept of an “Insert from ChemSpider” TinyMCE plugin which was integrated with Southampton University’s ELN LabTrove to add compound images from ChemSpider to posts. We are pleased to announce version 2 of that plugin, which now allows a bench chemist who is planning or reporting a reaction to construct a stoichiometry table of the chemicals used and produced in it, as shown at the bottom of this post. Constructing one of these tables manually can be a tedious and error-prone task, but now when a LabTrove user is writing up an experiment post about a reaction, they can click on the “Insert from ChemSpider” TinyMCE plugin button in the editor, which guides them through the task and retrieves compound properties from ChemSpider so that they don’t need to draw out the compound, or type in its name, molecular weight or formula mass. The amount of each substance can be expressed in a number of different ways – ratio (equivalence), number of moles, mass or volume (depending on the compound state and reaction role). The user only needs to enter one of these properties and it is automatically inter-converted into the other relevant properties (including calculating equivalents). The product yields are also calculated (as the percentage of the amount actually recorded compared to the amount calculated from the ratio of the product and the amount of the limiting reactant).
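The inter-conversions described above are simple arithmetic, and a minimal sketch may make them concrete. This is not the widget's own code: the function names are hypothetical, and the numbers are taken from the example stoichiometry table at the bottom of this post (MW 196, 0.00500 mol of limiting reactant at 7.95 g/L, and 0.00456 mol of product recorded).

```python
def mass_from_moles(moles, molecular_weight):
    """Mass in g, from amount in mol and molecular weight in g/mol."""
    return moles * molecular_weight


def moles_from_mass(mass, molecular_weight):
    """Amount in mol, from mass in g and molecular weight in g/mol."""
    return mass / molecular_weight


def volume_from_mass(mass, concentration):
    """Solution volume in L, from solute mass in g and concentration in g/L."""
    return mass / concentration


def volume_from_moles(moles, molarity):
    """Solution volume in L, from amount in mol and molarity in mol/L."""
    return moles / molarity


def equivalents(moles, limiting_moles):
    """Ratio of a substance's amount to that of the limiting reactant."""
    return moles / limiting_moles


def percent_yield(actual_moles, limiting_moles, product_ratio=1.0):
    """Actual amount as a percentage of the amount expected from the
    product ratio and the amount of the limiting reactant."""
    theoretical_moles = limiting_moles * product_ratio
    return 100.0 * actual_moles / theoretical_moles


# Limiting reactant: only the number of moles is entered by the user.
limiting_moles = 0.00500
limiting_mass = mass_from_moles(limiting_moles, 196.0)
limiting_volume = volume_from_mass(limiting_mass, 7.95)
print(f"mass {limiting_mass:.3f} g, volume {limiting_volume:.3f} L")

# A second reactant entered as 0.0140 mol, and a product recorded as 0.00456 mol.
print(f"equivalents {equivalents(0.0140, limiting_moles):.2f}")
print(f"yield {percent_yield(0.00456, limiting_moles):.1f} %")
```

Entering any one of moles, mass or volume is enough because each conversion is a single multiplication or division by a property already known for the compound (molecular weight, concentration or molarity).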

Another feature is that it is possible to construct advanced stoichiometry tables which are initially created during a planning stage (during which planned amounts of reactants and products are entered and calculated), but with a separate column to add actual amounts of reactants and products at a later stage.

This functionality was used as part of the intern project to share compound and reaction data in LabTrove and ChemSpider to create an example reaction page in LabTrove. The top of that page was made using the ChemSpider Reactions template, and the table at the bottom with the “Insert from ChemSpider” plugin.

A demonstration video showing how to use this new functionality is shown below:

The TinyMCE plugin relies on the new ChemSpider “Edit Stoichiometry Table” jQuery widget, which contains all of the functionality behind it. The widget can be used independently of LabTrove, for example in a ChemSpider widget example page, and as such can be easily integrated into different ELNs and websites. We will also be using it in conjunction with the imminent ChemSpider Reactions platform to allow stoichiometry table data to be uploaded and hosted there with other reaction data. To allow the widget to be flexible and used by different applications for different purposes, after a stoichiometry table has been created, the widget allows it to be retrieved as pure HTML, as a JSON string (which can be accepted as an input option for the widget to display and edit an existing stoichiometry table), or as an HTML table with the JSON embedded in it as a data attribute. The latter option allows a stoichiometry table to be added to a webpage with the option to edit it at a later point.
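To illustrate the third output option, the sketch below shows how a JSON description of a table can travel inside an HTML table as a data attribute and be recovered later for editing. It is written in Python purely for illustration; the attribute name `data-stoichiometry` and the shape of the JSON are assumptions, not the widget's actual format.

```python
import html
import json
from html.parser import HTMLParser

ATTRIBUTE = "data-stoichiometry"  # hypothetical attribute name


def table_with_embedded_json(table):
    """Render a display table and embed the full JSON as a data attribute."""
    rows = "".join(
        f"<tr><td>{html.escape(s['name'])}</td><td>{s['moles']}</td></tr>"
        for s in table["substances"]
    )
    payload = html.escape(json.dumps(table), quote=True)
    return f'<table {ATTRIBUTE}="{payload}">{rows}</table>'


class _PayloadExtractor(HTMLParser):
    """Collects the embedded JSON string from the first matching table tag."""

    def __init__(self):
        super().__init__()
        self.payload = None

    def handle_starttag(self, tag, attrs):
        if tag == "table" and self.payload is None:
            # The parser has already unescaped the attribute value.
            self.payload = dict(attrs).get(ATTRIBUTE)


def extract_json(markup):
    """Recover the original table description from the rendered HTML."""
    parser = _PayloadExtractor()
    parser.feed(markup)
    return json.loads(parser.payload)


example = {"substances": [{"name": "t-BuLi", "moles": 0.008}]}
markup = table_with_embedded_json(example)
print(extract_json(markup) == example)
```

Keeping the structured data next to the rendered table means a page can display the table as ordinary HTML while an editor can still reload exactly what was entered.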

This is the first version of the edit stoichiometry table widget, and it will be used and tested and revised accordingly. Further developments are also planned for the “Insert from ChemSpider” TinyMCE plugin used in LabTrove.

If you would like to integrate the ChemSpider “Edit Stoichiometry Table” widget with your website, web-based ELN, TinyMCE editor (e.g. WordPress), or LabTrove installation to use and test it and provide us with feedback, then please contact us.

Output of ChemSpider “Edit Stoichiometry Table” widget: Stoichiometry Table of Substances Used/Produced

5-Iodo-1-pentene
  Compound Information: ChemSpider ID: 9162369; Formula: C5H9I; MW: 196
  Substance Information: Role: limiting reactant; State: solution; Concentration: 7.95 g/L; Solvent: THF
  Planned Amounts: Ratio: 1.00; Amount: 0.00500 Moles*; Mass: 0.980 g; Volume: 0.123 L
  Actual Amounts: Ratio: 1.00; Amount: 0.00500 Moles*; Mass: 0.980 g; Volume: 0.123 L

3,4-Dihydro-2H-pyran
  Compound Information: ChemSpider ID: 7789; Formula: C5H8O; MW: 84.1
  Substance Information: Role: reactant; State: solution; Molarity: 11.79 Moles/L; Solvent: THF
  Planned Amounts: Ratio: 2.80; Amount: 0.0140 Moles*; Mass: 1.18 g; Volume: 0.00119 L
  Actual Amounts: Ratio: 2.80; Amount: 0.00750 Moles*; Mass: 0.631 g; Volume: 0.000636 L

t-BuLi
  Compound Information: ChemSpider ID: 10254347; Formula: C4H9Li; MW: 64.1
  Substance Information: Role: reactant; State: solution; Molarity: 1.7 Moles/L
  Planned Amounts: Ratio: 1.60; Amount: 0.00800 Moles*; Mass: 0.512 g; Volume: 0.00471 L
  Actual Amounts: Ratio: 1.60; Amount: 0.00600 Moles*; Mass: 0.384 g; Volume: 0.00353 L

6-(4-Penten-1-yl)-3,4-dihydro-2H-pyran
  Compound Information: ChemSpider ID: 29341335; Formula: C10H16O; MW: 152
  Substance Information: Role: product; State: liquid; Purity: 100%
  Planned Amounts: Ratio: 1.00*; Amount: 0.00500 Moles; Mass: 0.761 g
  Actual Amounts: Ratio: 1.00; Amount: 0.00456 Moles*; Mass: 0.694 g; Yield: 91.2 %

(* indicates entered value)

We have just uploaded three new short video tutorials walking you through how to search, comment on, and submit ChemSpider Synthetic Pages procedures.

An introduction to searching ChemSpider Synthetic Pages

Reading and commenting on ChemSpider Synthetic Pages articles

Learn how you can share your work on ChemSpider Synthetic Pages

We welcome any feedback, as well as suggestions for topics for new help videos.

This summer there have been a number of students from the University of Southampton doing internships on joint projects between the university and the Royal Society of Chemistry and ChemSpider. Three of these students have been sifting through theses from past members of Richard Whitby’s research group in order to extract the compound, spectra and reaction data in them (and in linked lab notebooks and archive spectra files) and share these in LabTrove, ChemSpider, and CSSP. The students – Alex Hartke, Yet Wai Lee and Josh Whittam (all 2nd year undergraduates) – are shown below together with the boxes of thesis data, lab notebooks and spectra printouts that they digitised.
Southampton University Interns
Between them they digitised 7 theses, by A. Henderson, L. Sayer, D. Owen, D. Macfarlane, F. Giustiniano, G. Saluste and J. Stec, which resulted in 1035 LabTrove pages being published to the Whitby Group’s LabTrove blog.

The theses were a rich source of compound information – including compound structures, names, properties and spectra, all of which were also deposited into ChemSpider resulting in 208 new compound pages, and about 600 spectra.

For this project the students manually deposited the compound information into LabTrove and then deposited the compounds and spectra to ChemSpider. However, we are currently developing a range of ChemSpider jQuery widgets which can be integrated into web-based ELNs such as LabTrove. These will make it easier to enter compound information from ChemSpider into experiments, and also to publish compound and reaction data from the ELNs to ChemSpider, CSSP and ChemSpider Reactions. This will follow on from the initial proof of concept to retrieve ChemSpider information and enter it into LabTrove pages.

With this long-term aim in view, the LabTrove pages in which the interns stored the compound and reaction data were structured using LabTrove templates, and this structuring will make it easier for publishing widgets to understand the data and process it in the correct way. In this way, the project was partly a test to ensure that the templates were suitable for storing compound data in LabTrove. As well as the ChemSpider compound and associated data template (with corresponding help page), templates were also written to store reaction data in a formatted way, since the theses were primarily focused on the synthesis of compounds. At their simplest, basic reaction data can be stored in LabTrove using the ChemSpider Reactions template (and corresponding help page), and eventually posts written in this format will be easily publishable to ChemSpider Reactions. More detailed reaction data can be stored using the ChemSpider SyntheticPages style reaction template (and corresponding help page). The initial aim was to deposit all of this reaction data into ChemSpider SyntheticPages, but it became clear that it was difficult for anyone other than the researcher who conducted the reaction, or their supervisor, to supply the necessary level of detail for CSSP submissions, and that this level of detail could not easily be reached by retrospectively abstracting theses. As a result, only a handful of reactions were submitted to CSSP, and the majority (over 500) were stored in LabTrove for future submission to ChemSpider Reactions.

If reactions can be published easily from ELNs to ChemSpider Reactions, and ChemSpider Reactions can be easily queried by other researchers and their applications when performing new reactions, this will be a major step towards the aims of Dial-a-molecule (an EPSRC Grand Challenge network). An important part of the reaction data which needs to be captured is the stoichiometry table of substances used and produced in a reaction. However, these stoichiometry tables are too complicated to incorporate into a LabTrove template, so the LabTrove reaction templates will be used in conjunction with a new ChemSpider jQuery widget which will construct them, and which is currently in the process of being integrated with LabTrove (more details to follow on this blog shortly!). The widget performs ChemSpider lookups to retrieve compound information, and will calculate equivalents, thereby saving the researcher time when working out the amounts of reactants needed or yields of products obtained. An example of a reaction post which was initially created using the ChemSpider Reactions template and then supplemented by adding a stoichiometry table to it using the ChemSpider Edit Stoichiometry Table widget is shown here.

If you are a LabTrove user and wish to use the ChemSpider templates, their source is available via their links above, and instructions for using templates in LabTrove are documented here.

Hot on the heels of our announcement a few weeks ago about how we are getting more data into ChemSpider from researchers, we are very pleased to announce that Synthonix have sent us more than a hundred new proton NMR spectra (in addition to the hundreds of spectra that they kindly provided previously). We have now added these spectra to the database. Because the data was provided as real measured spectra and not as an image, the spectra are all interactive. Click the image below to be taken to the record containing the original interactive spectrum.


As you can see, the initial view gives a good overview of the spectrum, but it isn’t easy to see the peak splittings. With an image (jpg or pdf) that might be the end of the story – it would certainly be very difficult to get the level of detail that is available in the second zoom image below.


While this post is a good opportunity to say ‘Thank you very much!’ to Synthonix, we would like to encourage researchers or other chemical vendors who might be able to contribute spectra to follow their lead. It’s a great way to raise your profile and build links with potential collaborators or goodwill with customers. If you are interested in knowing more, please email us directly.


Are you a representative of a chemical vendor? Interested in listing your catalogue on ChemSpider, but not quite sure how to go about it? Want some tips for how to improve your visibility on our site?

We now have a Chemical Vendors information page, where you can find detailed instructions for submitting your catalogue to us. We look forward to receiving your catalogue!

In part one of this series we talked about searching by molecular formula ranges, and combining substructure searches with other types of searches. Part two covered how to search by supplementary information like bioactivity, appearance or melting point. This time we will demonstrate how you can use a search combining these new features to help answer a question you might encounter in the lab.

After performing a bromination reaction on phenol you isolate a product with a melting point of 90-93°C. If you start a search with just three pieces of information – your product is a derivative of phenol, it should contain at least one bromine, and your melting point is 90-93°C – you can construct a search on the Advanced Search page to help you get started in identifying your product.

Advanced Search results - 2,4,6-Tribromophenol

Since you can now combine substructure searches with other searches, you start by looking for a compound containing phenol (Search by SubStructure). To restrict your results to brominated phenols, you add a molecular formula range search for C6H(1-5)O1Br(1-5) (Search by Properties). Lastly, you search for compounds with a melting point of 90-93°C (Search by Supplementary Information).

Advanced Search results - 2,4,6-Tribromophenol

Your search turns up one result – 2,4,6-Tribromophenol. Although you need more information to conclusively confirm the identification, this gives you a lead in your analysis/elucidation.

Taking a look at the record, you may notice it has an interactive IR spectrum from NIST. If you check the Data Sources section, you will find that there are a lot of data sources for the record.

Advanced Search results - 2,4,6-Tribromophenol

To make it simpler to identify useful information you can browse the tabs to look for specific types of information: for instance the “Spectral Data” tab provides links to data in the MassBank and NMRShiftDB databases, which will hopefully help you confirm whether the product is 2,4,6-Tribromophenol.

This is just one example of how you can combine different searches on the Advanced Search page. Advanced searches are a great way to narrow down your results to help you find exactly what you are looking for, and there are many options we haven’t covered here, so have a look around and see what combinations might work for you.

Last time we told you about a number of improvements we have added to ChemSpider in the recent site updates, including combined substructure and properties search and searching by molecular formula ranges. As promised, this time we will cover how to search by properties like melting point or appearance.

Searching by Supplementary Information

Until now, although you could view properties when you were already on a record, there was no way to search by melting point, refractive index, appearance or bioactivity. This update has implemented a new search interface which allows you to search this data. You can now find compounds that are reported as being isolated from yeast, or compounds with a melting point of 32-35 °C.

There are 2 main parts to our Supplementary search interface.

Text Properties Search

Text properties include appearance, chemical class, drug status, or safety data. You can search any of these properties by using key words. When you start typing, a number of suggested search terms will appear, which can help you narrow down what search term to use.

You can also use wild cards by entering *, which can give you a little more flexibility in your search term – so if your unknown is a blue, crystalline material a search for “Blue crystal*” will turn up all records which mention the word “blue”, as well as any word beginning with “crystal” (such as crystals or crystalline).
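As a rough illustration of the wildcard behaviour described above, the sketch below matches each search term against the words of a property text, treating `*` as "any further word characters". It is an assumption-laden sketch rather than ChemSpider's implementation: it assumes matching is case-insensitive and that every term must be present, and the function names are hypothetical.

```python
import re


def term_to_regex(term):
    """Compile one search term into a whole-word pattern; '*' matches any
    run of further word characters (so 'crystal*' matches 'crystalline')."""
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(rf"\b{pattern}\b", re.IGNORECASE)


def text_matches(query, text):
    """True if every whitespace-separated term in the query matches a word."""
    return all(term_to_regex(term).search(text) for term in query.split())


print(text_matches("Blue crystal*", "Dark blue crystalline solid"))
print(text_matches("Blue crystal*", "Blue powder"))
```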

Searching by Text Properties


Numeric Properties Search

Numeric properties include physical properties like experimental or predicted boiling point, optical rotation, or LogP. Since we draw data from a wide range of data sources, not all of this information is sent to us in the same format or with the units depicted the same way. In order to make it possible for you to search across all the properties in our database no matter how it was supplied to us, we have done a lot of background work on tidying up and standardizing this data.

All numeric properties can be searched using min/max or with a +/- range, and the search term can be entered in a variety of units – e.g. Fahrenheit or Celsius for temperature, or psi or mmHg for pressure. Because the boiling point of a material is dependent on the pressure at which the measurement is made, and not all boiling points are measured at atmospheric pressure, we have created a feature that attempts to compensate for this. It uses the Clausius-Clapeyron equation to create estimated (standardised) boiling points for searching; please remember this when looking at your results.
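The sketch below illustrates the three ideas in this section: converting a search term between units, turning a +/- query into a min/max range, and using the integrated Clausius-Clapeyron equation to estimate a boiling point at a reference pressure. The function names are hypothetical, and the enthalpy of vaporisation passed in is an assumption; the post does not say how ChemSpider obtains or approximates it.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def plus_minus_range(value, tolerance):
    """Turn a 'value +/- tolerance' query into an equivalent (min, max) pair."""
    return (value - tolerance, value + tolerance)


def in_range(value, bounds):
    """True if value lies within the inclusive (min, max) bounds."""
    low, high = bounds
    return low <= value <= high


def standardised_boiling_point(bp_celsius, pressure, reference_pressure, dh_vap):
    """Estimate the boiling point at reference_pressure from one measured at
    pressure, using 1/T2 = 1/T1 - R*ln(P2/P1)/dH_vap.

    Both pressures must share a unit (only their ratio is used); dh_vap is
    the molar enthalpy of vaporisation in J/mol and is assumed constant.
    """
    t1 = bp_celsius + 273.15
    inverse_t2 = 1.0 / t1 - GAS_CONSTANT * math.log(reference_pressure / pressure) / dh_vap
    return 1.0 / inverse_t2 - 273.15


# A melting point query of 91.5 +/- 1.5 degrees C covers 90 to 93 degrees C.
bounds = plus_minus_range(91.5, 1.5)
print(bounds, in_range(92.0, bounds))

# A boiling point measured at 380 mmHg, standardised to 760 mmHg, using an
# assumed enthalpy of vaporisation of 40,650 J/mol (roughly that of water).
print(round(standardised_boiling_point(80.0, 380.0, 760.0, 40650.0), 1))
```

Because the enthalpy of vaporisation is treated as constant, the standardised value is only an estimate, which is why results based on it should be read with some caution.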

Searching by Numeric Properties


As you can see, you are able to search on a wide variety of experimental properties, including boiling point, LogP, melting point, specific gravity and solubility. Please note that although many of the more common compounds have some properties, these properties are only available on a subset of our records – so if you do not get a result on a property search, it might be that we haven’t added that information yet.

Hopefully this gives you a good idea of the improvements we’ve made to ChemSpider search, and how these new features make it easier than ever to find what you are looking for. See the following post for a case study that showcases several of the new features covered in these posts.

We recently published an update to the ChemSpider website which, in addition to fixing a number of bugs, has added some useful new features. Three of these features are highlighted in this post – one which you might have noticed already, and two which you may not have discovered yet.


We have reinstated the auto-complete feature on the ChemSpider homepage. Now, when you begin typing in the search box, ChemSpider makes suggestions based on what you have typed. This makes it easier than ever to find what you are looking for – even if you aren’t quite sure how to spell it.

Autocomplete on the ChemSpider homepage


Combined Structure/Property Searches

People frequently ask if there is a way to search substructure and other properties like molecular weight or molecular formula at the same time. This update now makes it possible to perform this kind of combined search from our improved Advanced Search page.

For example, if you are interested in finding compounds which are structurally similar to Valium, you can enter a benzodiazepinone substructure and restrict it to compounds with a molecular weight of 275-325.

Substructure and Molecular Weight search

This search then returns Valium along with other similar drugs like clonazepam, nitrazepam and lorazepam.

There are many other search options that can be combined with a substructure/similarity search so look at the Advanced Search page and have a play.

Molecular Formula Range Searching

You can also search a range of molecular formulae at once. To specify the range for a given element, put the range in parentheses after the element. For example, C7H(10-12)O(0-1) would return all compounds containing exactly 7 carbons and between 10 and 12 hydrogens, and which may or may not contain an oxygen. This type of search can be performed from the Simple Search page, as part of an Advanced Search or from the ChemSpider homepage.
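To make the range syntax concrete, here is a minimal sketch of how such a query could be parsed and compared against a molecular formula. It is illustrative only and not ChemSpider's search code: the function names are hypothetical, and it assumes that elements not named in the query must be absent from the compound.

```python
import re

# An element symbol followed by "(low-high)", a plain count, or nothing.
_RANGE_TOKEN = re.compile(r"([A-Z][a-z]?)(?:\((\d+)-(\d+)\)|(\d+))?")
_FORMULA_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_range_query(query):
    """Parse e.g. 'C7H(10-12)O(0-1)' into {'C': (7, 7), 'H': (10, 12), 'O': (0, 1)}."""
    ranges = {}
    for match in _RANGE_TOKEN.finditer(query):
        element, low, high, exact = match.groups()
        if low is not None:
            ranges[element] = (int(low), int(high))
        elif exact is not None:
            ranges[element] = (int(exact), int(exact))
        else:
            ranges[element] = (1, 1)  # a bare symbol means exactly one atom
    return ranges


def parse_formula(formula):
    """Parse a simple formula such as 'C6H3Br3O' into element counts."""
    counts = {}
    for element, number in _FORMULA_TOKEN.findall(formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def formula_matches(query, formula):
    """True if every element count in the formula falls inside the query ranges."""
    ranges = parse_range_query(query)
    counts = parse_formula(formula)
    # Assumption: an element that the query does not mention must be absent.
    if any(element not in ranges for element in counts):
        return False
    return all(low <= counts.get(element, 0) <= high
               for element, (low, high) in ranges.items())


print(formula_matches("C7H(10-12)O(0-1)", "C7H12"))
print(formula_matches("C7H(10-12)O(0-1)", "C7H8"))
```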

Best of all, this can be combined with any of the other search parameters on the Advanced Search page, including the substructure search. For example, if you wanted to find polychlorinated biphenyls containing at least three chlorine atoms, you could perform a substructure search for a biphenyl with a molecular formula of C12H(0-7)Cl(3-10).

Substructure and Molecular Weight search

In our next post, we will cover some new ways you can search by properties that are stored in our records such as melting point, density, etc.

We have previously described initial steps, with IDBS, to integrate ChemSpider with ELNs and to define the elnItemManifest metadata model.

We have now also taken further steps to integrate ChemSpider with Southampton University’s ELN, LabTrove, following on from an eScience tool that Stephen Wan from CSIRO had developed with the University of New South Wales to text mine LabTrove ELN blog posts to identify chemical names and link these to the relevant ChemSpider compounds. LabTrove is an open source blog-based system which can be used for recording and sharing experimental findings. Previously, if an image of the compound was to be added to an experiment blog post, it would be necessary either to upload it as an image (after drawing it in a separate drawing package) or to paste in a link to the image in another website (after a separate internet search in another browser window). We have now added the ability to click a button directly when adding or editing an experiment to launch a search of ChemSpider. When the required compound is found, an image of it can be added to the post simply by clicking on it, as can be seen in this demonstration video:

The editing controls in LabTrove are based on TinyMCE, a WYSIWYG editor which is used in a range of blogs, including WordPress. This means that this same ChemSpider plugin can also be used to insert compound images from ChemSpider into any other blog or website that uses a TinyMCE editor.

If you have a LabTrove installation which you would like to add the ChemSpider plugin to then simply update your installation with the latest source code from LabTrove’s SourceForge website.

If you have a website or blog which uses a TinyMCE editor and you would like to add the ChemSpider plugin to it, then simply download this zip file, extract the folder in it and move the “chemspider” directory created to your TinyMCE plugins folder. Then, in your TinyMCE initialization process, add the plugin “chemspider” and the button “chemspider”.

ChemSpider SyntheticPages is one of those projects we support for which I have particular affection. For those who haven’t yet taken a look at it – please do so, it is a community resource made by chemists for chemists and is free to access – you don’t even need to register to look at the articles.

The original concept of SyntheticPages was brought to life by a group of academics who developed the original platform and format (and of course the members of the research community who embraced it and submitted articles). When ChemSpider became part of the RSC the concept of a community resource for reactions seemed like a complementary partner to the database of chemical compounds that we had established. With this in mind we were fortunate to collaborate with the hosts of the original SyntheticPages platform and, combining our resources and visions, we provided a new platform for submission. A short presentation about CSSP is online here.

CSSP today is quite well known within a small community of chemists but comments from the audiences that we expose the work to are very positive on the value of the platform and the way that we have developed it to date. Certainly the authors can get 10s of thousands of hits on their articles based on the published statistics! The “Leaderboards” are all available online for anyone to review.

We believe that everyone can see the value of building a directory of reliable, robust reactions that can continue to evolve through feedback and questions. But more than that, we see the potential benefits for:

  • Young scientists as a portfolio of their work that can enhance a resumé
  • Building systems that can contribute to Alternative Metrics – people are already developing platforms such as Impact Story. CSSP presents the perfect opportunity to build such systems. Online contributions will become increasingly visible and important for a scientist, in parallel, of course, with the present metrics for contribution and reputation.

We are presently working on a new system for “rewards and recognition” for contributors to our online databases and we will be rolling this out in more detail in the near future. It will be our way of recognizing the contributions of our users for their commitment to communicating science to the community using our platform as one of their vehicles to do so. As part of this activity we are also choosing to recognize present and future authors for their contribution of 5 or more SyntheticPages to CSSP. We will be contacting previous authors to ensure that they receive a brand spanking new, off the press, CSSP Lab coat to thank them for making their syntheses available!

Discussing the project to recognise and celebrate the top contributors to CSSP, Dr James Milne, Managing Director of RSC Publishing, said the following:

“The ChemSpider SyntheticPages lab coats are a great idea, as they highlight a number of fantastic contributors, and also the role of CSSP within the broader publishing context. RSC Publishing strives to serve the needs of researchers worldwide, through publishing and disseminating high quality content, and this database of practical synthetic procedures certainly adds to this knowledge base.  I’d personally like to thank these contributors for supporting CSSP through their publications.”

If you haven’t already qualified for a CSSP lab coat by submitting 5 or more procedures, what’s stopping you? We look forward to reading your submissions…


Okay, I’ll admit it, that the title of this entry is not quite what Samuel Taylor Coleridge wrote in The Rime of the Ancient Mariner – but it does sum up this post pretty well.

Image taken from Wikipedia (

Water is one of those chemicals that we tend to take for granted until it reminds us, usually because we have too much or too little of it. In one way or another, water seems to have been insistently nagging me this year. In the Spring in the UK there was talk of water restrictions and droughts, while now many places are flooded, and only a few weeks ago in the US, Hurricane Sandy proved that water could be as formidable a force as the winds.

Don’t forget water is a chemical!

Water has a huge impact on the chemical sciences – after all, it is one of the most common chemicals in the world. As such, water features in many of the activities of the RSC, to list just a few recent examples….

Well, what about this webinar?

When I was still a bench chemist I have to admit that I only thought of water as something used in extractions, or to be excluded from reactions (and occasionally in tackling the mountain of dirty glassware that I’d accumulated). But looking at the title of the latest Chemistry World Webinar, it looks like there are still many aspects of water that I have to learn about. The webinar is free: if the details below pique your interest, you only need to follow the link and sign up to watch the live webinar. If you can’t watch at that time or are reading this post after the webinar has taken place, don’t worry – you can access the archive of all of the Chemistry World Webinars at:

The importance of water quality in the laboratory

4 December 2012, 13:00 – 14:00 (GMT)
Free webinar

Speaker: Dr Estelle Riché – Senior Scientist, Merck Millipore

How are water contaminants affecting your lab results?

Join us for our next live and interactive Chemistry World webinar to learn why and how water is purified to yield the various water qualities used in the laboratory.

By the end of this free one-hour knowledge-share, you will be able to:
• identify the different contaminants potentially present in laboratory water
• understand the potential impact of these contaminants on laboratory applications such as HPLC, LC-MS, etc.
• understand how various water purification technologies remove these contaminants from laboratory water
• make better choices for the water you use in your laboratory work
Click here to find out more and register for free
This webinar is brought to you by Chemistry World in partnership with Merck Millipore.


For those of you who were interested by our previous blog post ‘Publish to ChemSpider’ ELN plugin generates elnItemManifest, and are at ACS Fall 2012 in Philadelphia, more details about this project will be described by Dr Simon Coles (Southampton University) in his oral presentation “Towards publishing semantic descriptions of Electronic Laboratory Notebook records” (paper ID: 17061 and final paper number: 90) in the “CINF: Division of Chemical Information division”, and “Herman Skolnik Award Symposium” session, on August 21, 2012 from 10:50 am to 11:05 am at Philadelphia Marriott Downtown, Room: 302/303.

If you are at ACS Fall 2012 don’t forget the other ChemSpider at ACS Fall 2012 in Philadelphia events.

As part of the Royal Society of Chemistry, the ChemSpider team likes to get involved with all of the other projects that are going on within the RSC, and we were really excited to be asked to provide our expertise to the SpectraSchool resource. This HE STEM funded program provides a range of resources to help in the understanding of the principles and practice of spectroscopy and spectroscopic methods.
SpectraSchool brings together Spectroscopy resources, an Introduction to Spectroscopy*, Interactive Spectra and the Spectroscopy in a Suitcase scheme which affords school children the chance to use modern spectroscopic equipment in their classroom.

The SpectraSchool resource was originally developed with the University of Leicester, who collected and assigned many of the spectra that are displayed within the site. Now that SpectraSchool is part of Learn Chemistry we have helped to integrate new features, including a new HTML 5 based spectrum viewer that provides interactive display of spectra. The fact that this is based on HTML 5 means that the spectrum can be viewed on just about any device that has a modern browser (e.g. computers, tablets, phones or even touch screen TVs).

A student visiting the site has the ability to zoom in on peaks and to see which features of a chemical structure give rise to a particular peak in a spectrum; by selecting either a peak or a particular part of the structure (see the highlighting of the methyl group in the structure of caffeine below and the corresponding peak in the adjacent 1H NMR spectrum).

SpectraSchool and Chemistry in the Olympics are great examples of the RSC’s new microsites which bring together lots of great resources and tools in a fresh and exciting interface.

Take a look at SpectraSchool and LearnChemistry today. We welcome feedback through the in-page feedback links, or you can connect with us and other chemistry educators in the Talk Chemistry forums. Why not start exploring this great (free) educational resource today?



* The Introduction to Spectroscopy was developed in collaboration with the University of Cardiff

Once again the RSC will be attending the American Chemical Society’s Fall meeting which will be held in Philadelphia, Pennsylvania, August 19-23, 2012, where the RSC stand will be located at booth 701.

Several members of the ChemSpider team will be attending the conference; both to give presentations and also to chat/answer questions on the booth. If you are attending the conference please drop by and say Hello and ask any questions that you have (you might even be able to get a free coffee – available on a first-come first served basis from 11 am on both Monday and Tuesday). We will also be running an exciting ChemSpider competition to coincide with the conference. You can get details from our Booth #701, or by checking out the ChemSpider blog.

There will be two key ChemSpider events in Philadelphia:
A special On-Stand demo – Monday 20 August, 11 am, Booth 701
“ChemSpider and You: A workshop exploring how ChemSpider can help you find chemical information” – A 2 h workshop for both newcomers to ChemSpider and experienced searchers alike. 10am-12pm Tuesday 21 August, Exhibit Halls A-B, Workshop Room 2 (You can register for the workshop via the conference website – we will try and accommodate anyone who just turns up on the day.)

In addition members of the ChemSpider team are giving a number of talks, including some early glimpses of exciting new tools that we are working on. The presentations are listed below – for more details including the abstracts for each of the talks see the Technical program.

‘Mining public domain data as a basis for drug repurposing’, Philadelphia Marriott Downtown, Room 302/303, Sunday 19th August, 4.15PM – 4.40PM

‘Putting chemistry into the hands of students – chemistry made mobile using resources from the Royal Society of Chemistry’, Pennsylvania Convention Center, Room 109B, Sunday 19th August, 10.50AM – 11.10AM

‘Feeding and consuming data to support Open Notebook Science via the ChemSpider platform’, Philadelphia Marriott Downtown, Conference Room 307, Monday 20th August, 2.05PM – 2.30PM

‘Approaches for extraction and “digital chromatography” of chemical data – a perspective from the RSC’, Hilton Garden Inn Philadelphia, Salon D, Monday 20th August, 2.30PM – 2.55PM

‘Delivering an online service for validating and standardizing chemical structure files using the ChemSpider platform’, Philadelphia Marriott Downtown, Franklin Hall 6, Tuesday 21st August, 9.15AM – 9.35AM

‘ChemSpider compound database as one of the pillars of a semantic web for chemistry’, Philadelphia Marriott Downtown, Grand Ballroom Salon H, Tuesday 21st August, 4.55PM – 5.10PM

‘How can the International Chemical Identifier (InChI) be extended to non-trivial chemicals?’, Philadelphia Marriott Downtown, Franklin Hall 6, Thursday 23rd August, 9:35AM – 9.55AM

‘Serving up and consuming community content for chemists using wikis’, Philadelphia Marriott Downtown, Franklin Hall 6, Thursday 23rd August, 9.55AM – 10.15AM


We look forward to seeing you at the conference!

ChemSpider has become one of the world’s primary online resources for finding data, information, links, images, spectra…and on and on…about “chemicals”. Building a database of over 28 million chemicals that grows in some way in content, functionality and richness on a daily basis is, to say the least, a lot of work. But our cheminformatics team here at the RSC is not scared of work. We like it! So when we decided that it was time to enhance our efforts around the management of chemical reactions, moving from ChemSpider SyntheticPages to a database of chemical reactions containing 10s if not 100s of thousands of reactions, the question was how. What software platform would we use? Where would we source reactions? What functionality would we need to roll out as an early display of capability to entice users to test it out, give feedback and, ultimately, get involved? We made those decisions and we will be showing off the results of our project “ChemSpider Reactions Database” (yes, we’re very creative with our project titles, aren’t we!!!) at the Fall ACS in Philadelphia.

If you want to learn what we are up to in regards to chemical reactions, come and visit us at the ACS booth…we’ll show an early view of over a quarter of a million reactions in an online, free to access database. We’ll chat about some of our future plans and hopefully engage you in a discussion about whether or not you would be willing to contribute reactions to the database. Wouldn’t it be good if we could provide synthetic chemists with a platform for accessing and managing reactions, as we have done for chemicals? Of course, seamlessly integrated and platform independent…served up by the latest web technologies and mobile-enabled. What the future could look like… exciting times!

ChemGoggles? What on earth is ChemGoggles? Is this a pair of safety specs for chemists? No…what would be the fun, and the cheminformatics (!!), in that? ChemGoggles will be shown at the ACS meeting in Philadelphia in a couple of weeks and will be a very early display of our venture into the development of an Android app for “photographing” an image of a chemical and searching the ChemSpider database. It will be a matter of finding an image of a chemical (paper, publication etc), taking a photo using an Android device, using structure recognition software to convert the image to a chemical and then searching ChemSpider. It will be imperfect, an early version, but nevertheless a tantalizing display of some of the new directions we are presently taking at the Cheminformatics group here at RSC.

Chemistry is complex. Anybody who has been involved with the creation of electronic datafiles containing thousands of chemical compounds and associated data (chemical names, properties etc) will tell you that errors creep in. ChemSpider has >28 million unique chemical entities and these have been sourced from many different places/groups/individuals. Some of these have been deprecated as we have determined, both manually and algorithmically, that the data are in error. Over the years we have learned a lot about data quality and ways in which algorithms can be applied to data prior to deposition on ChemSpider.

Some obvious structure-based errors that can be checked for would include: hypervalency (e.g. pentavalent carbons), charge imbalance (a compound has no neutralizing counterion for example), absence of stereochemistry (e.g. a compound with 12 possible stereocenters only has one assigned). There are many other such errors that can be detected algorithmically. It’s the old adage of why apply a human to what a computer can fix. With this in mind we have been working on a system called the ChemSpider Validation and Standardization Platform (CVSP for short). This system will serve multiple purposes. It will be one of the foundation blocks for checking structure-based data for our publications (i.e. catch bad chemistry before it is published!), it will be used for validating chemistry for our databases (Natural Product Updates, Methods in Organic Synthesis and Catalysts and Catalyzed Reactions), it will be used to check and validate depositions going into ChemSpider, it will serve data related to the Open PHACTS project  and it will serve the community by providing an online website where you can upload your own SDF files (and other file formats in future) to validate the structures.
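The three checks named above (hypervalency, charge imbalance, missing stereochemistry) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the SketchAtom class, its fields and the wording of the messages are hypothetical and are not part of CVSP or ChemSpider, and a real validator would work from a parsed mol or SDF record rather than hand-built atoms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified atom model for illustration; not a CVSP or ChemSpider type.
class SketchAtom {
    final String element;
    final int bondOrderSum;             // total bond order, implicit hydrogens included
    final int formalCharge;
    final boolean possibleStereocentre; // the atom could carry stereochemistry
    final boolean stereoAssigned;       // stereochemistry was actually specified

    SketchAtom(String element, int bondOrderSum, int formalCharge,
               boolean possibleStereocentre, boolean stereoAssigned) {
        this.element = element;
        this.bondOrderSum = bondOrderSum;
        this.formalCharge = formalCharge;
        this.possibleStereocentre = possibleStereocentre;
        this.stereoAssigned = stereoAssigned;
    }
}

class StructureChecksSketch {

    // Returns one human-readable message per problem found (empty list if none).
    static List<String> validate(List<SketchAtom> atoms) {
        List<String> issues = new ArrayList<String>();
        int netCharge = 0;
        int candidates = 0;
        int assigned = 0;
        for (int i = 0; i < atoms.size(); i++) {
            SketchAtom a = atoms.get(i);
            // Hypervalency: a carbon with more than four bonds (e.g. pentavalent carbon).
            if ("C".equals(a.element) && a.bondOrderSum > 4) {
                issues.add("hypervalent carbon at atom " + i);
            }
            netCharge += a.formalCharge;
            if (a.possibleStereocentre) {
                candidates++;
                if (a.stereoAssigned) {
                    assigned++;
                }
            }
        }
        // Charge imbalance: the record as a whole has no neutralizing counterion.
        if (netCharge != 0) {
            issues.add("charge imbalance: net charge " + netCharge);
        }
        // Missing stereochemistry: fewer centres assigned than are possible.
        if (assigned < candidates) {
            issues.add("stereo defined for " + assigned + " of " + candidates
                    + " possible stereocentres");
        }
        return issues;
    }

    public static void main(String[] args) {
        List<SketchAtom> atoms = Arrays.asList(
                new SketchAtom("C", 5, 0, false, false),
                new SketchAtom("N", 4, 1, true, false));
        for (String issue : validate(atoms)) {
            System.out.println(issue);
        }
    }
}
```

Returning a list of messages rather than failing on the first problem fits the use described above, where a depositor uploads a file and wants to see everything that needs fixing.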

I won’t go into detail here about all of the functionality and capability of the system as we will discuss this in further detail on this blog. However, we will be unveiling the system in its present form at the ACS meeting in Philadelphia. Come along and meet some of the team involved in building CVSP and give us your feedback!

In December 2011 we posted about the ChemSpider plugin for IDBS’s Electronic Lab Notebook (ELN), a proof of concept plugin which allows chemical structures which are part of an ELN experiment to be published to ChemSpider. The plugin sent a single sdf file per deposition, containing the chemical structures (in mol format) and very basic metadata about where they come from (author, principal investigator, ELN experiment ID) in the associated data fields. A mapping file was set up in the ChemSpider deposition system to process associated data field names in the deposited sdf files from the ELN data source and map them onto internal ChemSpider field names. We would like to extend this initial proof of concept to integrate ChemSpider with more ELNs, to store more advanced metadata with each deposition and to be able to publish more types of ELN data, e.g. spectra, reactions and properties. A major step towards this goal would be if the metadata were separated from the data file, were defined by a fixed schema and contained more extensive information (e.g. what is in the accompanying ELN data item, what is its source, and what are its access rights). If it were agreed as a standard, ELN vendors and developers could build the ability to generate this metadata into their APIs, to be used either when sharing data to a repository such as ChemSpider, or to exchange data from one ELN to another. We at ChemSpider would develop a deposition webservice to process metadata in this format (and accept depositions from any ELN which generated it). This would make the task of publishing spectra, reactions, chemical properties and other file types from a range of ELNs to ChemSpider much more manageable.
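The mapping step described above, where data field names in a deposited sdf file are translated to internal field names, can be pictured with a small Java sketch. Every field name here (ELN_AUTHOR, principal_investigator and so on) is invented for illustration; the real mapping file format and the internal ChemSpider field names are not shown in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldMappingSketch {

    // Hypothetical mapping from ELN data field names to internal field names.
    static Map<String, String> exampleMapping() {
        Map<String, String> mapping = new LinkedHashMap<String, String>();
        mapping.put("ELN_AUTHOR", "author");
        mapping.put("ELN_PI", "principal_investigator");
        mapping.put("ELN_EXPERIMENT_ID", "source_record_id");
        return mapping;
    }

    // Renames the fields of one sdf record; fields with no mapping are skipped in this sketch.
    static Map<String, String> mapFields(Map<String, String> sdfFields,
                                         Map<String, String> mapping) {
        Map<String, String> mapped = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> field : sdfFields.entrySet()) {
            String internalName = mapping.get(field.getKey());
            if (internalName != null) {
                mapped.put(internalName, field.getValue());
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<String, String>();
        record.put("ELN_AUTHOR", "A. Chemist");
        record.put("ELN_EXPERIMENT_ID", "EXP-0001");
        record.put("SOME_UNMAPPED_FIELD", "ignored");
        System.out.println(mapFields(record, exampleMapping()));
    }
}
```

Keeping the mapping as data rather than code is what lets a new ELN source be supported by adding a mapping file, which is also why a shared metadata schema would remove the need for a per-source mapping.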

A working group met up on 9th December 2011 to work towards the aim of defining a metadata model to answer the question “What comprises an ELN record or an item in it”. The group was headed up by Dr Simon Coles from the University of Southampton, and comprised representatives from universities, ELN vendors, pharmaceutical companies, and RSC ChemSpider and was a smaller subset of the previous EPSRC Dial-a-Molecule “The Smart Laboratory: Towards a national ELN” meeting. We came up with a top level format for the exchange which describes what’s in the record, how do you get it, who contributed to it and access rights in xml format. Since then Simon and Colin Bird have formalised this format into an xml schema, the details of which will be published shortly in a journal article (in preparation).

Before committing to the development effort that would be required by the ELN vendors and ChemSpider to work towards this ultimate aim, it is necessary to finalise the definition of this schema and verify that it works with an example. As a first step towards this, the ‘Publish to ChemSpider’ IDBS plugin has been modified to generate the metadata that would accompany the mol files of structures in a separate file obeying this schema. In a future phase of work the metadata xml and ELN item would be sent to a ChemSpider webservice to be processed for publishing there. The video and screencaptures below show version 2 of the plugin generating this metadata in action:

And the generated result is as below:
Generated example elnItemManifest metadata.

While every effort was made to populate fields from generic information stored in the ELN system so that this plugin would work with any IDBS installation (not just that of the Chemistry department of the University of Cambridge who kindly allowed the plugin to be developed against their system), this was not possible for all fields since they are not readily available from extension points of the E-WorkBook software – which will need to be addressed if IDBS do develop an API to generate the elnItemManifest. For example, the names and email addresses of the author and principal investigator of the ELN experiment are defined in a configuration file whose settings can be edited via an interface in the ELN software. The license to release the data under, and an embargo period to wait before the data is released publicly are populated by user inputs which are requested when the user chooses to generate the file. The keywords, description and start date of the experiment are populated by customised ELN experiment fields which have been set up only in Cambridge University’s installation of E-WorkBook.

If you have access to a working version of IDBS’s E-WorkBook and would like to install the plugin to work with it please write to and we will be happy to supply it to you.

Again, thanks to IDBS and the Department of Chemistry, University of Cambridge for allowing us to continue development of the ChemSpider plugin against their software and ELN installation respectively.

The eagle-eyed amongst you may have noticed that there was an update to ChemSpider just over a week ago. Many of the changes that were performed on the site were aimed at upgrading the underlying architecture of the site and ensuring that the performance of the ChemSpider site is constantly improving as the number of users of our site and services grows.

Here are a few of the changes to the site that are more visible:

  1. Clearer deprecation of records
  2. Citation details
  3. Visibility of average mass
  4. Layout of the structure search page
  5. Improvements to search messaging
  6. Clearer layout of the Experimental Properties section
  7. Support for foreign language help

So to pick out a few of the key items from the above list….


Clearer deprecation of records

ChemSpider is designed so that by default, deprecated records are not presented in your search results – this ensures that you don’t have to wade through data for records that are clearly wrong or lack any useful data. But, of course, there may be occasions where you happen across a deprecated record. In the past, it wasn’t always easy to see immediately that a record had been deprecated and to understand the reason why. In the new design the notification message is far more prominent and we also make it easy to see the reason why the record was deprecated (this is a new requirement in the deprecation process and so for older deprecations this field may be blank).


Citation details

We commonly get requests from individuals asking about including data from a ChemSpider record in a presentation or thesis. As outlined in our FAQ page, where individuals reuse data we ask that they cite ChemSpider. And so to make this process simpler we have created an output that contains the basic information that users may need to include in a citation, and we have provided a button that makes it really easy to copy the data to your clipboard in one click.

Looking at the above image you can also see that the Average mass (which was accidentally hidden for a while) has now been made visible in the record again.

Layout of the structure search page

One of the most noticeable changes has been the rearrangement of the Structure search interface. While the actual functionality remains the same, the options have been presented in a way that (hopefully) makes it much easier to see all of the options that are available to you when you perform a structure search. This is the 1st phase of our work on this interface, so please let us know what you think about the changes so far.


Clearer layout of the Experimental Properties section

Another significant change that we have made is to the presentation of data in the Experimental properties infobox. The data is presented in a tidier layout, and while we have always had the ability to provide links to the original datasource, this was not particularly obvious to some users. In this new design we explicitly display the name of the datasource that provided the data, and wherever possible the name will act as a link back to the relevant page/entry in that datasource.

We hope that you find all of these new features useful, and as always we welcome your feedback on these and any other aspects of the site.

A lot happens in a few weeks, and these past couple of months have been no different. There have been numerous developments for ChemSpider and its related projects, including working on the GUI, adding in new data and a lot of infrastructure work on the core of the ChemSpider platform.

We have the ACS meeting in San Diego just around the corner and are presently working hard this week to publish our most recent update to the live servers. For those of you going to San Diego do come and visit us at the RSC booth and we will give you a demo of our most recent project that we have been working on…I’m not going to announce it before the ACS but I encourage any attendees to stop by and hear what we’re up to!

There will be a number of presentations at the meeting and the details are all listed in our online Newsletter.

Alex Tropsha (UNC-Chapel Hill) and I (Antony Williams) will be hosting an InChI Symposium at the meeting so please come along and hear how people are using InChI and some of the directions for the future!

See you in San Diego hopefully!

Recently I have been programming a java plug-in from which I needed to call the ChemSpider webservices, and I found that this wasn’t as straightforward as I was expecting, so I thought I would post how to do it in case it’s useful for anyone else who wants to do likewise.
The basic method I used was to use Apache Axis2 to generate java code for the WSDL’s of the main ChemSpider webservices. This java code is available here: and I have also made the compiled jar file available here: chemspider_webservices.jar. The ChemSpider webservices can be called from other java code by referencing this jar file (and the other axis library files).
This blog post describes how I generated and used this jar file. I was using the Eclipse IDE, so some of what I describe will be specific to that.
There is a similar jar file of some ChemSpider webservices which is available by downloading MZMine (the file chemspider-api.jar in the lib directory) and an example of its use can be seen by downloading the source code and looking at the file src\net\sf\mzmine\modules\peaklistmethods\identification\dbsearch\databases\ That jar file was generated using the previous version of Axis (just plain Axis, rather than Axis2) compared to this one. The example here may be easier to use as a start point since the full range of ChemSpider webservices are included in the jar file, there is a full description of how it was generated, the code used to generate the jar file is available and there are more examples of its use.

Generating the chemspider_webservices.jar file

To generate the java code from the WSDL of the ChemSpider webservices I used the WSDL2Java functionality of Apache Axis2. This is available in different forms, including an Eclipse plug-in which will directly import the java code generated into a project, but I found various bugs when trying to use the latest version of that, so just used the command line version.
I started off with generating the java code from the WSDL of the ChemSpider MassSpecAPI webservice:

  • I downloaded and unzipped the latest version of the Apache Axis2 binary distribution from their download page. I used version 1.6.1 of Axis2.
  • In the “bin” directory of this download there should be a file called wsdl2java.bat. Running this batch file from a command line saves a lot of time trying to set up the class paths correctly to run WSDL2Java. Before using it you should set up the following two environment variables:
    • AXIS2_HOME: Must point to the top level of the AXIS2 files which you just downloaded
    • JAVA_HOME: Must point at your Java Development Kit installation directory (e.g. C:\Program Files\Java\jre6)
  • To see a full list of the options available when running WSDL2Java simply open a command prompt and run the batch file with no options to obtain the Usage options – more information about these can be found in the Apache Axis2 user guide:
    • > axis2-1.6.1\bin\wsdl2java.bat
  • I ran it with options to specify to use the SOAP 1.2 port of the ChemSpider MassSpecAPI webservice (most ChemSpider webservices have the option of SOAP 1.1, SOAP 1.2, HTTP GET or HTTP POST), to generate synchronous code only (not asynchronous), and to use adb databinding (this is the default anyway):
    • > axis2-1.6.1\bin\wsdl2java.bat -uri -pn MassSpecAPISoap12 -s -d adb
  • This then generated the file which it automatically put in the package com.chemspider.www (so the appropriate folder structure was created above it accordingly)
  • I repeated this process with the other 4 main ChemSpider webservices:
    • > axis2-1.6.1\bin\wsdl2java.bat -uri -pn SearchSoap12 -s -d adb
    • > axis2-1.6.1\bin\wsdl2java.bat -uri -pn InChISoap12 -s -d adb
    • > axis2-1.6.1\bin\wsdl2java.bat -uri -pn SpectraSoap12 -s -d adb
    • > axis2-1.6.1\bin\wsdl2java.bat -uri -pn OpenBabelWebServiceSoap12 -s -d adb
  • The folders and java class files generated by WSDL2Java (,,, and that were generated are available in the zip file for further reference
  • I then started a new Eclipse project and imported the generated file system into it
  • The generated classes rely on the Axis2 library files so these need to be added to the build path – in Eclipse this is done by right-clicking on the project in the Package Explorer, choosing Properties > Java Build Path > Libraries > Add External Jars and selecting all of the lib files in the lib folder of the Axis2 folder.
  • This project was exported as the jar file chemspider_webservices.jar
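Since the five command lines above differ only in the WSDL location and the port name, they can be assembled in a loop. The sketch below only builds and prints the commands and does not run them. It uses just the options shown above (-uri, -pn, -s, -d adb); the WSDL locations are printed as placeholders because the real URIs are not reproduced here, and the class and method names are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class Wsdl2JavaCommandsSketch {

    // Port names taken from the command lines listed above.
    static final List<String> PORT_NAMES = Arrays.asList(
            "MassSpecAPISoap12",
            "SearchSoap12",
            "InChISoap12",
            "SpectraSoap12",
            "OpenBabelWebServiceSoap12");

    // Builds the argument list for one run: SOAP 1.2 port, synchronous code only, adb databinding.
    static List<String> buildCommand(String wsdlUri, String portName) {
        return Arrays.asList(
                "axis2-1.6.1\\bin\\wsdl2java.bat",
                "-uri", wsdlUri,
                "-pn", portName,
                "-s",
                "-d", "adb");
    }

    public static void main(String[] args) {
        for (String portName : PORT_NAMES) {
            // Placeholder only: substitute the WSDL address of the matching webservice.
            String wsdlUri = "<WSDL-URI-for-" + portName + ">";
            System.out.println(String.join(" ", buildCommand(wsdlUri, portName)));
        }
    }
}
```

The returned list could be handed to java.lang.ProcessBuilder to run each command, provided AXIS2_HOME and JAVA_HOME are set as described earlier.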

Using the chemspider_webservices.jar file as an external library jar file

The chemspider_webservices.jar file and all of the Apache Axis2 library jar files need adding to a java project as referenced libraries before it can be called. To do this in Eclipse right-click on the project in the Package Explorer, choose Properties > Java Build Path > Libraries > Add External Jars and select:

  • the chemspider_webservices.jar file (download it from chemspider_webservices.jar and save it locally)
  • all of the lib files in the lib folder of the Axis2 folder.

Once this has been done then the ChemSpider webservices can be called from the project. An example is shown below, and is also downloadable in text format from here. This has been structured into (pretty well self-contained) functions which can be easily called to retrieve the results of a particular operation of a webservice. In the main function these functions are called and the output written out.

Please note that you should obtain your own ChemSpider token to set as the ChemSpiderToken value – to obtain this, register for a ChemSpider account, and look up your token on your user Profile page after logging in. Some operations require your user account to be associated with the “Service Subscriber” role, which you can request from your user profile page.

package com.chemspider.www.examples;

import java.util.HashMap;
import java.util.Map;

import javax.swing.JOptionPane;

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

import com.chemspider.www.*;
import com.chemspider.www.InChIStub.InChIToCSIDResponse;
import com.chemspider.www.SearchStub.GetAsyncSearchResultResponse;
import com.chemspider.www.SearchStub.GetAsyncSearchStatusResponse;
import com.chemspider.www.SearchStub.SimpleSearchResponse;
import com.chemspider.www.MassSpecAPIStub.ArrayOfInt;
import com.chemspider.www.MassSpecAPIStub.ArrayOfString;
import com.chemspider.www.MassSpecAPIStub.ExtendedCompoundInfo;
import com.chemspider.www.MassSpecAPIStub.GetDatabasesResponse;
import com.chemspider.www.MassSpecAPIStub.GetExtendedCompoundInfoArrayResponse;
import com.chemspider.www.MassSpecAPIStub.SearchByMassAsyncResponse;

public class WebServiceExamples {

/** @param args */

private static final Logger LOG = Logger.getLogger(WebServiceExamples.class.getName());

private static String ChemSpiderToken = "YOU NEED TO INSERT YOUR OWN TOKEN IN HERE";

    public static void main(String[] args) {
        // Send log4j output to the console (BasicConfigurator is imported above).
        BasicConfigurator.configure();

        // Convert an InChI (benzene) to a ChemSpider ID.
        JOptionPane.showMessageDialog(null, "The compound with InChI InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H has CSID: " + get_InChI_InChIToCSID_Results("InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"));

        // Simple text search.
        int[] SimpleSearchResults = get_Search_SimpleSearch_Results("taxol", ChemSpiderToken);
        JOptionPane.showMessageDialog(null, "The first of " + SimpleSearchResults.length + " ChemSpider compound(s) returned by a search for Taxol has CSID: " + SimpleSearchResults[0]);

        // Extended record details for two CSIDs, keyed by CSID.
        int[] inputCSIDs = new int[2];
        inputCSIDs[0] = 236;
        inputCSIDs[1] = 238;
        Map<Integer, Map<String, String>> GetExtendedCompoundInfoArrayResults = get_MassSpecAPI_GetExtendedCompoundInfoArray_Results(inputCSIDs, ChemSpiderToken);
        Map<String, String> thisCompoundInfo = GetExtendedCompoundInfoArrayResults.get(238);
        JOptionPane.showMessageDialog(null, "The Average Mass of the compound with CSID 238 is: " + thisCompoundInfo.get("AverageMass"));

        // List of data sources.
        String[] GetDatabaseResults = get_MassSpecAPI_GetDatabases_Results();
        JOptionPane.showMessageDialog(null, "The first of " + GetDatabaseResults.length + " datasources in ChemSpider is: " + GetDatabaseResults[0]);

        // Asynchronous mass search: start it, check its status, then fetch the results.
        String SearchByMassAsyncResults = get_MassSpecAPI_SearchByMassAsync_Results(1100.0, 0.1, GetDatabaseResults, ChemSpiderToken);
        JOptionPane.showMessageDialog(null, "Transaction ID for search on compounds with mass = 1100 +/- 0.1 from any data source is: " + SearchByMassAsyncResults);
        JOptionPane.showMessageDialog(null, "The operation status of the search with this transaction ID is: " + get_Search_GetAsyncSearchStatus_Results(SearchByMassAsyncResults, ChemSpiderToken));
        int[] GetAsyncSearchResultResults = get_Search_GetAsyncSearchResult_Results(SearchByMassAsyncResults, ChemSpiderToken);
        JOptionPane.showMessageDialog(null, "And the first of " + GetAsyncSearchResultResults.length + " ChemSpider compound(s) returned by the search has CSID: " + GetAsyncSearchResultResults[0]);
    }

    /**
     * Function to call the InChIToCSID operation of ChemSpider's InChI SOAP 1.2 webservice.
     * Convert InChI to ChemSpider ID.
     * @param inchi string representing the InChI to search ChemSpider for
     * @return string representing the CSID returned, or null if the call failed
     */
    public static String get_InChI_InChIToCSID_Results(String inchi) {
        String Output = null;
        try {
            final InChIStub thisInChIstub = new InChIStub();
            com.chemspider.www.InChIStub.InChIToCSID InChIToCSIDInput = new com.chemspider.www.InChIStub.InChIToCSID();
            // Setter name assumed from the Axis2-generated stub; check your generated class.
            InChIToCSIDInput.setInchi(inchi);
            final InChIToCSIDResponse thisInChIToCSIDResponse = thisInChIstub.inChIToCSID(InChIToCSIDInput);
            Output = thisInChIToCSIDResponse.getInChIToCSIDResult();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the SimpleSearch operation of ChemSpider's Search SOAP 1.2 webservice.
     * Search by Name, SMILES, InChI, InChIKey, etc. Returns a list of found CSIDs (first 100 only; use AsyncSimpleSearch instead if you would like the full list). Security token is required.
     * @param query String representing search term (can be Name, SMILES, InChI, InChIKey)
     * @param token string containing your user token (listed on your profile page)
     * @return int[] array containing the ChemSpider IDs, or null if the call failed. If more than 100 are found then only the first 100 are returned.
     */
    public static int[] get_Search_SimpleSearch_Results(String query, String token) {
        int[] Output = null;
        try {
            final SearchStub thisSearchStub = new SearchStub();
            com.chemspider.www.SearchStub.SimpleSearch SimpleSearchInput = new com.chemspider.www.SearchStub.SimpleSearch();
            // Setter names assumed from the Axis2-generated stub; check your generated class.
            SimpleSearchInput.setQuery(query);
            SimpleSearchInput.setToken(token);
            final SimpleSearchResponse thisSimpleSearchResponse = thisSearchStub.simpleSearch(SimpleSearchInput);
            Output = thisSimpleSearchResponse.getSimpleSearchResult().get_int();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the GetDatabases operation of ChemSpider's MassSpecAPI SOAP 1.2 webservice.
     * Get the list of datasources in ChemSpider.
     * @return the list of datasources in ChemSpider as a String array, or null if the call failed
     */
    public static String[] get_MassSpecAPI_GetDatabases_Results() {
        String[] Output = null;
        try {
            final MassSpecAPIStub thisMassSpecAPIStub = new MassSpecAPIStub();
            com.chemspider.www.MassSpecAPIStub.GetDatabases getDatabaseInput = new com.chemspider.www.MassSpecAPIStub.GetDatabases();
            final GetDatabasesResponse thisGetDatabasesResponse = thisMassSpecAPIStub.getDatabases(getDatabaseInput);
            Output = thisGetDatabasesResponse.getGetDatabasesResult().getString();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the GetExtendedCompoundInfoArray operation of ChemSpider's MassSpecAPI SOAP 1.2 webservice.
     * Get array of extended record details by an array of CSIDs. Security token is required.
     * @param CSIDs integer array containing the CSIDs of compounds for which information will be returned
     * @param token string containing your user token (listed on your profile page)
     * @return a Map keyed by CSID, each value being a Map of property name to value (with properties CSID, MF, SMILES, InChI, InChIKey, AverageMass, MolecularWeight, MonoisotopicMass, NominalMass, ALogP, XLogP, CommonName)
     */
    public static Map<Integer, Map<String, String>> get_MassSpecAPI_GetExtendedCompoundInfoArray_Results(int[] CSIDs, String token) {
        Map<Integer, Map<String, String>> Output = new HashMap<Integer, Map<String, String>>();
        try {
            final MassSpecAPIStub thisMassSpecAPIStub = new MassSpecAPIStub();
            ArrayOfInt inputCSIDsArrayofInt = new ArrayOfInt();
            // Setter names below are assumed from the Axis2-generated stub; check your generated classes.
            inputCSIDsArrayofInt.set_int(CSIDs);
            com.chemspider.www.MassSpecAPIStub.GetExtendedCompoundInfoArray getGetExtendedCompoundInfoArrayInput = new com.chemspider.www.MassSpecAPIStub.GetExtendedCompoundInfoArray();
            getGetExtendedCompoundInfoArrayInput.setCSIDs(inputCSIDsArrayofInt);
            getGetExtendedCompoundInfoArrayInput.setToken(token);
            final GetExtendedCompoundInfoArrayResponse thisGetExtendedCompoundInfoArrayResponse = thisMassSpecAPIStub.getExtendedCompoundInfoArray(getGetExtendedCompoundInfoArrayInput);
            ExtendedCompoundInfo[] thisExtendedCompoundInfo = thisGetExtendedCompoundInfoArrayResponse.getGetExtendedCompoundInfoArrayResult().getExtendedCompoundInfo();
            // Copy each record's properties into a String-to-String map, keyed by CSID in the outer map.
            for (int i = 0; i < thisExtendedCompoundInfo.length; i++) {
                Map<String, String> thisCompoundExtendedCompoundInfoArrayOutput = new HashMap<String, String>();
                thisCompoundExtendedCompoundInfoArrayOutput.put("CSID", Integer.toString(thisExtendedCompoundInfo[i].getCSID()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("MF", thisExtendedCompoundInfo[i].getMF());
                thisCompoundExtendedCompoundInfoArrayOutput.put("SMILES", thisExtendedCompoundInfo[i].getSMILES());
                thisCompoundExtendedCompoundInfoArrayOutput.put("InChI", thisExtendedCompoundInfo[i].getInChI());
                thisCompoundExtendedCompoundInfoArrayOutput.put("InChIKey", thisExtendedCompoundInfo[i].getInChIKey());
                thisCompoundExtendedCompoundInfoArrayOutput.put("AverageMass", Double.toString(thisExtendedCompoundInfo[i].getAverageMass()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("MolecularWeight", Double.toString(thisExtendedCompoundInfo[i].getMolecularWeight()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("MonoisotopicMass", Double.toString(thisExtendedCompoundInfo[i].getMonoisotopicMass()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("NominalMass", Double.toString(thisExtendedCompoundInfo[i].getNominalMass()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("ALogP", Double.toString(thisExtendedCompoundInfo[i].getALogP()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("XLogP", Double.toString(thisExtendedCompoundInfo[i].getXLogP()));
                thisCompoundExtendedCompoundInfoArrayOutput.put("CommonName", thisExtendedCompoundInfo[i].getCommonName());
                Output.put(thisExtendedCompoundInfo[i].getCSID(), thisCompoundExtendedCompoundInfoArrayOutput);
            }
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the SearchByMassAsync operation of ChemSpider's MassSpecAPI SOAP 1.2 webservice.
     * Start an asynchronous search of ChemSpider by mass +/- range.
     * @param mass the compounds returned have a mass (Double) within the range mass +/- range
     * @param range the compounds returned have a mass (Double) within the range mass +/- range
     * @param dbs String array of the datasources to search
     * @param token string containing your user token (listed on your profile page)
     * @return the transaction ID of the search (as a String), or null if the call failed
     */
    public static String get_MassSpecAPI_SearchByMassAsync_Results(Double mass, Double range, String[] dbs, String token) {
        String Output = null;
        try {
            final MassSpecAPIStub thisMassSpecAPIStub = new MassSpecAPIStub();
            com.chemspider.www.MassSpecAPIStub.SearchByMassAsync getSearchByMassAsyncInput = new com.chemspider.www.MassSpecAPIStub.SearchByMassAsync();
            ArrayOfString inputDBsArrayofString = new ArrayOfString();
            // Setter names below are assumed from the Axis2-generated stub; check your generated classes.
            inputDBsArrayofString.setString(dbs);
            getSearchByMassAsyncInput.setMass(mass);
            getSearchByMassAsyncInput.setRange(range);
            getSearchByMassAsyncInput.setDbs(inputDBsArrayofString);
            getSearchByMassAsyncInput.setToken(token);
            final SearchByMassAsyncResponse thisSearchByMassAsyncResponse = thisMassSpecAPIStub.searchByMassAsync(getSearchByMassAsyncInput);
            Output = thisSearchByMassAsyncResponse.getSearchByMassAsyncResult();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the GetAsyncSearchStatus operation of ChemSpider's Search SOAP 1.2 webservice.
     * Query asynchronous operation status. Requires transaction ID returned by AsynchSearch operation. Security token is required.
     * @param rid String representing transaction ID returned from a previous search
     * @param token string containing your user token (listed on your profile page)
     * @return String describing status of this search: one of Unknown, Created, Scheduled, Processing, Suspended, PartialResultReady, ResultReady, Failed or TooManyRecords (null if the call failed)
     */
    public static String get_Search_GetAsyncSearchStatus_Results(String rid, String token) {
        String Output = null;
        try {
            final SearchStub thisSearchStub = new SearchStub();
            com.chemspider.www.SearchStub.GetAsyncSearchStatus GetAsyncSearchStatusInput = new com.chemspider.www.SearchStub.GetAsyncSearchStatus();
            // Setter names assumed from the Axis2-generated stub; check your generated class.
            GetAsyncSearchStatusInput.setRid(rid);
            GetAsyncSearchStatusInput.setToken(token);
            final GetAsyncSearchStatusResponse thisGetAsyncSearchStatusResponse = thisSearchStub.getAsyncSearchStatus(GetAsyncSearchStatusInput);
            Output = thisGetAsyncSearchStatusResponse.getGetAsyncSearchStatusResult().toString();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }

    /**
     * Function to call the GetAsyncSearchResult operation of ChemSpider's Search SOAP 1.2 webservice.
     * Returns the list of CSIDs found by AsynchSearch operation. Security token is required.
     * @param rid String representing transaction ID returned from a previous search
     * @param token string containing your user token (listed on your profile page)
     * @return int[] array containing the ChemSpider IDs, or null if the call failed
     */
    public static int[] get_Search_GetAsyncSearchResult_Results(String rid, String token) {
        int[] Output = null;
        try {
            final SearchStub thisSearchStub = new SearchStub();
            com.chemspider.www.SearchStub.GetAsyncSearchResult GetAsyncSearchResultInput = new com.chemspider.www.SearchStub.GetAsyncSearchResult();
            // Setter names assumed from the Axis2-generated stub; check your generated class.
            GetAsyncSearchResultInput.setRid(rid);
            GetAsyncSearchResultInput.setToken(token);
            final GetAsyncSearchResultResponse thisGetAsyncSearchResultResponse = thisSearchStub.getAsyncSearchResult(GetAsyncSearchResultInput);
            Output = thisGetAsyncSearchResultResponse.getGetAsyncSearchResultResult().get_int();
        } catch (Exception e) {
            LOG.log(Level.ERROR, "Problem retrieving ChemSpider webservices", e);
        }
        return Output;
    }
}

Disclaimer: I’m new to Java programming, so please excuse me if you are a Java expert and I’ve said something obvious, offended you with my code or used the wrong terminology anywhere.

Only two days until the start of this year’s Fall ACS meeting in Denver. The ChemSpider team is busy preparing for the meeting, packing bags, polishing talks and honing workshop skills.

Please drop by and say “Hi!”

We’d like to repeat our invitation to everyone at the conference to drop by the RSC booth (Booth 1100). There, of course, you can chat with the ChemSpider team, get a quick demo (and find out more about our latest features), pick up our hot-off-the-press User Guide or scoop some exclusive ChemSpider goodies!

To celebrate the release of the new iPhone/iPad app*, we have a limited number of covers for 3G and 4G iPhones as well as iPads.

*The app itself is free to download from the AppStore.

You can also find out about lots of other things that the RSC does: from publishing books and journals to the promotion of chemistry worldwide. We’ll also have lots of information on our new e-membership option, which is making its debut at this meeting. Also keep an eye out for members of our Editorial staff from journals including OBC, MedChemComm, PCCP, Soft Matter and RSC Advances, who will be scouring the conference in search of lots of new and exciting research.

Natural Product & Synthetic Chemists

I’d like to make an extra special invitation to any synthetic chemists and natural products chemists, from PhD students to Professors (please pass this on to all your friends and colleagues who will be at the meeting). The ChemSpider team really wants to hear about your research. Tell us about your latest publication or the work that you are most proud of, and we can make sure that your key compounds from these publications are in ChemSpider, on a platform freely accessible to chemists everywhere. If you are more interested in methodology, you shouldn’t feel left out – ask us about ChemSpider Synthetic Pages.


ChemSpider related talks and workshops

Antony Williams (most definitely the hardest working man I know) is giving a number of talks and workshops (details below), which are sure to be entertaining as well as thought-provoking and will be well worth squeezing into your schedule.

We look forward to meeting you.


“Aligning scientific expertise and passion through a career path in the chemical sciences”

Colorado Convention Center, Room: 110, Sunday 28th August 2011, 1.40PM – 2PM


“Chemistry in the hand: The delivery of structure databases and spectroscopy gaming on mobile devices”

Colorado Convention Center, Room: 110, Monday 29th August 2011, 9.05AM – 9.35AM


“ChemSpider: Does community engagement work to build a quality online resource for chemists?”

Colorado Convention Center, Room: 110, Tuesday 30th August 2011, 10.10AM – 10.50AM


“An Introduction to ChemSpider – A Combination Platform of Free Chemistry Database, Free Prediction Engines and Wiki Environment”

Colorado Convention Center, Room: 503, Wednesday 31st August 2011, 08.30AM – 11AM


“Structure representations in public chemistry databases: The challenges of validating the chemical structures for 200 top-selling drugs”

Colorado Convention Center, Room: 110, Wednesday 31st August 2011, 10.45AM – 11.05AM

As I mentioned in my blog post a few weeks ago, over the last few months we have been hard at work trying to improve how we organise all of the information and features that can be found when you view a ChemSpider record. And now you can see the fruits of our labour.

We hope that you find the changes we’ve made give you a better and easier user experience. While we think that the changes will be clear and intuitive, I’d like to highlight a few key features in my next few posts.

Inline help

When you look at compound pages and other useful pages, you should now see a lot more question mark symbols dotted throughout the pages. We’ve called this approach inline help: rather than giving you an in-depth help resource on a separate page or as a PDF, it is much more useful to have a little snippet of help right at the point in the page where you need it. Clicking on the question mark symbol should bring up a yellow text box with short guidance (where there is a need to provide more complete help, we’ll provide a link to a page which contains much more detailed information). Of course, do let us know if you have any suggestions for improvements to the help text.

Inline help text


Default infobox ordering*

Many users indicate they most often look for names (or name-structure associations), physical properties and spectral data, so we have put this information at the beginning of the record. Now when you come to a record, by default the Names infobox is the first box listed followed by the Properties, Spectra and the Articles infoboxes.

None of your favourite infoboxes have been removed (in fact we’ve created some new ones – see later). If you don’t like the default order, it is easy to change the ordering of the infoboxes by clicking on the titlebar and dragging them up or down the record. ChemSpider will remember your order and will use this for all future visits to the site from that PC (in the same browser/profile).

*If you have visited the site before, ChemSpider will remember your previous settings. If you want to see the new default order, you will need to clear your browser history or delete the ChemSpider cookies that are saved in your profile.


New infoboxes: Searches and Chemical Vendors

ChemSpider has always had great features, for instance:

The Similar Search – that allows you to find records for compounds that have the same skeleton, but have different stereochemistry or isotopic labels

The ability to load the structure from the current record into a structure search, so that you are able to modify it and construct a new search.

However, this hasn’t always been made very clear. In our redesigned compound page we have aimed to make these powerful search tools easier to discover and understand.

The Searches Infobox

Now you can find these all together in the Searches infobox – along with our Google Scholar custom queries, which allow you to perform one search across publications using all of the validated synonyms (saving you from having to perform many separate searches for individual synonyms). We also help you to perform ‘structure searches’ of Google (in the form of an InChIKey search).

The Search infobox

The Chemical Vendors Infobox

We’ve also created an infobox just to display chemical vendor information, so that it is much quicker to find out whether the compound in the record is commercially available.

The record for Sparteine with its Chemical Vendors infobox


In my next post I’ll finish off discussing the improvements that we’ve made to the site. But of course, if you have any comments or questions about the features I’ve discussed here, please leave a comment below, or send an email to the ChemSpider inbox.