Following on from our previous blog post about extracting chemical structures (as mol files) from their crystal structures (CIF files) in the RSC archive using OpenBabel, it transpired that the Crystallography Open Database (COD) was conducting a similar project to extract the chemical connectivity (in SMILES format) from its large collection of openly accessible CIF files, also using OpenBabel. This opened the possibility of linking ChemSpider to COD (and vice versa) by comparing these SMILES with ChemSpider structures, and has resulted in 34,768 new links being made, each with a corresponding CIF in ChemSpider.
ChemSpider-COD linking example
At the beginning of February there were 262,817 CIFs in COD, of which 78,473 had been converted into SMILES (numbers which have been increasing daily since then). We downloaded these SMILES and searched ChemSpider for each of them using the StructureSearch operation of the ChemSpider Search webservice. Those SMILES which were not currently in ChemSpider were converted into mol files using OpenEye and reviewed by a ChemSpider curator with a view to depositing the suitable structures into ChemSpider as new compounds. The curation meant that we have been able to provide feedback to COD about SMILES that look suspicious and may indicate a problem with the conversion process – for example charge and radical issues, undefined stereochemistry for sugars, missing stereochemistry and the duplication of molecules or fragments within the same CIF. Since ChemSpider is primarily a collection of small organic molecules, many of the large number of metal-organic complexes were omitted simply because they weren’t within our scope.
After the deposition of the suitable new compounds, we identified 34,768 ChemSpider compounds which corresponded to COD crystal structures. These links have been added in the “Datasources” infobox under the “Spectral Data” tab, and the corresponding CIF added to ChemSpider so that it will show in the “CIFs” infobox with a link to the relevant COD webpage.
An example compound that has been linked to COD is Ibuprofen (ChemSpider ID 3544) which has been linked to http://www.crystallography.net/2006278.html. The reciprocal links are due to be added to COD shortly.
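The matching step described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: `search` is a hypothetical stand-in for a structure search (it is not the signature of the ChemSpider webservice), and the COD URL pattern is inferred from the Ibuprofen link above.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# URL pattern assumed from the example link in the post.
COD_URL = "http://www.crystallography.net/{cod_id}.html"


def cod_url(cod_id: str) -> str:
    """Build the COD web page address for a COD entry number."""
    return COD_URL.format(cod_id=cod_id)


def link_cod_to_chemspider(
    cod_smiles: Iterable[Tuple[str, str]],
    search: Callable[[str], Optional[int]],
) -> Tuple[Dict[int, List[str]], List[str]]:
    """Split COD entries into linked compounds and entries needing curation.

    cod_smiles yields (cod_id, smiles) pairs. search is a hypothetical
    function that returns a ChemSpider ID for a SMILES, or None if the
    structure is not found.
    """
    links: Dict[int, List[str]] = {}
    unmatched: List[str] = []
    for cod_id, smiles in cod_smiles:
        csid = search(smiles)
        if csid is None:
            unmatched.append(cod_id)  # candidate for curator review
        else:
            links.setdefault(csid, []).append(cod_url(cod_id))
    return links, unmatched
```

Entries that come back unmatched correspond to the SMILES that were passed to a curator; everything else becomes a compound-to-COD link.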
We would like to thank Miguel Quirós Olozábal (COD) for his help and cooperation with this project.

We are pleased to announce that we have just imported into ChemSpider 1047 CIFs of crystal structures that were previously reported in RSC papers (and are available as ESI for those papers), attached them to the relevant compounds, and linked them back to the original articles and to the CCDC’s webCSD, e.g. example compound with RSC article CIF (see the CIF infobox). Since each CIF that is uploaded into ChemSpider must be associated with a ChemSpider compound, the difficult part of this task was working out a 2D molecular structure (in .mol file format) for each 3D crystal structure (in .cif file format). This is particularly difficult because CIFs only contain information about each atomic position and not how the atoms are bonded to each other in the crystal or whether they are charged or not.
Ultimately we would like this CIF to mol conversion (and the whole upload) to be performed programmatically without human intervention. However, there is no reliable way to do that currently – although programs such as OpenBabel can be used to extract mols from each CIF, the reliability of this conversion isn’t 100%.
So as one of our student intern projects at the University of Southampton this summer (in parallel with another student intern project at Southampton University to share thesis data in ChemSpider) we used OpenBabel (version 2.3.2, run from the command line with the options -i cif inputfilename.txt -o mol -m --unique -d --AddPolarH) to extract mols for all the CIFs in the RSC archive (over 43,000 files as of June 2013), and enlisted Julija Kezina (shown below) to review the results of these conversions, both to ensure that only good structure and CIF pairs would be deposited to ChemSpider and to better understand the problems in the conversion process with a view to fixing them. One problem that became immediately apparent was that the 2D structure obtained was just a projection of the 3D structure along the a cell axis, which is not always the orientation that shows the molecule most clearly, even when the structure had the right chemical connections between the atoms. All mol structures were therefore run through OpenEye’s cleaning algorithm before being reviewed.
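For anyone wanting to script a similar batch run, the options quoted above can be assembled into an argument list as in the Python sketch below. The executable name `babel` and the output file name are assumptions for illustration; only the flags come from the text.

```python
from typing import List


def openbabel_args(cif_path: str, mol_path: str, exe: str = "babel") -> List[str]:
    """Assemble the CIF-to-mol command line quoted in the post.

    exe and mol_path are assumptions; the flags are those given above.
    """
    return [
        exe,
        "-i", "cif", cif_path,  # read the input file as CIF
        "-o", "mol",            # write MDL mol output
        mol_path,               # output file name (assumed, not in the post)
        "-m",                   # produce multiple output files
        "--unique",             # remove duplicate molecules
        "-d",                   # delete hydrogens
        "--AddPolarH",          # add hydrogens to polar atoms only
    ]
```

Keeping the arguments as a list (rather than one shell string) avoids quoting problems with unusual file names when the list is later handed to a process launcher.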
Julija Kezina - Southampton University intern who examined CIF to Mol conversion
Julija compared each structure in the output mol files with those in the original CIF files to judge whether the conversion was accurate or not. In addition, as an extra check, all of the output mol structures were submitted to the ChemSpider validation and standardisation platform to filter out molecules with structural problems (e.g. stereochemistry, valence or congestion issues).
Overall, approximately 30% of the CIF to mol conversions that Julija checked were good, with the right connectivity of atoms and ions (although approximately 30% of these needed the atoms to be repositioned to tidy up the structure, either manually or using ChemDraw’s cleaning functionality). The 1047 of these mols which contain only a single molecule (without solvent molecules or cocrystals etc.) are those which have been deposited into ChemSpider with their corresponding CIFs.
The journals which had the highest successful conversion percentage were Molecular BioSystems (57%), MedChemComm (51%), Organic and Biomolecular Chemistry (44%) and Green Chemistry (44%) – the journals which in general are about small organic molecules.
Julija was working in the National Crystallography Service’s office at the University of Southampton, under the co-supervision of Professor Simon Coles, and we are grateful to them for their help and advice about the finer points of the CIF file format.

Unsuccessful CIF to mol conversions

Running and evaluating OpenBabel on such a large and varied set of structures has given us a useful opportunity to identify and categorise the most common problems encountered. Here we share these, with examples, to help identify some easy fixes that might benefit the whole community; the examples can also be used as test cases when making those fixes. We will report these bugs to the OpenBabel forum and, because OpenBabel is open source, hope to resolve at least some of these issues in the future through collaboration with its other developers.

The following OpenBabel bugs look like they might be most straightforward to fix:

Details Example
  • Category: BAD_NITRO
  • Frequency: 233
  • Description: there are different ways of representing nitro groups in structure drawers – OpenBabel currently does so by producing a mol with a pentavalent nitrogen. In ChemSpider we choose to avoid this in favour of a format with a charge-separated nitro.
  • Solution: Allow OpenBabel to have a different output option for nitro groups to output them as shown in the corrected mol file.
BAD_NITRO example: ob_b209378b_1.jpg

Files: ob_b209378b_1.zip
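To make the BAD_NITRO correction concrete, here is a small Python sketch of the rewrite from a pentavalent nitro group to the charge-separated form. It works on a deliberately simple, hypothetical atom and bond structure (not OpenBabel's own data model) and is an illustration rather than a proposed patch.

```python
from typing import Dict, List


def charge_separate_nitro(atoms: List[Dict], bonds: List[List[int]]) -> int:
    """Rewrite pentavalent -N(=O)=O groups as -[N+](=O)[O-] in place.

    atoms: [{"el": "N", "charge": 0}, ...]; bonds: [[i, j, order], ...].
    Returns the number of nitro groups rewritten.
    """
    # Count bonds per atom so that only terminal oxygens are considered.
    degree = [0] * len(atoms)
    for i, j, _ in bonds:
        degree[i] += 1
        degree[j] += 1

    fixed = 0
    for n, atom in enumerate(atoms):
        if atom["el"] != "N" or atom["charge"] != 0:
            continue
        # Total bond order on this nitrogen; 5 means it is pentavalent.
        valence = sum(order for i, j, order in bonds if n in (i, j))
        # Double bonds from this nitrogen to neutral, terminal oxygens.
        oxo = []
        for bond in bonds:
            i, j, order = bond
            if order != 2 or n not in (i, j):
                continue
            other = j if i == n else i
            if (atoms[other]["el"] == "O" and atoms[other]["charge"] == 0
                    and degree[other] == 1):
                oxo.append((bond, other))
        if valence == 5 and len(oxo) == 2:
            bond, oxygen = oxo[1]
            bond[2] = 1                   # one N=O becomes N-O
            atom["charge"] = 1            # nitrogen carries the positive charge
            atoms[oxygen]["charge"] = -1  # that oxygen carries the negative charge
            fixed += 1
    return fixed
```

The function only touches neutral nitrogens with a total bond order of five and exactly two terminal oxo groups, so it leaves already charge-separated nitro groups alone.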

  • Category: BAD_MULT
  • Frequency: 434
  • Description: Duplicate (exactly identical, including stereochemistry) molecules are present in the resulting mol file despite running OpenBabel with the --unique option (which should filter out duplicate molecules based on their InChIs)
  • Solution: Fix OpenBabel when run with the --unique option so that it works.
BAD_MULT example: nj_b306072a_1.jpg

Files: nj_b306072a_1.zip

  • Category: BAD_MISSINGPARTOFMOLECULE
  • Frequency: 724
  • Description: Part of the molecule is missing
  • Cause: OpenBabel doesn’t understand crystal symmetry – only the atoms in the CIF that are explicitly listed with positions are included in the resulting mol file, and those that are inferred by symmetry are not.
  • Solution: Make OpenBabel generate the full molecule from the symmetry in the CIF file, or recommend that a script/program that can process a CIF to generate another CIF with all atoms is run before OpenBabel.
BAD_MISSINGPARTOFMOLECULE example: ce_b202304k_5

Files: ce_b202304k_5.zip
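The pre-processing idea in the BAD_MISSINGPARTOFMOLECULE solution (generate the symmetry-equivalent atoms before conversion) can be sketched in Python as below. This is a minimal illustration, assuming symmetry operations written in the usual CIF style such as `-x, y+1/2, -z`; it fills the unit cell only, and assembling a connected molecule across cell boundaries would need a further step that is not shown.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def apply_symop(symop: str, xyz: Coord) -> Coord:
    """Apply one symmetry operation (e.g. '-x, y+1/2, -z') to fractional coordinates."""
    values = dict(zip("xyz", xyz))
    result = []
    for component in symop.lower().replace(" ", "").split(","):
        total = 0.0
        # Split into signed terms, e.g. '-x+1/2' -> ['', '-x', '1/2'].
        for term in component.replace("-", "+-").split("+"):
            if not term:
                continue
            sign = -1.0 if term.startswith("-") else 1.0
            body = term.lstrip("-")
            total += sign * (values[body] if body in values else float(Fraction(body)))
        result.append(total % 1.0)  # wrap back into the unit cell
    return tuple(result)


def _same_site(p: Coord, q: Coord, tol: float) -> bool:
    """Compare two fractional positions, allowing for wrap-around at the cell edge."""
    return all(min(abs(a - b), 1.0 - abs(a - b)) < tol for a, b in zip(p, q))


def expand_sites(symops: Sequence[str], sites: Sequence[Tuple[str, Coord]],
                 tol: float = 1e-3) -> List[Tuple[str, Coord]]:
    """Generate every symmetry-equivalent site in the cell, skipping coincident copies."""
    expanded: List[Tuple[str, Coord]] = []
    for label, xyz in sites:
        for op in symops:
            new = apply_symop(op, xyz)
            # Atoms on special positions map onto themselves; keep one copy only.
            if not any(_same_site(new, pos, tol) for _, pos in expanded):
                expanded.append((label, new))
    return expanded
```

An atom sitting on an inversion centre, for example, is emitted once rather than twice, which avoids introducing the kind of duplicates described under BAD_MULT.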

  • Category: BAD_PARTIALOCCUPANCY
  • Frequency: 432
  • Description: partial occupancy of multiple sites for a particular atom in the CIF file
  • Cause: In CIF files, positions of multiple sites are sometimes specified with occupancy less than one – OpenBabel doesn’t recognise this and effectively assumes that the occupancy of all sites is one, so that there are duplicates of some atoms or fragments in the mol file.
  • Solution: Where the _atom_site_occupancy is less than one, group together atoms into those which are alternatives of each other (by type, proximity, and those which add up to a total occupancy of 1) and choose only one of them to include in the final mol file (that with the highest site occupancy, or if two have equal occupancies of e.g. 0.5 then pick one at random). Note that there needs to be consistency, so that if for example a C is discarded, then all of the adjoining H’s with partial occupancy are also discarded but those bonded to the C that is included are included (as in the attached example).
BAD_PARTIALOCCUPANCY example: md_c2md20054f_1.jpg

Files: md_c2md20054f_1.zip
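The selection rule suggested for BAD_PARTIALOCCUPANCY can be illustrated with the Python sketch below. Note the simplification: instead of grouping alternatives by type and proximity as proposed above, it assumes the CIF already labels them with the `_atom_site_disorder_assembly` and `_atom_site_disorder_group` items, and it breaks ties deterministically (lowest group number) rather than at random. The atom dictionaries are a hypothetical structure used for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def resolve_disorder(atoms: List[Dict]) -> List[Dict]:
    """Keep ordered atoms plus one alternative per disordered assembly.

    Each atom is {"label", "occupancy", "assembly", "group"}; group is an
    integer for disordered atoms and None for fully ordered ones.
    """
    # Collect the occupancies seen for each (assembly, group) alternative.
    occupancies = defaultdict(list)
    for atom in atoms:
        if atom["group"] is not None:
            occupancies[(atom["assembly"], atom["group"])].append(atom["occupancy"])

    # For each assembly pick the alternative with the highest mean occupancy;
    # ties go to the lowest group number so the result is reproducible.
    best = {}
    for (assembly, group), values in occupancies.items():
        mean = sum(values) / len(values)
        current = best.get(assembly)
        if current is None or mean > current[0] or (mean == current[0] and group < current[1]):
            best[assembly] = (mean, group)

    # Whole groups are kept or dropped together, so a carbon and its
    # hydrogens in the same group stay consistent.
    return [a for a in atoms
            if a["group"] is None or best[a["assembly"]][1] == a["group"]]
```

Because whole groups are kept or discarded together, a discarded carbon takes its partially occupied hydrogens with it, which is the consistency requirement noted in the solution.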

Many of the problems were caused by idiosyncrasies or errors in the input CIFs. On the whole these weren’t handled well by OpenBabel: rather than writing an error message and terminating, in the majority of cases the program went into an infinite loop and hung. Because of this, and because the OpenBabel conversions were part of a longer script, all OpenBabel jobs had to be run with an arbitrary timeout, after which any job still running was killed; this may have discarded some valid but long-running OpenBabel jobs. We will investigate whether there is a validation program that can be run automatically on CIFs to filter out ones with these problems (similar to the CCDC’s EnCIFer but which can be run programmatically), but it would be relatively straightforward to make OpenBabel more reliable by having it exit cleanly when it encounters these problems, so that pre-validation isn’t necessary. These problems are listed in the table below:

Details Example
  • Category: CIF_NOCOORDINATES
  • Frequency: 378
  • Description: CIF doesn’t contain any coordinates
  • Cause: Some CIFs contain e.g. powder diffraction refinement data and don’t contain coordinates.
  • Solution: OpenBabel already issues an error: “CIF Error: no atom found ! (in data block:XXX)” – simply abort the program if this is found (rather than trying to continue).

Files: CC_B502254A_3.txt

  • Category: CIF_MISSINGLOOP
  • Frequency: 85
  • Description: CIF is missing a “loop_” line
  • Solution: Do an initial check that there is at least one loop_ line in the expected place before attempting to do the conversion.
CIF_MISSINGLOOP example: ob_c2ob25400j_2.jpg

Files: ob_c2ob25400j_2.zip

  • Category: CIF_COMMENTEDFIELD
  • Frequency: 36
  • Description: if there is a CIF field name in a commented section of the CIF, OpenBabel doesn’t ignore it and goes into an infinite loop
  • Solution: It would be trivial to make sure that OpenBabel ignores CIF field names which are commented out (between a pair of semicolons).
CIF_COMMENTEDFIELD example: dt_c3dt33040k_1.jpg

Files: dt_c3dt33040k_1.zip
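Until such fixes exist, the three checks in this table and the timeout described earlier can be approximated outside OpenBabel. The Python sketch below is one possible pre-filter and wrapper, not a full CIF validator: it treats any line starting with a semicolon as opening or closing a text block, strips `#` comments crudely (it does not handle `#` inside quoted values), and the function names are hypothetical.

```python
import subprocess
from typing import List, Optional


def cif_problem(text: str) -> Optional[str]:
    """Return a short reason why a CIF should be skipped, or None if it looks usable."""
    in_text_block = False
    has_loop = False
    has_coords = False
    for line in text.splitlines():
        if line.startswith(";"):
            in_text_block = not in_text_block  # semicolon lines open/close a text block
            continue
        if in_text_block:
            continue                           # ignore field names inside the text block
        code = line.split("#", 1)[0].strip().lower()  # crude comment removal
        if code.startswith("loop_"):
            has_loop = True
        elif code.startswith(("_atom_site_fract_x", "_atom_site_cartn_x")):
            has_coords = True
    if not has_loop:
        return "no loop_ line"
    if not has_coords:
        return "no atomic coordinates"
    return None


def run_with_timeout(args: List[str], seconds: float) -> Optional[int]:
    """Run a command; return its exit code, or None if it was killed on timeout."""
    try:
        return subprocess.run(args, timeout=seconds, capture_output=True).returncode
    except subprocess.TimeoutExpired:
        return None
```

A batch script could call `cif_problem` first and only hand files that return `None` to `run_with_timeout`, logging the reason for everything else.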

The following OpenBabel bugs were the most frequent in occurrence, but will be difficult to fix. They arise from the problem that the CIF format does not record charges on atoms/ions or the types of bond between them, so OpenBabel needs to work them out, which is hard to do correctly.

Details Example
  • Category: BAD_CHARGEMISSING
  • Frequency: 830
  • Description: One or more ions in the molecule have the wrong charge on them in the resulting mol file
BAD_CHARGEMISSING example: md_c2md20105d_1.jpg

Files: md_c2md20105d_1.zip

  • Category: BAD_WRONGCOORDINATION
  • Frequency: 747
  • Description: One or more atoms or ions in the molecule have the wrong coordination – problem observed in metal ions, S, P, Se and B
BAD_WRONGCOORDINATION example: ob_b314176d_1.jpg

Files: ob_b314176d_1.zip

  • Category: BAD_BONDMISSING
  • Frequency: 587
  • Description: One or more of the bonds in the molecule are of the wrong order e.g. a single bond instead of a double bond.
BAD_BONDMISSING example: MD_c3md00077j_1.jpg

Files: c3md00077j_1.zip

  • Category: BAD_WRONGBOND
  • Frequency: 452
  • Description: Wrong sequence of single/double bonds.
BAD_WRONGBOND example: nj_b301045g_3.jpg

Files: nj_b301045g_3.zip

  • Category: BAD_NOCOORDL
  • Frequency: 52
  • Description: no coordination to a ligand.
BAD_NOCOORDL example: ob_b307014j_1.jpg

Files: ob_b307014j_1.zip

  • Category: BAD_MISSINGH
  • Frequency: 18
  • Description: missing hydrogen.
BAD_MISSINGH example: ob_b311669g_3.jpg

Files: ob_b311669g_3.zip

There were also some problem mol files produced which either won’t be able to be fixed by OpenBabel (since they resulted from either errors or limitations of the input CIF files which cannot be fixed retrospectively) or are too difficult to fix and/or too infrequently occurring to be worth the effort:

  • There were 237 cases where there were solvent molecules in the CIF (many of which have missing hydrogens, partial occupancy of the molecule or part of the molecule etc.) which give rise to spurious oxygens, fragments of molecules and radicals in the resulting mol file (see example files for nj_b306778e_1.zip). 148 of these cases are just water solvent molecules either with missing or detached hydrogen atoms. The poor definition of the solvent molecules is a limitation of CIF files from diffraction, so it is not possible for OpenBabel to better define them in the output mol that is derived from them. However, running OpenBabel with the -r option to remove all but the largest contiguous fragment was quite successful at removing these problem solvent molecules, so no further action is required to deal with this problem and we will use this option in the future.
  • There were 81 cases where there was at least one missing hydrogen in the original CIF (or in 3 cases, all hydrogens missing) – see example files for ob_B500173K_3.zip.
  • Some CIFs contain crystal structures which correspond to continuous networks rather than small molecules (e.g. polymers, MOFs, zeolites, POMs) which cannot meaningfully be captured in mol format – see example files for ce_b309410c_3.zip.
  • There were a few (24) cases where the stereochemistry in the mol file obtained is incorrectly defined. However, because on the whole the stereochemistry was well interpreted by OpenBabel and these cases were relatively few, it probably isn’t worth upsetting the apple cart to investigate these further – see example files for ob_b407215b_4.zip.
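The effect of keeping only the largest contiguous fragment, as the -r option does for the solvent cases above, can be illustrated with a small connected-components search in Python. This sketch shows the idea only and is not OpenBabel's implementation; it assumes "largest" means the most atoms and that atoms are identified by index.

```python
from typing import Iterable, List, Set, Tuple


def largest_fragment(n_atoms: int, bonds: Iterable[Tuple[int, int]]) -> Set[int]:
    """Return the atom indices of the largest connected fragment."""
    neighbours: List[Set[int]] = [set() for _ in range(n_atoms)]
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    seen: Set[int] = set()
    best: Set[int] = set()
    for start in range(n_atoms):
        if start in seen:
            continue
        # Flood-fill one fragment starting from an unvisited atom.
        fragment, stack = {start}, [start]
        while stack:
            for nxt in neighbours[stack.pop()]:
                if nxt not in fragment:
                    fragment.add(nxt)
                    stack.append(nxt)
        seen |= fragment
        if len(fragment) > len(best):
            best = fragment
    return best
```

A lone water oxygen with no bonded hydrogens forms a fragment of size one, so it is dropped whenever the main molecule is larger.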

We previously reported an initial proof of concept of an “Insert from ChemSpider” TinyMCE plugin which was integrated with Southampton University’s ELN LabTrove to add compound images from ChemSpider to posts. We are pleased to announce version 2 of that plugin, which now allows a bench chemist who is planning or reporting a reaction to construct a stoichiometry table of the chemicals used and produced in it, as shown at the bottom of this post. Constructing one of these tables manually can be a tedious and error-prone task, but now when a LabTrove user is writing up an experiment post about a reaction, they can click on the “Insert from ChemSpider” TinyMCE plugin button in the editor, which guides them through the task and retrieves compound properties from ChemSpider so that they don’t need to draw out the compound, or type in its name, molecular weight or formula mass. The amount of each substance can be expressed in a number of different ways – ratio (equivalence), number of moles, mass or volume (depending on the compound state and reaction role) – and the user only needs to enter one of these properties for it to be automatically inter-converted into the other relevant properties (including calculating equivalents). The product yields are also calculated (as the percentage of the amount actually recorded compared to the amount calculated from the ratio of the product and the amount of the limiting reactant).
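The inter-conversions described above are simple arithmetic, and a short Python sketch makes them explicit. This is an illustration of the calculations only, not the widget's code; the function names are hypothetical.

```python
def moles_from_ratio(ratio: float, limiting_moles: float) -> float:
    """Moles of a substance from its equivalents relative to the limiting reactant."""
    return ratio * limiting_moles


def mass_g(moles: float, molecular_weight: float) -> float:
    """Mass in grams from moles and molecular weight (g/mol)."""
    return moles * molecular_weight


def solution_volume_l(moles: float, molarity: float) -> float:
    """Volume of solution in litres from moles and molarity (Moles/L)."""
    return moles / molarity


def percent_yield(actual_moles: float, limiting_moles: float,
                  product_ratio: float = 1.0) -> float:
    """Yield as a percentage of the amount expected from the limiting reactant."""
    return 100.0 * actual_moles / (limiting_moles * product_ratio)
```

With the 3,4-dihydro-2H-pyran row of the example table at the bottom of this post (ratio 2.80, MW 84.1, 11.79 Moles/L, limiting reactant 0.00500 Moles), these functions give 0.0140 Moles, 1.18 g and 0.00119 L to three significant figures, and 0.00456 Moles of product against 0.00500 Moles expected gives a 91.2 % yield, matching the table.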

Another feature is that it is possible to construct advanced stoichiometry tables which are initially created during a planning stage (during which planned amounts of reactants and products are entered and calculated), but with a separate column to add actual amounts of reactants and products at a later stage.

This functionality was used as part of the intern project to share compound and reaction data in LabTrove and ChemSpider to create an example reaction page in LabTrove. The top of that page was made using the ChemSpider Reactions template, and the table at the bottom with the “Insert from ChemSpider” plugin.

A demonstration video showing how to use this new functionality is shown below:

The TinyMCE plugin relies on the new ChemSpider “Edit Stoichiometry Table” jQuery widget which contains all of the functionality behind it. The widget can be used independently of LabTrove, for example in a ChemSpider widget example page, and as such can be easily integrated into different ELNs and websites. We will also be using it in conjunction with the imminent ChemSpider Reactions platform to allow stoichiometry table data to be uploaded and hosted there with other reaction data. To allow the widget to be flexible and used by different applications for different purposes, after a stoichiometry table has been created, the widget allows it to be retrieved either as pure HTML, or as a JSON string (which can be accepted as an input option for the widget to display and edit an existing stoichiometry table), or as an HTML table with the JSON embedded in it as a data attribute. The latter option allows a stoichiometry table to be added to a webpage with the option to edit it at a later point.
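The third output option, an HTML table carrying its own JSON, is easy to picture with a small round-trip example. The Python sketch below is not the widget's code (the widget itself is a jQuery component), and the attribute name `data-stoichiometry` is an assumption made for illustration.

```python
import html
import json
from html.parser import HTMLParser
from typing import Dict, List, Optional


def table_with_json(rows: List[Dict[str, str]], attr: str = "data-stoichiometry") -> str:
    """Render rows as an HTML table that also carries the rows as JSON in a data attribute."""
    payload = html.escape(json.dumps(rows), quote=True)
    columns = list(rows[0]) if rows else []
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(r.get(c, '')))}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f'<table {attr}="{payload}"><tr>{head}</tr>{body}</table>'


class _AttrReader(HTMLParser):
    """Remember the value of one attribute on the first <table> tag seen."""

    def __init__(self, attr: str) -> None:
        super().__init__()
        self.attr = attr
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "table" and self.value is None:
            self.value = dict(attrs).get(self.attr)


def json_from_table(markup: str, attr: str = "data-stoichiometry"):
    """Recover the embedded rows from the table markup, or None if absent."""
    reader = _AttrReader(attr)
    reader.feed(markup)
    return json.loads(reader.value) if reader.value is not None else None
```

Because the data travels with the markup, a page that displays the table can later hand the recovered JSON back to an editor without needing a separate store.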

This is the first version of the edit stoichiometry table widget, and it will be used and tested and revised accordingly. Further developments are also planned for the “Insert from ChemSpider” TinyMCE plugin used in LabTrove.

If you would like to integrate the ChemSpider “Edit Stoichiometry Table” widget with your website, web-based ELN, TinyMCE editor (e.g. WordPress), or LabTrove installation to use and test it and provide us with feedback then please contact us at chemspiderdev@rsc.org.

Output of ChemSpider “Edit Stoichiometry Table” widget: Stoichiometry Table of Substances Used/Produced

5-Iodo-1-pentene
  • Compound Information: ChemSpider ID: 9162369; Name: 5-Iodo-1-pentene; Formula: C5H9I; MW: 196; Safety Information: (blank)
  • Substance Information: Role: limiting reactant; State: solution; Source: (blank); Molarity: (blank) Moles/L; Concentration: 7.95 g/L; Solvent: THF; Comments: (blank)
  • Planned Amounts: Ratio: 1.00; Amount: 0.00500 Moles*; Mass: 0.980 g; Volume: 0.123 L
  • Actual Amounts: Ratio: 1.00; Amount: 0.00500 Moles*; Mass: 0.980 g; Volume: 0.123 L

3,4-Dihydro-2H-pyran
  • Compound Information: ChemSpider ID: 7789; Name: 3,4-Dihydro-2H-pyran; Formula: C5H8O; MW: 84.1; Safety Information: (blank)
  • Substance Information: Role: reactant; State: solution; Source: (blank); Molarity: 11.79 Moles/L; Concentration: (blank) g/L; Solvent: THF; Comments: (blank)
  • Planned Amounts: Ratio: 2.80; Amount: 0.0140 Moles*; Mass: 1.18 g; Volume: 0.00119 L
  • Actual Amounts: Ratio: 2.80; Amount: 0.00750 Moles*; Mass: 0.631 g; Volume: 0.000636 L

t-BuLi
  • Compound Information: ChemSpider ID: 10254347; Name: t-BuLi; Formula: C4H9Li; MW: 64.1; Safety Information: (blank)
  • Substance Information: Role: reactant; State: solution; Source: (blank); Molarity: 1.7 Moles/L; Concentration: (blank) g/L; Solvent: (blank); Comments: (blank)
  • Planned Amounts: Ratio: 1.60; Amount: 0.00800 Moles*; Mass: 0.512 g; Volume: 0.00471 L
  • Actual Amounts: Ratio: 1.60; Amount: 0.00600 Moles*; Mass: 0.384 g; Volume: 0.00353 L

6-(4-Penten-1-yl)-3,4-dihydro-2H-pyran
  • Compound Information: ChemSpider ID: 29341335; Name: 6-(4-Penten-1-yl)-3,4-dihydro-2H-pyran; Formula: C10H16O; MW: 152; Safety Information: (blank)
  • Substance Information: Role: product; State: liquid; Source: (blank); Purity: 100%; Comments: (blank)
  • Planned Amounts: Ratio: 1.00*; Amount: 0.00500 Moles; Mass: 0.761 g
  • Actual Amounts: Ratio: 1.00; Amount: 0.00456 Moles*; Mass: 0.694 g; Yield: 91.2 %

(* indicates entered value)

We have just uploaded three new short video tutorials walking you through how to search, comment on, and submit ChemSpider Synthetic Pages procedures.

An introduction to searching ChemSpider Synthetic Pages

Reading and commenting on ChemSpider Synthetic Pages articles

Learn how you can share your work on ChemSpider Synthetic Pages

We welcome any feedback, as well as suggestions for topics for new help videos.

We have just added the most up-to-date database files for ChEMBL and ChEBI to ChemSpider.

Some of the structures from ChEMBL can be found here. Structures from ChEBI may be viewed here. Thank you to the ChEBI and ChEMBL teams for making this data available.

Recently I heard of someone who cycled the 1400 km from John O’Groats to Land’s End, with a headwind all the way, because it looked on the map as if it was downhill and hence easier. (I am grateful to Neil Swainston of the University of Manchester for this anecdote.)

You might think that “down” on the page is unlikely to be “down” in 3D space, but there is an interesting exception to this, at least for certain interpretations of “down”.

This summer there have been a number of students from the University of Southampton doing internships on joint projects between the university and the Royal Society of Chemistry and ChemSpider. Three of these students have been sifting through theses from past members of Richard Whitby’s research group in order to extract the compound, spectra and reaction data in them (and in linked lab notebooks and archived spectra files) and share these in LabTrove, ChemSpider, and CSSP. The students – Alex Hartke, Yet Wai Lee and Josh Whittam (all 2nd year undergraduates) – are shown below together with the boxes of thesis data, lab notebooks and spectra printouts that they digitised.
Southampton University Interns
Between them they digitised 7 theses, by A. Henderson, L. Sayer, D. Owen, D. Macfarlane, F. Giustiniano, G. Saluste and J. Stec, which resulted in 1035 LabTrove pages being published to the Whitby Group’s LabTrove blog.

The theses were a rich source of compound information – including compound structures, names, properties and spectra, all of which were also deposited into ChemSpider resulting in 208 new compound pages, and about 600 spectra.

For this project the students manually deposited the compound information into LabTrove and then deposited the compounds and spectra to ChemSpider. However, we are currently developing a range of ChemSpider jQuery widgets which can be integrated into web-based ELNs such as LabTrove, and which will make it easier to enter compound information from ChemSpider into experiments, and also to publish compound and reaction data from the ELNs to ChemSpider, CSSP and ChemSpider Reactions. This will follow on from the initial proof of concept to retrieve ChemSpider information and enter it into LabTrove pages.

With this long-term aim in view, the LabTrove pages in which the interns stored the compound and reaction data were structured using LabTrove templates, and this structuring will make it easier for publishing widgets to understand the data and process it in the correct way. In this way, the project was partly a test to ensure that the templates were suitable for storing compound data in LabTrove. As well as the ChemSpider compound and associated data template (with corresponding help page), templates were also written to store reaction data in a formatted way, since the theses were primarily focused on the synthesis of compounds. At their simplest, basic reaction data can be stored in LabTrove using the ChemSpider Reactions template (and corresponding help page), and eventually posts written in this format will be easily publishable to ChemSpider Reactions. More detailed reaction data can be stored using the ChemSpider SyntheticPages style reaction template (and corresponding help page). The initial aim was to deposit all of this reaction data into ChemSpider SyntheticPages, but it became clear that it was difficult for anyone other than the researcher who conducted the reaction, or their supervisor, to supply the necessary level of detail for CSSP submissions, and in particular that this level of detail couldn’t easily be reached by retrospectively abstracting theses. As a result, only a handful of reactions were submitted to CSSP, and the majority (over 500) were stored in LabTrove for future submission to ChemSpider Reactions.

If reactions can be published easily from ELNs to ChemSpider Reactions, and that platform is easily queryable by other researchers and their applications when performing new reactions, this will be a major step towards the aims of Dial-a-molecule (an EPSRC Grand Challenge network). An important part of the reaction data which needs to be captured is the stoichiometry table of substances used and produced in a reaction. However, these stoichiometry tables are too complicated to incorporate into a LabTrove template, so the LabTrove reaction templates will be used in conjunction with a new ChemSpider jQuery widget, currently being integrated with LabTrove, which will construct them (more details to follow on this blog shortly!). The widget performs ChemSpider lookups to retrieve compound information, and will calculate equivalents, thereby saving the researcher time when working out the amounts of reactants needed or yields of products obtained. An example of a reaction post which was initially created using the ChemSpider Reactions template and then supplemented by adding a stoichiometry table to it using the ChemSpider Edit Stoichiometry Table widget is shown here.

If you are a LabTrove user and wish to use the ChemSpider templates, their source is available via their links above, and instructions for using templates in LabTrove are documented here.

Hot on the heels of our announcement a few weeks ago, about how we are getting more data into ChemSpider from researchers, we are very pleased to announce that Synthonix have sent us more than a hundred new proton NMR spectra (in addition to the hundreds of spectra that they kindly provided previously). We have now added these spectra to the database. Because the data was provided as real measured spectra and not as an image the spectra are all interactive. Click the image below to be taken to the record containing the original interactive spectrum.

synthonix-full

As you can see, the initial view gives a good overview of the spectrum, but it isn’t easy to see the peak splittings. With an image (jpg or pdf) that might be the end of the story – it would certainly be very difficult to get the level of detail that is available in the second zoom image below.

synthonix-zoomsynthonix-zoom2


While this post is a good opportunity to say ‘Thank you very much!’ to Synthonix, we would like to encourage researchers or other chemical vendors who might be able to contribute spectra to follow their lead. It’s a great way to raise your profile and build links with potential collaborators or good will with customers. If you are interested in knowing more, please email us directly.

 

Are you a representative of a chemical vendor? Interested in listing your catalogue on ChemSpider, but not quite sure how to go about it? Want some tips for how to improve your visibility on our site?

We now have a Chemical Vendors information page, where you can find detailed instructions for submitting your catalogue to us. We look forward to receiving your catalogue!

The ACS meeting in Indianapolis is going to be a very busy week for the RSC, as evidenced by the long list of presentations we will be delivering at the conference. Come along and see what we are up to.

SUNDAY

1. PRESENTER: Antony Williams

PAPER ID: 11394 PAPER TITLE: Apps and approaches to mobilizing chemistry from the Royal Society of Chemistry SESSION: Chemistry on Tablet Computers DAY & TIME OF PRESENTATION: September 08, 2013 from 8:10 am to 8:40 am LOCATION: Indiana Convention Center, Room: 141

2. PRESENTER: Simon Coles (University of Southampton)

PAPER ID: 13 PAPER TITLE: Tablets in the lab: Enabling the flow of chemical synthesis data into a chemistry repository. SESSION: Chemistry on Tablet Computers DAY & TIME OF PRESENTATION: September 08, 2013 from 10:55 am to 11:25 am LOCATION: Indiana Convention Center, Room: 141

3. PRESENTER: Antony Williams

PAPER ID: 12750 PAPER TITLE: Accessing chemical health and safety data online using Royal Society of Chemistry resources SESSION: New Horizons in Chemical Health and Safety DAY & TIME OF PRESENTATION: September 08, 2013 from 5:35 pm to 5:55 pm LOCATION: Indiana Convention Center, Room: 115

MONDAY

4. PRESENTER: Antony Williams

PAPER ID: 12406 PAPER TITLE: @ChemConnector and my personal experiences in participating in the expanding social networks for science SESSION: Role and Value of Social Networking in Advancing the Chemical Sciences DAY & TIME OF PRESENTATION: September 09, 2013 from 8:20 am to 8:45 am LOCATION: Indiana Convention Center, Room: 141

5. PRESENTER: Bibi Campos-Seijo

PAPER ID: 52 PAPER TITLE: Exploiting the digital landscape to advance the chemical sciences SESSION: Role and Value of Social Networking in Advancing the Chemical Sciences DAY & TIME OF PRESENTATION: September 09, 2013 from 1:00 pm to 1:25 pm LOCATION: Indiana Convention Center, Room: 141

6. PRESENTER: Valery Tkachenko

PAPER ID:  57 PAPER TITLE: Building support for the semantic web for chemistry at the Royal Society of Chemistry SESSION: Joint CINF-CSA Trust Symposium: Semantic Technologies in Translational Medicine and Drug Discovery DAY & TIME OF PRESENTATION: September 09, 2013 from 1:35 pm to 2:05 pm LOCATION: Indiana Convention Center, Room: 142

7. PRESENTER: Antony Williams

PAPER ID: 14637 PAPER TITLE: Practical semantics in the pharmaceutical industry: The Open PHACTS project SESSION: Joint CINF-CSA Trust Symposium: Semantic Technologies in Translational Medicine and Drug Discovery DAY & TIME OF PRESENTATION: September 09, 2013 from 3:20 pm to 3:50 pm LOCATION: Indiana Convention Center, Room: 142

WEDNESDAY

8. PRESENTER: Antony Williams

PAPER ID: 10738 PAPER TITLE: Social profile of a chemist online: The potential profits of participation SESSION: Before and After Lab: Instructing Students in ‘Non-Chemical’ Research Skills DAY & TIME OF PRESENTATION: September 11, 2013 from 1:35 pm to 2:05 pm LOCATION: Indiana Convention Center, Room: 141

9. PRESENTER: Antony Williams

PAPER ID: 11513 PAPER TITLE: Digitizing documents to provide a public spectroscopy database SESSION: Back to the Future: Print Resources in a Digital World DAY & TIME OF PRESENTATION: September 11, 2013 from 3:15 pm to 3:45 pm LOCATION: Indiana Convention Center, Room: 141

 

THURSDAY

10. PRESENTER: Antony Williams

PAPER ID: 11519 PAPER TITLE: Importance of standards for data exchange and interchange on the Royal Society of Chemistry eScience platforms SESSION: Exchangeable Molecular and Analytical Data Formats and their Importance in Facilitating Data Exchange DAY & TIME OF PRESENTATION: September 12, 2013 from 9:10 am to 9:40 am LOCATION: Indiana Convention Center, Room: 140

11. PRESENTER: Jean-Claude Bradley

PAPER ID:  114 PAPER TITLE: Practical open data exchange formats for open organic chemistry projects SESSION: Exchangeable Molecular and Analytical Data Formats and their Importance in Facilitating Data Exchange DAY & TIME OF PRESENTATION: September 12, 2013 from 10:55 am to 11:25 am LOCATION: Indiana Convention Center, Room: 140

 

 

The RSC eScience Team has always been keen to get more links to literature references and we are currently engaged in work to extract much more information from the wealth of articles that are published in our journals (keep your eyes peeled for more information on this in the future).

The RSC now encourages authors for several of our journals to supply extra information, structures and spectra in their original file formats – which are attached to the article as supplementary information. Already we’ve seen several submissions of data that we have incorporated into ChemSpider records, both enriching the ChemSpider database and also showcasing the research of these authors through their publications. In this way, the RSC hopes to encourage the addition of reusable data files to the research paper as the start of its efforts to promote increased data sharing within chemical science research.

In a few short weeks we’ve received a number of submissions from authors that include key chemical structures as mol files and in some cases extra data including 1H and 13C NMR spectra as well as UV and IR spectra.

We’ve selected a few examples that show how this data not only enriches ChemSpider but, we hope, has benefits to researchers as authors and as consumers of chemical data.

Below are 4 articles for which we received additional supplementary files – the first 2 entries are from submissions where mol files were provided, which allowed us to deposit the structures and associate the article references with the ChemSpider records. The 3rd and 4th entries are examples of submissions where spectra were also provided.

10.1039/c3ob40642c  - 26 structures – 4  new to ChemSpider (eg. CSID 28945604)

10.1039/C3OB40745D  – 26 structures – 24 new to ChemSpider (eg. CSID 28941464)

10.1039/C3CC43488E - 9 structures – 7 new to ChemSpider (eg. CSID 16462787 and CSID 28605621 - both of which have 1H and 13C NMR spectra)

10.1039/C3CC42396D  – 5 new ChemSpider records (eg. CSID 28945607 - Has 1H 13C  NMR and also IR and Raman spectra)

 

A closer look at the data

Taking this last example, let us investigate some of the benefits of supplying these files along with the submission:

1. With the mol file that was supplied we were able to create a new ChemSpider record (CSID 28945607) and then use the DOI of the article to insert the literature reference of the source article.

Feringa-Browne_reference

2. With the spectra files that were supplied we were able to add them to the ChemSpider record as interactive components; we hope that making the spectra interactive makes them easier to use. Let’s compare the PDF version of the 1H NMR spectrum with the ChemSpider version – the screenshots below are taken as they appeared on my screen (then scaled to 80% to fit in the blog post). At first glance, they seem similar but...

1H NMR Spectrum

1H NMR Spectrum from Supplementary Information PDF

 

Feringa-Browne spectrum in ChemSpider record

1H NMR Spectrum from CSID 28945607


In the ChemSpider record you can use your cursor to easily select an area of interest and instantly see the peaks’ fine structure.

Feringa-Browne_CS_zoom_spectrum

The authors have supplied a very high quality PDF, so you can zoom in on the PDF to get a better view of the splitting of the peaks, but there are always limitations. In the final image (below) we can see a comparison of the peak centred at 4.23 ppm. In the embedded spectrum on ChemSpider you can clearly distinguish the splitting, but looking at the same peak from the PDF we reach the limit of the resolution (this shot was taken when the PDF was scaled to 1200% in the viewer).

Feringa-Browne_Zoom_comp_spectrum

Comparison of the peak centred at 4.23 ppm in ChemSpider and the PDF

 

So by supplying their NMR data as supplementary information it has become easier to discover and use.

3. We can provide links to relevant sources and a comment that can contain extra information above any spectra or CIF files that are displayed in ChemSpider. In this way, rather than your article pointing others to useful data within it, you are using your data to showcase and point back to your article.

 

How can you get involved?

If you are already publishing in the RSC  journals ChemComm, OBC, MedChemComm and Toxicology Research - when you receive the Author Revision email it will contain details about how you can supply extra Supplementary Information. If you have already had your article published (either with the RSC or another publisher) you can email us at chemspider-at-rsc.org and we can add data for you, or alternatively you can register for a ChemSpider account and add your own data at your leisure.

If you have any questions please do leave a comment (or email us directly). We look forward to hearing from you!

In part one of this series we talked about searching by molecular formula ranges, and combining substructure searches with other types of searches. Part two covered how to search by supplementary information like bioactivity, appearance or melting point. This time we will demonstrate how you can use a search combining these new features to help answer a question you might encounter in the lab.

After performing a bromination reaction on phenol you isolate a product with a melting point of 90-93°C. If you start a search with just three pieces of information – your product is a derivative of phenol, it should contain at least one bromine, and your melting point is 90-93°C – you can construct a search on the Advanced Search page to help you get started in identifying your product.

Advanced Search results - 2,4,6-Tribromophenol


Since you can now combine substructure searches with other searches, you start by looking for a compound containing phenol (Search by SubStructure). To restrict your results to brominated phenols, you add a molecular formula range search for C6H(1-5)O1Br(1-5) (Search by Properties). Lastly, you search for compounds with a melting point of 90-93°C (Search by Supplementary Information).

Advanced Search results - 2,4,6-Tribromophenol


Your search turns up one result – 2,4,6-Tribromophenol. Although you need more information to conclusively confirm the identification, this gives you a lead in your analysis/elucidation.

Taking a look at the record, you may notice it has an interactive IR spectrum from NIST. If you check the Data Sources section, you will find that there are a lot of data sources for the record.

Advanced Search results - 2,4,6-Tribromophenol


To make it simpler to identify useful information you can browse the tabs to look for specific types of information: for instance the “Spectral Data” tab provides links to data in the MassBank and NMRShiftDB databases, which will hopefully help you confirm whether the product is 2,4,6-Tribromophenol.

This is just one example of how you can combine different searches on the Advanced Search page. Advanced searches are a great way to narrow down your results to help you find exactly what you are looking for, and there are many options we haven’t covered here, so have a look around and see what combinations might work for you.

Last time we told you about a number of improvements we have added to ChemSpider in the recent site updates, including combined substructure and properties search and searching by molecular formula ranges. As promised, this time we will cover how to search by properties like melting point or appearance.

Searching by Supplementary Information

Until now, although you could view properties when you were already on a record, there was no way to search by melting point, refractive index, appearance or bioactivity. This update has implemented a new search interface which allows you to search this data. You can now find compounds that are reported as being isolated from yeast, or compounds with a melting point of 32-35 °C.

There are 2 main parts to our Supplementary search interface.

Text Properties Search

Text properties include appearance, chemical class, drug status, or safety data. You can search any of these properties by using key words. When you start typing, a number of suggested search terms will appear, which can help you narrow down what search term to use.

You can also use wild cards by entering *, which can give you a little more flexibility in your search term – so if your unknown is a blue, crystalline material a search for “Blue crystal*” will turn up all records which mention the word “blue”, as well as any word beginning with “crystal” (such as crystals or crystalline).
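To make the wildcard behaviour concrete, here is a minimal sketch in Python of how such a term could be matched against a text property. It is not ChemSpider's implementation: the function names are hypothetical, and it assumes that every word in the search term must be present and that matching is case-insensitive.

```python
import re

def term_to_patterns(search_term):
    """Turn each word of a search term into a whole-word regular expression.

    A trailing * is treated as "any word beginning with this prefix".
    (Assumed semantics for illustration, not ChemSpider's own code.)
    """
    patterns = []
    for word in search_term.split():
        if word.endswith("*"):
            body = re.escape(word[:-1]) + r"\w*"
        else:
            body = re.escape(word)
        patterns.append(re.compile(r"\b" + body + r"\b", re.IGNORECASE))
    return patterns

def matches(search_term, text):
    """Return True if every word of the search term is found in the text."""
    return all(p.search(text) for p in term_to_patterns(search_term))
```

With these assumptions, `matches("Blue crystal*", "Dark blue crystalline solid")` is true, while a description such as "Blue powder" is rejected because no word begins with "crystal".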

Searching by Text Properties

 

Numeric Properties Search

Numeric properties include physical properties like experimental or predicted boiling point, optical rotation, or LogP. Since we draw data from a wide range of data sources, not all of this information is sent to us in the same format or with the units depicted the same way. In order to make it possible for you to search across all the properties in our database no matter how it was supplied to us, we have done a lot of background work on tidying up and standardizing this data.

All numeric properties can be searched using min/max or with a +/- range, and the search term can be entered in a variety of units – eg. Fahrenheit or Celsius for temperature, or psi or mmHg for pressure. Because the boiling point of a material depends on the pressure at which the measurement is made, and not all boiling points are measured at atmospheric pressure, we have created a feature that attempts to compensate for this. It uses the Clausius-Clapeyron equation to create estimated (standardised) boiling points for searching; please remember this when looking at your results.
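For the curious, the kind of correction involved can be sketched in a few lines of Python. This is only an illustration and not the code ChemSpider runs: the Clausius-Clapeyron equation needs an enthalpy of vaporisation, which is rarely known, so the sketch assumes Trouton's rule (an entropy of vaporisation of about 88 J/(mol·K) at the normal boiling point). The function names and constants are ours.

```python
import math

R = 8.314               # gas constant, J/(mol*K)
TROUTON = 88.0          # assumed entropy of vaporisation, J/(mol*K) (Trouton's rule)
MMHG_PER_PSI = 51.7149  # pressure unit conversion

def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0

def standardised_boiling_point_c(bp_c, pressure_mmhg):
    """Estimate the boiling point at 760 mmHg from one measured at another pressure.

    Clausius-Clapeyron: ln(P2/P1) = -(dH/R) * (1/T2 - 1/T1).
    Substituting dH = TROUTON * T2 (Trouton's rule, an assumption) gives
    T2 = T1 * (1 + R * ln(P2/P1) / TROUTON).
    """
    t1 = bp_c + 273.15
    t2 = t1 * (1.0 + R * math.log(760.0 / pressure_mmhg) / TROUTON)
    return t2 - 273.15
```

A value measured at atmospheric pressure comes back unchanged, while one measured under reduced pressure is shifted upwards. The estimate is only as good as the Trouton assumption, which is one more reason to treat standardised boiling points with care.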

Searching by Numeric Properties

 

As you can see, you are able to search on a wide variety of experimental properties, including boiling point, LogP, melting point, specific gravity and solubility. Please note that although many of the more common compounds have some properties, these properties are only available on a subset of our records – so if you do not get a result on a property search, it might be that we haven’t added that information yet.

Hopefully this gives you a good idea of the improvements we’ve made to ChemSpider search, and how these new features make it easier than ever to find what you are looking for. See the following post for a case study that showcases several of the new features covered in these posts.

We recently published an update to the ChemSpider website which, in addition to fixing a number of bugs, has added some useful new features. Three of these features are highlighted in this post – one which you might have noticed already, and two which you may not have discovered yet.

Auto-Complete

We have reinstated the auto-complete feature on the ChemSpider homepage. Now, when you begin typing in the search box, ChemSpider makes suggestions based on what you have typed. This makes it easier than ever to find what you are looking for – even if you aren’t quite sure how to spell it.

Autocomplete on the ChemSpider homepage

 

Combined Structure/Property Searches

People frequently ask if there is a way to search substructure and other properties like molecular weight or molecular formula at the same time. This update now makes it possible to perform this kind of combined search from our improved Advanced Search page.

E.g. If you are interested in finding compounds which are structurally similar to Valium, you can enter a benzodiazepinone substructure and restrict it to compounds with a molecular weight of 275-325.

Substructure and Molecular Weight search


This search then returns Valium along with other similar drugs like clonazepam, nitrazepam and lorazepam.
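The molecular weight half of this search is easy to check by hand. The short Python sketch below computes molecular weights from molecular formulae and tests them against the 275-325 window; the formulae and atomic masses are standard reference values that we have added for illustration (they are not taken from the search results), and the substructure part of the search is not reproduced.

```python
import re

# Standard atomic masses for the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}

def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as C16H13ClN2O."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total

def in_mw_range(formula, low=275.0, high=325.0):
    return low <= molecular_weight(formula) <= high

FORMULAE = {
    "diazepam (Valium)": "C16H13ClN2O",
    "clonazepam": "C15H10ClN3O3",
    "nitrazepam": "C15H11N3O3",
    "lorazepam": "C15H10Cl2N2O2",
}

for name, formula in FORMULAE.items():
    print(name, round(molecular_weight(formula), 2), in_mw_range(formula))
```

All four compounds fall inside the 275-325 window, which is consistent with them appearing together in the results.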

There are many other search options that can be combined with a substructure/similarity search so look at the Advanced Search page and have a play.

Molecular Formula Range Searching

You can also search a range of molecular formulae at once. To specify the range for a given element, put the range in parentheses after the element. E.g. C7H(10-12)O(0-1) would return all compounds containing exactly 7 carbons and between 10 and 12 hydrogens and which may or may not contain an oxygen. This type of search can be performed from the Simple Search page, as part of an Advanced Search or from the ChemSpider homepage.

Best of all, this can be combined with any of the other search parameters on the Advanced Search page including the substructure search. For example, if you wanted to find polychlorinated biphenyls containing at least three Chlorines you could perform a substructure search for a biphenyl with a molecular formula of C12H(0-7)Cl(3-10).
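If it helps to see the range syntax spelled out, here is a small Python sketch that parses a query such as C7H(10-12)O(0-1) and tests ordinary formulae against it. It is our own illustration rather than the parser used by ChemSpider, and it assumes that elements not named in the query must be absent from a matching formula.

```python
import re

# One element symbol followed by an optional "(low-high)" range or an exact count.
TOKEN = re.compile(r"([A-Z][a-z]?)(?:\((\d+)-(\d+)\)|(\d+))?")

def parse_formula_range(query):
    """Map each element to a (min, max) count, e.g. 'H(10-12)' -> (10, 12)."""
    if not re.fullmatch(f"(?:{TOKEN.pattern})+", query):
        raise ValueError(f"cannot parse formula: {query!r}")
    ranges = {}
    for element, low, high, exact in TOKEN.findall(query):
        if exact:
            ranges[element] = (int(exact), int(exact))
        elif low:
            ranges[element] = (int(low), int(high))
        else:
            ranges[element] = (1, 1)  # a bare symbol means exactly one atom
    return ranges

def formula_matches(query, formula):
    """True if every element count in `formula` lies inside the query ranges."""
    ranges = parse_formula_range(query)
    counts = {el: low for el, (low, _high) in parse_formula_range(formula).items()}
    if not set(counts) <= set(ranges):
        return False  # assumed: elements missing from the query are not allowed
    return all(low <= counts.get(el, 0) <= high for el, (low, high) in ranges.items())
```

For example, `formula_matches("C7H(10-12)O(0-1)", "C7H12O")` is true, while `formula_matches("C12H(0-7)Cl(3-10)", "C12H8Cl2")` is false because there are too few chlorines and too many hydrogens.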

Substructure and Molecular Weight search


In our next post, we will cover some new ways you can search by properties that are stored in our records such as melting point, density, etc.

I’ll be talking at the 6th Joint Sheffield Conference on Cheminformatics in July on Validation and Standardization of Molecular Structures in General and Sugars in Particular. This is a taster.

Sugars in Particular

One of the big problems with chemical structure algorithms is that they can’t, in general, cope with the ways that chemists are accustomed to drawing sugar molecules. They will lose the stereochemistry around the sugar ring, collapsing D-glucose, say, on to L-glucose, not to mention allose, altrose, gulose and all the others.

(ChemDraw, I should note, can interpret chair stereo properly, but it is very much an exception.)

The first step in determining correct stereochemistry for a chair atom is recognizing a chair hexagon. That is the subject of this post.


We have previously described initial steps to integrate ChemSpider with ELNs with IDBS, and to define the elnItemManifest metadata model.

We have now also taken further steps to integrate ChemSpider with Southampton University’s ELN, LabTrove, following on from an eScience tool that Stephen Wan from CSIRO had developed with the University of New South Wales to text mine LabTrove ELN blog posts to identify chemical names and link these to the relevant ChemSpider compounds. LabTrove is an open source blog-based system which can be used for recording and sharing experimental findings. Previously, if an image of the compound was to be added to an experiment blog post, it would be necessary either to upload it as an image (after drawing it in a separate drawing package) or to paste in a link to the image on another website (after a separate internet search in another browser window). We have now added the ability to click a button directly when adding or editing an experiment to launch a search of ChemSpider; when the required compound is found, an image of it can be added to the blog post simply by clicking on it, as can be seen in this demonstration video:

The editing controls in LabTrove are based on TinyMCE, a WYSIWYG editor which is used in a range of blogs, including WordPress. This means that this same ChemSpider plugin can also be used to insert compound images from ChemSpider from any other blog or website that uses a TinyMCE editor too.

If you have a LabTrove installation which you would like to add the ChemSpider plugin to then simply update your installation with the latest source code from LabTrove’s SourceForge website.

If you have a website or blog which uses a TinyMCE editor which you would like to add the ChemSpider plugin to then simply download this zip file, extract the folder in it and move the “chemspider” directory created to your tinymce plugins folder. Then, in your tinymce initialization process, add the plugin “chemspider” and the button “chemspider”.

ChemSpider SyntheticPages is one of those projects we support for which I have particular affection. For those who haven’t yet taken a look at it – please do so, it is a community resource made by chemists for chemists and is free to access – you don’t even need to register to look at the articles.

The original concept of SyntheticPages was brought to life by a group of academics who developed the original platform and format (and of course the members of the research community who embraced it and submitted articles). When ChemSpider became part of the RSC the concept of a community resource for reactions seemed like a complementary partner to the database of chemical compounds that we had established. With this in mind we were fortunate to collaborate with the hosts of the original SyntheticPages platform and, combining our resources and visions, we provided a new platform for submission. A short presentation about CSSP is online here.

CSSP today is quite well known within a small community of chemists but comments from the audiences that we expose the work to are very positive on the value of the platform and the way that we have developed it to date. Certainly the authors can get 10s of thousands of hits on their articles based on the published statistics! The “Leaderboards” are all available online for anyone to review.

We believe that everyone can see the value of building a directory of reliable, robust reactions that can continue to evolve through feedback and questions. But more than that, we see the potential benefits for:

  • Young scientists as a portfolio of their work that can enhance a resumé
  • Building systems that can contribute to Alternative Metrics – Already people are developing platforms, such as Impact Story. CSSP presents the perfect opportunity to build up such online contributions, which will become increasingly visible and important for a scientist in parallel, of course, with the present metrics for contribution and reputation.

We are presently working on a new system for “rewards and recognition” for contributors to our online databases and we will be rolling this out in more detail in the near future. It will be our way of recognizing the contributions of our users for their commitment to communicating science to the community using our platform as one of their vehicles to do so. As part of this activity we are also choosing to recognize present and future authors for their contribution of 5 or more SyntheticPages to CSSP. We will be contacting previous authors to ensure that they receive a brand spanking new, off the press, CSSP Lab coat to thank them for making their syntheses available!

Discussing the project to recognise and celebrate the top contributors to CSSP, Dr James Milne Managing Director RSC Publishing said the following:

“The ChemSpider SyntheticPages lab coats are a great idea, as they highlight a number of fantastic contributors, and also the role of CSSP within the broader publishing context. RSC Publishing strives to serve the needs of researchers worldwide, through publishing and disseminating high quality content, and this database of practical synthetic procedures certainly adds to this knowledge base.  I’d personally like to thank these contributors for supporting CSSP through their publications.”

If you haven’t already qualified for a CSSP lab coat by submitting 5 or more procedures, what’s stopping you? We look forward to reading your submissions...

 

Okay, I’ll admit it, that the title of this entry is not quite what Samuel Taylor Coleridge wrote in The Rime of the Ancient Mariner – but it does sum up this post pretty well.

Image taken from Wikipedia (http://en.wikipedia.org/wiki/File:Plughole.JPG#file)

Water is one of those chemicals that we tend to take for granted until it reminds us, usually because we have too much or too little of it. In one way or another, water seems to have been insistently nagging me this year. In the spring in the UK there was talk of water restrictions and droughts, while now many places are flooded, and only a few weeks ago in the US, Hurricane Sandy proved that water could be as formidable a force as the winds.

Don’t forget water is a chemical!

Water has a huge impact on the chemical sciences – after all it is one of the most common chemicals in the world. As such, water features in many of the activities of the RSC, to list just a few recent examples...

Well, what about this webinar?

When I was still a bench chemist I have to admit that I only thought of water as something used in extractions, or to be excluded from reactions (and occasionally in tackling the mountain of dirty glassware that I’d accumulated). But looking at the title of the latest Chemistry World Webinar – it looks like there are still many aspects of water that I have to learn about. The webinar is free, if the details below pique your interest; you only need to follow the link and sign up to watch the live Webinar. If you can’t watch at that time or are reading this post after the Webinar has taken place – don’t worry you can access the archive of all of the Chemistry World Webinars at: http://chemistryworld.gav.co.uk/webcasts/past-events.php.

The importance of water quality in the laboratory

4 December 2012, 13:00 – 14:00 (GMT)
Free webinar

Speaker: Dr Estelle Riché – Senior Scientist, Merck Millipore

How are water contaminants affecting your lab results?

Join us for our next live and interactive Chemistry World webinar to learn why and how water is purified to yield the various water qualities used in the laboratory.

By the end of this free one-hour knowledge-share, you will be able to:
• identify the different contaminants potentially present in laboratory water
• understand the potential impact of these contaminants on laboratory applications such as HPLC, LC-MS, etc.
• understand how various water purification technologies remove these contaminants from laboratory water
• make better choices for the water you use in your laboratory work
Click here to find out more and register for free
This webinar is brought to you by Chemistry World in partnership with Merck Millipore.

 

(This is not a post about carbohydrates, despite the title!)

Dodgy stereochemistry is a persistent problem.  Even if someone knows all of the stereocentres in a particular molecule, they might not necessarily draw them in a way that a machine, or even a person, can interpret.  There are rules about whether the pointy end or the blunt end of a bond indicates the stereocentre, and it’s surprising how often you see them done wrongly.

Today I’m going to talk about a particular IUPAC recommendation for drawing stereocentres that might at first glance seem surprising, the rule that you may only have one stereobond at a given stereocentre. If you have a wedged bond attached to an atom, you can’t have a hashed bond attached to the same atom. And vice versa.
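As a rough illustration of how a machine might check this rule as stated above, the Python sketch below scans a list of bonds and reports any atom that carries more than one stereobond. It assumes bonds are given as molfile-style tuples, using the V2000 stereo codes 1 (wedged) and 6 (hashed) with the narrow end of the bond at the first atom; the function name is hypothetical and this is not the validation code used in ChemSpider.

```python
from collections import Counter

# Molfile (V2000) stereo codes for single bonds: 1 = wedged (up), 6 = hashed (down).
WEDGE, HASH = 1, 6

def atoms_with_multiple_stereobonds(bonds):
    """Return atoms at the narrow end of more than one wedged or hashed bond.

    `bonds` is an iterable of (begin_atom, end_atom, stereo_code) tuples,
    with the narrow end of any stereobond at `begin_atom`.
    """
    counts = Counter(
        begin for begin, _end, stereo in bonds if stereo in (WEDGE, HASH)
    )
    return sorted(atom for atom, n in counts.items() if n > 1)

# Atom 1 has both a wedged and a hashed bond; atom 5 has a single wedge.
example = [(1, 2, WEDGE), (1, 3, HASH), (1, 4, 0), (5, 6, WEDGE)]
print(atoms_with_multiple_stereobonds(example))
```

Here atom 1 would be flagged and atom 5 would not.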

Why is this?

You might think that as you’re supplying more information, you’re making the diagram easier to interpret. However, you’re running directly counter to the normal principles of communication.  You’re being more informative than required, and this sets off alarm bells in the reader.  What are you trying to say?  If you ask a passerby the time and they say “Well, it’s half past six Greenwich Mean Time” you’re entitled to wonder why they’re quoting the timezone. Maybe they’re trying to be funny.

Paul Grice thought about this whole problem in the 1970s and came up with a set of four principles, summarized in maxims, that listeners (or readers) assume that speakers are following.  These are they:

  • Be Truthful. Do not say what you believe to be false. Do not say that for which you lack adequate evidence.

Let us hope that this one is implicit in any chemical drawing!

  • Make your contribution as informative as is required.  Do not make your contribution more informative than required.

If you have two methyl groups coming off an atom, do not make one wedgy and one hashy. You are adding no new information!

Do not mark carbons with the letter C unless your target audience is schoolchildren.

  • Be relevant:

On the grand scale: do not illustrate an article with any old molecule—make sure the molecule mentioned is actually relevant.

On the scale of the drawing itself, however: If you have three bonds about an ordinary p-block atom, for example, make sure they’re at 120 degrees to each other.  If they aren’t, for example if two of them are at right angles, the reader will infer that something odd is going on.

  • Be clear:

Make sure all your double bonds actually look like double bonds rather than a single bond parallel to another single bond.  I suspect a lot of the success of ChemDraw is down to the fact that it produces attractive, clear chemical drawings.

Do people ever flout the maxims on purpose?

Oh yes.  People often flout the maxims when trying to be funny, or in a political interview.  Similarly there are all kinds of Gricean violations in the chemical drawings you see in patents: bonds which do not quite extend all the way to atoms, R groups labelled as Y (particularly dangerous as Y is yttrium!) or Q or W (also tungsten) or some other unusual letter and so forth.  Exactly why this happens so much more often in patents than in journal articles is left as an exercise for the reader.

Do you know about Natural Product Updates?

Natural Product Updates (NPU) gives you the molecules involved in key developments in natural product chemistry. Thanks to our work in interpreting what chemists mean, not just what chemists draw, ChemSpider now links to NPU’s data for 13800 natural products since 2005, of which 7800 are brand new to ChemSpider.

Where ChemSpider has information on a compound in NPU you will see the image above, as in, for example, calothrixin B. This is a link to the NPU page on the RSC Publishing Platform.

Soon we’ll be integrating more of our graphical databases into ChemSpider. Watch this space!

For those of you who were interested by our previous blog post ‘Publish to ChemSpider’ ELN plugin generates elnItemManifest, and are at ACS Fall 2012 in Philadelphia, more details about this project will be described by Dr Simon Coles (Southampton University) in his oral presentation “Towards publishing semantic descriptions of Electronic Laboratory Notebook records” (paper ID: 17061 and final paper number: 90) in the “CINF: Division of Chemical Information division”, and “Herman Skolnik Award Symposium” session, on August 21, 2012 from 10:50 am to 11:05 am at Philadelphia Marriott Downtown, Room: 302/303.

If you are at ACS Fall 2012 don’t forget the other ChemSpider at ACS Fall 2012 in Philadelphia events.

You might not think so, but you’re very good at taking a two-dimensional drawing and converting it into a three-dimensional shape in your head. No, really, you are.

Fig. 1. Galactose in perspective.

Take the drawing of galactose in Fig. 1. Even if you’re not a chemist, you can tell which bits of the ring are at the front and at the back, which bonds point up and which bonds point down. If you actually are a chemist, you’ve been trained to apply this geometrical intuition to work out what’s going on at each of the five stereocentres.

However, if you ask the InChI algorithm about the stereochemistry of this molecule, it’ll say that there is no stereochemistry in there and you’re looking at a stereoless description of which atom is attached to which. Since we use the InChI algorithm to say whether two records describe the same molecule, this puts us in a quandary, and there are thousands of entries in ChemSpider that come from just such a drawing and hence lack stereochemistry.


As part of the Royal Society of Chemistry, the ChemSpider team likes to get involved with all of the other projects that are going on within the RSC, and we were really excited to be asked to provide our expertise to the SpectraSchool resource. This HE STEM funded program provides a range of resources to help in the understanding of the principles and practice of spectroscopy and spectroscopic methods.
SpectraSchool brings together Spectroscopy resources, an Introduction to Spectroscopy*, Interactive Spectra and the Spectroscopy in a Suitcase scheme which affords school children the chance to use modern spectroscopic equipment in their classroom.

The SpectraSchool resource was originally developed with the University of Leicester who collected and assigned many of the spectra that are displayed within the site. Now that SpectraSchool is part of Learn Chemistry we have helped to integrate new features, including a new HTML 5 based spectrum viewer that provides interactive display of spectra. The fact that this is based on HTML 5 means that the spectrum can be viewed on just about any device that has a modern browser (eg, computers, tablets, phones or even touch screen TVs).

A student visiting the site has the ability to zoom in on peaks and, by selecting either a peak or a particular part of the structure, to see which features of a chemical structure give rise to a particular peak in a spectrum (see the highlighting of the methyl group in the structure of caffeine below and the corresponding peak in the adjacent 1H NMR spectrum).

SpectraSchool and Chemistry in the Olympics are great examples of the RSC’s new microsites which bring together lots of great resources and tools in a fresh and exciting interface.

Take a look at SpectraSchool and LearnChemistry today. We welcome feedback through the in-page feedback links, or you can connect with us and other chemistry educators in the Talk Chemistry forums. Why not start exploring this great (free) educational resource today?

 

 

* The Introduction to Spectroscopy was developed in collaboration with the University of Cardiff

Once again the RSC will be attending the American Chemical Society’s Fall meeting which will be held in Philadelphia, Pennsylvania, August 19-23, 2012, where the RSC stand will be located at booth 701.

Several members of the ChemSpider team will be attending the conference; both to give presentations and also to chat/answer questions on the booth. If you are attending the conference please drop by and say Hello and ask any questions that you have (you might even be able to get a free coffee – available on a first-come first served basis from 11 am on both Monday and Tuesday). We will also be running an exciting ChemSpider competition to coincide with the conference. You can get details from our Booth #701, or by checking out the ChemSpider blog.

There will be two key ChemSpider events in Philadelphia:
A special On-Stand demo – Monday 20 August, 11 am, Booth 701
“ChemSpider and You: A workshop exploring how ChemSpider can help you find chemical information” – A 2 h workshop for both newcomers to ChemSpider and experienced searchers alike. 10am-12pm Tuesday 21 August, Exhibit Halls A-B, Workshop Room 2 (You can register for the workshop via the conference website – we will try and accommodate anyone who just turns up on the day.)

In addition members of the ChemSpider team are giving a number of talks, including some early glimpses of exciting new tools that we are working on. The presentations are listed below – for more details including the abstracts for each of the talks see the Technical program.

‘Mining public domain data as a basis for drug repurposing’, Philadelphia Marriott Downtown, Room 302/303, Sunday 19th August, 4.15PM – 4.40PM

‘Putting chemistry into the hands of students – chemistry made mobile using resources from the Royal Society of Chemistry’, Pennsylvania Convention Center, Room 109B, Sunday 19th August, 10.50AM – 11.10AM

‘Feeding and consuming data to support Open Notebook Science via the ChemSpider platform’, Philadelphia Marriott Downtown, Conference Room 307, Monday 20th August, 2.05PM – 2.30PM

‘Approaches for extraction and “digital chromatography” of chemical data – a perspective from the RSC’, Hilton Garden Inn Philadelphia, Salon D, Monday 20th August, 2.30PM – 2.55PM

‘Delivering an online service for validating and standardizing chemical structure files using the ChemSpider platform’, Philadelphia Marriott Downtown, Franklin Hall 6, Tuesday 21st August, 9.15AM – 9.35AM

‘ChemSpider compound database as one of the pillars of a semantic web for chemistry’, Philadelphia Marriott Downtown, Grand Ballroom Salon H, Tuesday 21st August, 4.55PM – 5.10PM

‘How can the International Chemical Identifier (InChI) be extended to non-trivial chemicals?’, Philadelphia Marriott Downtown, Franklin Hall 6, Thursday 23rd August, 9:35AM – 9.55AM

‘Serving up and consuming community content for chemists using wikis’, Philadelphia Marriott Downtown, Franklin Hall 6, Thursday 23rd August, 9.55AM – 10.15AM

 

We look forward to seeing you at the conference!

ChemSpider has become one of the world’s primary online resources for finding data, information, links, images, spectra... and on and on... about “chemicals”. Building a database of over 28 million chemicals that grows in some way in content, functionality and richness on a daily basis is, to say the least, a lot of work. But our cheminformatics team here at the RSC is not scared of work. We like it! So when we decided that it was time to enhance our efforts around the management of chemical reactions, to move from ChemSpider SyntheticPages to a database of chemical reactions containing 10s if not 100s of thousands of reactions, the question was how. What software platform would we use? Where would we source reactions? What functionality would we need to roll out as an early display of capability to entice users to test it out, give feedback and, ultimately, get involved? We made those decisions and we will be showing off the results of our project “ChemSpider Reactions Database” (yes, we’re very creative with our project titles aren’t we!!!) at the Fall ACS in Philadelphia.

If you want to learn what we are up to with regard to chemical reactions, come and visit us at the ACS booth... we’ll show an early view of over a quarter of a million reactions in an online, free to access database. We’ll chat about some of our future plans and hopefully engage you in a discussion about whether or not you would be willing to contribute reactions to the database. Wouldn’t it be good if we could provide synthetic chemists with a platform for accessing and managing reactions, as we have done for chemicals? Of course, seamlessly integrated and platform independent... served up by the latest web technologies and mobile-enabled. What the future could look like... exciting times!