Archive for the InChI Category

For some time now it has been possible to access relevant SureChem patent information from the Patents Infobox on a ChemSpider compound page. ChemSpider compounds are also linked to and from the relevant RSC articles. Together these links have allowed us to form a new partnership between RSC Publishing and SureChem, one that relies on ChemSpider taking the pivotal role of linking internet chemistry together.

In the RSC article landing pages there is a “Compounds” tab which shows the key compounds that the article is about – as shown in this example. For each compound there is now a link to view the SureChem patent information associated with that compound as below:

The RSC Publishing platform article landing page showing SureChem patent information

SureChem and SureChem’s new free offering, SureChemOpen, offer a suite of patent chemistry data solutions, for example allowing their patents to be found from a structure or substructure search. Now, for each compound returned from such a search it is possible to view any linked ChemSpider compound pages and the number of associated RSC publications (and follow a link to view these articles).

This linking between SureChem and the RSC publication platform relies on ChemSpider (and the standard InChI chemical identifier) providing a bridging link to both, which ensures that the system is accessible, standards-based and scalable, making it easy for future partners to join.
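The bridging idea can be sketched in a few lines of Python. This is only an illustration of joining two data sets through a shared InChI, not a description of how ChemSpider, SureChem or the RSC platform are actually implemented. The function name, the dictionary layout and the placeholder patent and article labels are all assumptions made for the example; the methane InChI and ChemSpider ID 291 are taken from the OpenMolecules listing further down this page.

```python
def bridge_via_inchi(patents_by_inchi, csid_by_inchi, articles_by_csid):
    """Join patent and article data through the compound record sharing the InChI.

    All three arguments are hypothetical in-memory lookups, used only to
    illustrate the join; the real services are not structured this way.
    """
    links = {}
    for inchi, patents in patents_by_inchi.items():
        csid = csid_by_inchi.get(inchi)
        if csid is None:
            continue  # no compound record for this structure, so nothing to bridge
        links[inchi] = {
            "chemspider_id": csid,
            "patents": list(patents),
            "articles": list(articles_by_csid.get(csid, [])),
        }
    return links


METHANE = "InChI=1/CH4/h1H4"
WATER = "InChI=1/H2O/h1H2"

links = bridge_via_inchi(
    {METHANE: ["PATENT-A"], WATER: ["PATENT-B"]},  # placeholder patent labels
    {METHANE: 291},                                # only methane has a compound record here
    {291: ["ARTICLE-1", "ARTICLE-2"]},             # placeholder article labels
)
print(links)
```

Because the only shared key is the InChI, a new partner in this sketch would just need to supply its own lookup keyed by InChI or by compound ID.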

We’ve rejigged our data to make searching more reliable.

What have we done?

We’ve regenerated all of the InChIs in the database with version 1.03 of the InChI code.

What does that mean?

The InChI (International Chemical Identifier) is a short piece of text that describes the structure of a molecule. Each one is generated by a free and open-source computer program, so the same molecule should always get the same InChI and there shouldn’t be conflicting InChIs for a single molecule. You can’t really write them by hand, because they look like this:

InChI=1S/C10H22ClN2O5PS/c1-3-10(9-18-20(2,15)16)12-19(14)13(7-5-11)6-4-8-17-19/h10,12H,3-9H2,1-2H3

ChemSpider is built on InChIs. If two molecules have the same InChI, then they’re the same record in ChemSpider, and if you can’t InChIfy it, you can’t put it in ChemSpider. That’s why we can’t do, for example, polymers yet.
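As a rough illustration of both points, here is a minimal Python sketch. It splits an InChI string into its slash-separated parts and then groups depositions under their InChI, so that two entries with the same InChI end up in one record. The function names and the deposition labels are invented for the example, and the splitting is purely textual: it does not validate the chemistry and assumes a formula layer is present. It says nothing about how ChemSpider stores its records internally.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula and prefixed layers (text only)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *layers = inchi[len(prefix):].split("/")
    # Each remaining layer starts with a one-letter prefix such as 'c' or 'h'.
    # A repeated prefix letter would overwrite the earlier layer in this sketch.
    return {
        "version": version,
        "formula": formula,
        "layers": {layer[0]: layer[1:] for layer in layers if layer},
    }


def merge_by_inchi(depositions):
    """Group (inchi, label) pairs so that identical InChIs share one record."""
    records = {}
    for inchi, label in depositions:
        records.setdefault(inchi, []).append(label)
    return records


example = ("InChI=1S/C10H22ClN2O5PS/c1-3-10(9-18-20(2,15)16)12-19(14)"
           "13(7-5-11)6-4-8-17-19/h10,12H,3-9H2,1-2H3")
parts = split_inchi(example)

records = merge_by_inchi([
    (example, "deposition from source A"),  # hypothetical labels
    (example, "deposition from source B"),
    ("InChI=1/CH4/h1H4", "methane"),
])
print(parts["formula"], len(records))
```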

We’re proud to be founder members of the InChI Trust, which supports this critical element in the sharing of chemical compound information.

InChI Trust Member 2011

What does all this mean for ChemSpider?

Version 1.03 contains some bug fixes, thanks to an active community supporting InChI that looks out for these things. As a result, a very small number of the InChIs themselves, only a few dozen out of the whole database, have changed.

  • P+–O and P+–S bonds are now treated slightly differently. This means that it will be easier to find the exact molecule you’re looking for, regardless of how it’s been drawn. (In principle this will also apply to analogous bonds containing arsenic, selenium, tellurium and antimony, but I can’t see any examples of this in the database.)
  • There was a small bug where the InChI generated for a molecule with an azide group in it sometimes varied according to the input drawing. That no longer happens.

This regeneration has also allowed us to catch and clean up some errors in the data.

What happens next?

Version 1.04 of the InChI code will be released soon. With our new framework for processing large amounts of data we’ll be able to update our InChIs much more quickly. The main changes in 1.04 that affect the InChI are to how it handles radical atoms in aromatic rings, nobelium, lawrencium and rutherfordium, so we anticipate that there shouldn’t be very many changed InChIs!

It was a busy week at the ACS meeting in Washington. I gave three presentations; the titles, abstracts and links to Slideshare are given below:

Oops and Downs of Resolving InChIs For the Chemistry Community (Link to Slideshare)

The InChI resolver was rolled out to the community in March 2009 with the purpose of providing a centralized resource for chemists to resolve InChIs (International Chemical Identifiers). This presentation will provide an overview of the development of the underlying technologies associated with the InChI resolver, and how the resolver is being used, integrated and enhanced to provide additional value to the chemistry community. We will discuss present limitations to application of the resolver for providing access to databases and chemistry information distributed across the internet and define our vision for enhancing interconnectivity across Open databases using the InChI resolver as the glue.

ChemSpider: Building a knowledge-based community for chemists using social and data networking technologies (Link to Slideshare)

In less than 2 years ChemSpider has become one of the primary online resources for chemists providing access to an unsurpassed aggregate of free-access knowledge and data. ChemSpider was developed with the intention of providing a structure centric community for chemists that would be enhanced by data depositions, curations and annotations by the community. The system presently hosts over 21.5 million chemical compounds from over 200 data sources. Working with a network of advisors, collaborators and data providers ChemSpider has created a unique resource of integrated information for chemists. These efforts have enabled us to support the curation of the Wikipedia chemistry pages, the production of a community supported Open Access chemistry journal and provision of web services integrated to spectrometer systems distributed around the world. This talk will provide an overview of how ChemSpider utilized social and data networking to create a community for chemistry.

Building an integrated system for chemistry markup and online publishing integrated to online chemistry resources (Link to Slideshare)

The extraction of chemical entities from documents such as patents and publications has been pursued for a number of years. We wish to report on ChemMantis, an integrated system for chemistry-based entity extraction and document mark-up enabling access to the rich resource of online chemistry known as ChemSpider. We will discuss the development of the platform from its inception as a series of dictionaries to the integration of an entity extraction algorithm and its expansion to a public deposition and publishing platform for chemistry. Chemistry articles can now be deposited, marked up and exposed to the public within a few minutes in many cases, making it an ideal platform for communicating research and providing integrated access to data sources including PubChem, ChEBI, Wikipedia and Entrez.

Egon Willighagen has been growing the Linked Open Chemistry Data with his work on rdf.openmolecules.net. He has now linked it to the InChI Resolver to enhance the integration, as shown below. We’re looking forward to hearing from users benefiting from this!

OpenMolecules RDF

About http://rdf.openmolecules.net/?InChI=1/CH4/h1H4
Identifier info:inchi/InChI=1/CH4/h1H4
InChI InChI=1/CH4/h1H4
Source Chemical blogspace
Source ChEBI
ChEBI ID CHEBI:16183
owl:sameAs http://bio2rdf.org/chebi:16183
Source Connotea
Tag NewTag
Tag alkanes
Tag Gas
Tag InChI
Source DBPedia
owl:sameAs http://dbpedia.org/resource/Methane
Source NMRShiftDB
owl:sameAs http://pele.farmbio.uu.se/nmrshiftdb/?moleculeId=20029286
NMRShiftDB mol ID 20029286
Source ChemSpider
ChemSpider ID 291
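The listing above shows two identifier patterns built directly from the InChI: the page address and the info:inchi identifier. A minimal Python sketch of both follows, assuming the patterns are exactly as they appear in the listing. The function names are made up for the example, no URL encoding is applied because the example address does not show any, and whether the service accepts other forms is not something this page tells us.

```python
def openmolecules_url(inchi):
    """Build the rdf.openmolecules.net address in the form shown in the listing."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    # The listing shows the raw InChI placed directly after the '?'.
    return "http://rdf.openmolecules.net/?" + inchi


def info_inchi_identifier(inchi):
    """Build the info:inchi identifier in the form shown in the listing."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    return "info:inchi/" + inchi


methane = "InChI=1/CH4/h1H4"
url = openmolecules_url(methane)
identifier = info_inchi_identifier(methane)
print(url)
print(identifier)
```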

In what seems like an eon since I first blogged about the need for an InChI Resolver, ChemSpider has continued its efforts to provide valuable resources for chemists while benefiting from the advantages of InChI and working through many associated challenges. I will give a presentation tomorrow at the ACS Meeting here in Salt Lake City (and a gorgeous place it is!) in a session dedicated specifically to the InChI identifier and its increasing penetration into the world of Cheminformatics, publishing and internet Chemistry. The talk will be posted to SlideShare here as usual.

 

Following the declaration of the need for an InChI Resolver I discussed the project with a number of groups (five in total) and wrote up project descriptions and hypothetical timelines to deliver a resolver. We finally announced a joint project with the Royal Society of Chemistry on December 1st 2008 and started work on producing a beta release version of the resolver by ACS Spring 2009, which would be TODAY. The alpha release went live about 4 weeks ago and logins were provided to a number of interested parties. The people who tested the system sent us a couple of bug reports and small requests for enhancement, and all of those changes have been implemented just in time to release the Resolver for general public consumption here at the ACS.

 

We already have a list of things we want to deliver to enhance the system but will be waiting for feedback from the community regarding the value and workflows associated with this system as it functions presently in Beta release. An overview of the system is available here in PowerPoint and shown below. Go try it out at inchis.chemspider.com. It is in BETA release, so please send us any feedback at info@chemspider.com. Thanks!