Hi, ChemSpiderman (Antony Williams) here. I am about to start traveling and will be giving a presentation next week in the UK. I have been working on validating online public domain chemistry databases. In doing this work I realized that it would be useful to hear from the community which databases you feel can be trusted, and to what level. Please visit the online survey and give me your feedback; it would be very useful for my presentation. If you could do this in the next 48 hours I would be very grateful. Thanks!

Click here to take my survey!!!

A recurring question which has come through our customer usability survey is “Can you copy and paste structures drawn in ChemDraw into ChemSpider?” The answer is yes, you can. Simply draw the structure as normal and from the Edit menu choose Select All and Copy. In ChemSpider choose Structure Search from the search menu and click on the structure image to activate one of the Java-based structure drawing applets.

  

Structure image (3)

From the options given choose Draw/Edit and paste your structure into the drawing window, followed by Accept. You are now in a position to search your structure.

 

Structure image (2)

Alternatively, you can save your structure drawing in ChemDraw as a .mol file. Then, in the structure drawing applet of ChemSpider, select the Load option, navigate to the location of your saved .mol file, and open it to load the structure.

 

Structure image (1)

The ChemSpider web services are intended to let you use the functionality of ChemSpider, and query the data in it, from your own website, program or script. There are many different web services as described here, and also many different ways to use them.

One example of how to use them was sent to us by Jimmy Moore from the University of Manchester. He includes a call to the SimpleSearch operation of the Search web service in a Perl script. This searches the whole of ChemSpider by an input value, which can be the molecule’s name, SMILES string, InChI or InChIKey, and returns the ChemSpider ID:

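use strict;
use warnings;
# Send the SOAPAction header as "namespace + method name", as this .asmx service expects.
use SOAP::Lite on_action => sub { sprintf '"%s%s"', @_ };

# The query can be a name, SMILES string, InChI or InChIKey.
my $unknown = shift or die "Usage: $0 <name|SMILES|InChI|InChIKey>\n";
my $token = ' '; # Your token value should be input here. I'm not going to give mine away!

my $service = SOAP::Lite->uri('http://www.chemspider.com/')
                        ->proxy('http://www.chemspider.com/Search.asmx');
my $output = $service->call(
    SOAP::Data->name('SimpleSearch')->attr({ xmlns => 'http://www.chemspider.com/' }),
    SOAP::Data->name('query')->value($unknown)->type(''),
    SOAP::Data->name('token')->value($token)->type(''),
);
die 'SOAP fault: ', $output->faultstring, "\n" if $output->fault;

# Print each matching ChemSpider ID on its own line.
my @result = $output->valueof('//SimpleSearchResult/int');
print "$_\n" for @result;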

For further background, and also an example of a Perl script which uses the SMILESToInChI operation of the InChI web service, see his blog page.

Please note that to use this (and some of the other) web services you need to obtain a token by registering with ChemSpider (if you have not already), then logging into ChemSpider and viewing your Profile page. The Security Token shown needs to be copied into the Perl script itself in Jimmy’s example.

Also note that you will need to install the SOAP::Lite modules for Perl to run this script, if you don’t already have them.
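If you would rather not paste the token into the script itself, one option is to read it from an environment variable. The following is a minimal sketch rather than anything from Jimmy’s script; the variable name CHEMSPIDER_TOKEN and the helper load_token are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical variable name, chosen for this example only.
my $TOKEN_ENV_VAR = 'CHEMSPIDER_TOKEN';

# Return the Security Token from the environment, trimmed of stray
# whitespace, or die with a hint if it has not been set.
sub load_token {
    my $token = $ENV{$TOKEN_ENV_VAR};
    die "Set $TOKEN_ENV_VAR to the Security Token shown on your ChemSpider Profile page\n"
        unless defined $token && $token =~ /\S/;
    $token =~ s/^\s+|\s+$//g;
    return $token;
}
```

In the script above you could then write my $token = load_token(); in place of the hard-coded value.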

If you have an example of how you have used the ChemSpider web services then please reply to this ChemSpider forum post. More examples will inspire more new ideas, and also make it easier for other people trying to do similar things.

Are you looking for bioactivity information for small molecules? ChemSpider now provides a direct link to the ChEMBL database from the European Bioinformatics Institute (EBI).

For example, take a look at the record for Fluconazole, an anti-fungal drug, in ChemSpider. If you go to the Associated Data Sources box and select Biological Data you will find the following links:

ChEMBL1

Clicking on the External ID link associated with ChEMBL will take you to the ChEMBL record.

ChEMBL2

The EBI produce both ChEBI (Chemical Entities of Biological Interest) and ChEMBL, a database of approximately 500,000 bioactive compounds. The bioactivities listed are abstracted from the scientific literature and are linked directly to the article.

You could also start your search in ChEMBL and then link back to ChemSpider to find additional information using the Std. InChIKey displayed in the ChEMBL record.
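Since the Std. InChIKey is what ties the two records together, it can be worth checking that a string really has the shape of a standard InChIKey before searching with it. A minimal sketch follows. The helper name is_standard_inchikey is hypothetical, and the check only looks at the format (14 letters, then 8 letters plus the flag letter S for standard and the version letter A, then one final letter), not at whether the key exists in either database.

```perl
use strict;
use warnings;

# Return 1 if the string has the shape of a standard InChIKey, else 0.
# Layout: 14 letters, hyphen, 8 letters + 'S' (standard) + 'A' (version),
# hyphen, 1 letter. Non-standard keys carry 'N' instead of 'S' and are rejected.
sub is_standard_inchikey {
    my ($key) = @_;
    return 0 unless defined $key;
    return $key =~ /\A[A-Z]{14}-[A-Z]{8}SA-[A-Z]\z/ ? 1 : 0;
}

# Example with the standard InChIKey for water:
#   is_standard_inchikey('XLYOFNOQVPJJNP-UHFFFAOYSA-N')
```

A check like this will catch keys that were truncated or lower-cased when copied, which are easy mistakes to make when moving between the two sites.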

We like to make things easy for our users.

We hope you’ve had an opportunity to take a look at the revamped website. If you would like to share your thoughts on usability and design or site performance please take a moment to click on the “Give Feedback” button on the website. This will really help us to make the ChemSpider user experience even better.

 Kampyle feedback

ALPSP Publishing Innovation award 2010

Some of the team were present at the ALPSP Conference last Thursday – as the envelope was opened to announce ChemSpider as the winner of the ALPSP Publishing Innovation award for 2010! The judging panel commented that “[ChemSpider] has quickly become a highly valued and comprehensive community resource and has immense potential for future development”.

We’re especially proud as we were up against the other excellent shortlisted finalists of DataSalon’s Mastervision (which was highly commended, and we use it ourselves), the Semantic Biochemical Journal from Portland Press and the University of Manchester, and the AIP’s UniPHY social networking site.

We also managed to recreate the prize giving with Antony & Valery this morning – difficult to recreate the atmosphere of a conference dinner at 9am on an autumn Monday morning though…

Pics after the jump


My presentation today at the Wolfram Data Summit in Washington DC gave me a chance to rant about the quality of data online and ask the question: who really cares? Many of the database hosts don’t appear to care (most don’t respond to emails when I find errors, and very few give any way to annotate an error, for example). The talk seemed to be well received and shocked a few people.

For all you Tweeters out there following Science Online, the Twitter account for Aileen and Dave at the RSC is ChemSpider.

Not to be confused with that of Antony Williams, who is still very much ChemSpiderman.

Nature, Mendeley, and the British Library are excited to present Science Online London 2010. How is the web changing the way we conduct, communicate, share, and evaluate research? How can we employ these trends for the greater good? This September, a brilliant group of scientists, bloggers, web entrepreneurs, and publishers will be meeting for two days to address these very questions.

ChemSpider will be there to hear and record what is being said. If you are going to be there look out for David Sharpe and Aileen Day.

We will of course report back on topics that pertain to ChemSpider and the greater world of chemistry publishing.

Recently I co-authored a publication with Harry Pence for the Journal of Chemical Education, and today came the news that it has been published online. Please follow the instructions below if you want to be one of the first 50 people to obtain a copy.

“Your article, ChemSpider: An Online Chemical Information Resource, is now available on the Journal of Chemical Education website.  To view your article, please click on the ACS Articles on Request link below:

http://pubs.acs.org/articlesonrequest/AOR-gUuatr3ABq9RiePFMyaK

As part of the ACS Articles on Request e-prints service,  ACS authors may choose to e-mail or post this link on their website to distribute up to 50 free e-prints of their final published article to interested colleagues during the first 12 months of publication. After that 12 month period any author’s article may be accessed without restriction via the same author-directed link that appears above.  The link directs readers to the Full Text version of the article on the ACS Publications website.

Please note: To access the Articles on Request link, please log in to the Publications website using your ACS ID.  If you do not have an ACS ID, you will need to Register for one for free by clicking on “Register” near the top right corner of the website.”

The first circular for the 16th RSC-SCI Medicinal Chemistry Symposium, 11-14 September 2011, Churchill College, Cambridge, UK is now available here.

The Scientific Program includes:

Strategies to success – H-PGDS inhibitors for the treatment of inflammatory disorders, Sukanthini Thurauratnam, Sanofi-Aventis

Discovery on next generation glucokinase activators, Mike Waring, AstraZeneca

Inhalation by design, Paul Glosson, Pfizer

Bromodomains a new class of epigenetic targets for small molecule drug discovery, Jason Witherington, GSK

GPCR Structure based drug design using stabilised receptors (StaRs), Miles Congreve, Heptares

GS-9350: a novel pharmacoenhancer, Lianhong Xu, Gilead Sciences

Earlier this month I reported on the integration of Infotherm with ChemSpider. At that time non-RSC members would have had to pay for the data on Infotherm, even though a search would provide the links and you could click through to the Infotherm data pages. Some good news from Fiz-Chemie though…they are waiving the fee for data on pure compounds accessed from ChemSpider and, as a result, giving access to over 200,000 tables of data. This is a great contribution to the community of ChemSpider users. Thanks Fiz-Chemie!

 

infotherm

Last night I gave a presentation at the BAGIM meeting in Boston. The abstract is below, together with the embedded presentation from SlideShare.

ChemSpider – Is This The Future of Linked Chemistry on the Internet?
ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are now hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc., and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of almost 25 million chemical substances, grows daily, and is integrated with over 400 sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for a linked web for chemistry and to provide access to a set of online tools and services to support access to these data.

When you’re viewing a compound page in ChemSpider e.g. hydrogen peroxide there are several ways to find more detailed RSC information (articles and books) about that compound:

  1. In the “Articles” infobox, the results under the “Links & References” tab are links to various journals that have either been deposited or added by ChemSpider users. When the RSC enhances one of its articles, the most important compounds in the article are submitted to ChemSpider for deposition (see here for more details), and when this happens, the article details are also deposited so that a link will appear in this box. As such, there usually won’t be thousands of links in this box, but those that are there will, for example, pick up references to compounds which maybe aren’t named explicitly in the article (but are, for example, drawn out in a figure) and as such couldn’t be found by a simple text search.
  2. The results under the “RSC Journals” tab in the “Articles” infobox are the results of passing a search into the RSC publishing platform to retrieve all of the journal articles that contain any approved synonym for the compound (in the “Identifiers” infobox under the “Names and Synonyms” tab). Since these lookups first appeared in ChemSpider 6 months ago (see here for more details) this platform has progressed from being a beta version to being the fully-fledged publishing platform for searching and delivering RSC journals, books and databases. To investigate the functionality of this platform further and refine your search results list, click on the “Click here to explore results” link above the list of results; this will allow you to sort the results by date, or apply filters (e.g. to restrict the results set by author, date range, journal etc.). This is useful since, for common chemicals, the list of results returned can be long.
  3. The “RSC Books” tab under “Articles” performs a similar search on occurrences of the approved synonyms but on RSC books rather than RSC journal articles.
  4. Likewise, the “RSC Databases” infobox shows search results of the same approved synonyms but this time the results are from the various RSC abstract databases named in its various tabs. This means that they contain references to the compounds in non-RSC articles.

Links to RSC Articles

We deposit a lot of data onto ChemSpider in a month and the database is growing daily. As an example of the ongoing depositions, take a look at what was deposited in a one-month timeframe from July to August. This is simply what has been published by me…not all depositions. It’s a pretty good indicator of ongoing efforts to enhance the quantity of content on the site.

published_in_a_month

The Chemicalize website from ChemAxon is gaining interest (1,2) and, likely, LOTS of users! Chemicalize is both a website for recognizing chemical names and converting to chemical structures as well as an integration path to their property prediction algorithms. Some basic testing of chemicalize shows that their chemical name detection and conversion to structures using either name to structure conversion (algorithmically) or name lookup (via dictionaries) is very good. Not perfect, but very good. Perfect chemical name lookups are impossible as the associated dictionaries grow every time a new natural product is found for example, or a new drug is released.

With ChemSpider we are more interested in the linking to the predicted property pages. For example, if you want to see the predicted properties for Penicillin visit here.

penicillin

Now, with ChemSpider ChemAxon were kind enough (and I mean applaud them, acknowledge them and send flowers!!) to give us a way to pass through a structure and initiate the predictions on the Chemicalize site. This is tremendous news for you all! Under the properties Infobox we provide a list of properties from ACD/Labs, a list of properties from EPISuite, a list of experimental properties, sourced from various places and now, the link to Predict Properties using Chemicalize.

properties

Clicking the tab for Predict Properties from ChemAxon displays the link through to Chemicalize as shown below.

chemicalize

So, now we have sets of prediction capabilities linked up to ChemSpider. The ACD/Labs predictions are pre-calculated, and every time there is an update to the algorithms we would, in theory, have to recalculate across the database and publish. This would take weeks across the almost 25 million structures, so it is not a frequent task. It is the same issue with EPISuite. With the Chemicalize integration, however, the predictions are live, run on the structure at the time it is passed to the algorithms. This has the advantage that the prediction algorithms can be incrementally improved and you will always get the latest and greatest results. However, having the predicted values from ACD/Labs available allows flexible searching as shown below. We are grateful to ChemAxon for allowing us to integrate Chemicalize. It gives LIVE access to the latest and greatest predictions as well as access to a whole series of new predictions for which we don’t have data on the database…especially pKa values, topology analysis, geometry and others. Thanks ChemAxon!

acdlabs

It’s a while since we first started the ChemSpider Forum and things have been a little quiet there recently as we made the transition to RSC ChemSpider, so we would like to invite you to re-visit the Forum and share your comments, suggestions and ideas with us.

On the newly revamped ChemSpider website you will find a link under the Help menu that will take you directly to the Forum.

Share some of your user experiences: How do you use ChemSpider? What problems did it solve for you? Did you look for something and didn’t find it?

The Forum will also be a place to find documentation such as Quick Guides on How to do Searching on ChemSpider and How to Add structures, spectra or reactions to ChemSpider so you can help to grow the community for chemistry.

We look forward to hearing from you.

Laying out intrinsically three-dimensional molecular structures in a readable way on a two-dimensional page is a hard problem for human beings, let alone for algorithms, which is why ChemSpider stores a 2D layout alongside the InChI, which only describes which atom is connected to which other atom.

This is a really valuable resource for enhancing our RSC journal articles, so we’ve been experimenting with adding galleries to compounds, with examples here and here. Is this what PDFs you download from the website should look like? Would a digest gallery of the latest articles published be more useful? Do let us know.

As this is my first posting on the ChemSpider blog I should introduce myself. I’m Colin Batchelor and I’m in the Informatics team at the RSC. Some of my work is on ChemSpider, but I also work on informatics for RSC Publishing, and I’m a member of the InChI subcommittee.

PubChem is a very large source of compound structures and data, but the quality and reliability of these can be variable. However, within it, some sets of compounds and substances can be trusted more than most because they’ve been deposited by reliable data sources, for example those deposited by the Nature Publishing Group that correspond to compounds in Nature Chemistry, Nature Communications and Nature Chemical Biology articles.

We have developed an automated method to search PubChem for substances deposited by the Nature Publishing Group, to extract their structures and properties in SDF format and then import them into ChemSpider. The result is a newly imported set of 5525 molecules in ChemSpider. These compounds have been deposited in PubChem since 2005 and originate from over 400 articles. All imported compounds link back to the original article – see below.

Example compound from PubChem

The process is automated and can be scheduled to scrape PubChem for newly deposited compounds, and stream these into ChemSpider so this subset will be updated regularly.
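To give a flavour of the extraction step, here is a rough sketch of reading the data fields out of SDF text. It is not the import process described above; the function name parse_sdf_fields is made up for this post, and the sketch assumes records where each molfile block ends with an "M  END" line, data fields start with a "> <NAME>" header and end at a blank line, and each record ends with a line of four dollar signs.

```perl
use strict;
use warnings;

# Split SDF text into records and return one hash reference per record,
# mapping each data field name to its (possibly multi-line) value.
sub parse_sdf_fields {
    my ($sdf_text) = @_;
    my (@records, %fields, $current);
    my ($in_data, $has_content) = (0, 0);

    for my $line (split /\r?\n/, $sdf_text) {
        if ($line =~ /^\$\$\$\$/) {                 # end of record
            push @records, {%fields} if $has_content;
            %fields = ();
            undef $current;
            ($in_data, $has_content) = (0, 0);
            next;
        }
        $has_content = 1 if $line =~ /\S/;
        if (!$in_data) {
            $in_data = 1 if $line =~ /^M\s+END/;    # molfile block is over
        }
        elsif ($line =~ /^>.*?<([^>]+)>/) {         # data header line
            $current = $1;
            $fields{$current} = '';
        }
        elsif (defined $current) {
            if ($line =~ /\S/) {
                $fields{$current} .= "\n" if length $fields{$current};
                $fields{$current} .= $line;
            }
            else {
                undef $current;                     # blank line closes the field
            }
        }
    }
    push @records, {%fields} if $has_content;       # tolerate a missing final delimiter
    return @records;
}
```

Each returned hash maps a field name to its text, so a tag such as PUBCHEM_SUBSTANCE_ID (used here purely as an example field) can be read with $record->{PUBCHEM_SUBSTANCE_ID}. The structure block itself is skipped; a real import would hand that part to a chemistry toolkit.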

This initial prototype could pave the way for other high quality, consistently formatted subsets of PubChem to be identified and deposited into ChemSpider in a similar way. To suggest other possible subsets of PubChem which could be used by ChemSpider join the discussion on the ChemSpider forum.

With one week to go before the American Chemical Society meeting we have unveiled the new ChemSpider website for feedback and comments. We believe that we have made the site easier to navigate, more visually appealing and faster. The new site map should be very helpful in finding your way around the site.

We are presently gathering feedback from users with different browsers, updating some of our documentation and generally optimizing performance and navigation across the site. All feedback is welcomed…we’d love to hear from you!

website_redesign

Have you ever had a niggling feeling that you’ve been missing some corner of ChemSpider which might have a tool that will make your life much easier?

http://www.chemspider.com/Sitemap.aspx is the new sitemap for ChemSpider which lists all of the different pages in it and will help you to get an overview of all the different things that you can see and do on ChemSpider.

There are also brief descriptions about each page which will, where necessary, suggest input examples if you just want to try something out but aren’t quite sure what to type into the boxes. If you are a ChemSpider depositor or curator and view the sitemap when logged in you will see additional pages relevant to your assigned roles.

There will be two presentations and a training session about ChemSpider at the ACS meeting in Boston. We hope to see you there if you are attending the meeting. In any case the presentations will be uploaded after the conference onto the SlideShare site.

The presentations are:

Chemistry in your hand: Using mobile devices to access public chemistry compound data, August 26, 2010 1:30 pm, Boston Convention & Exhibition Center, Room: 156A

 

How community crowdsourcing and social networking is helping to build a quality online resource for chemists, August 22, 2010 10:25 am, Seaport Hotel, Room: Seaport Ballroom A

The training session is detailed below:

“An Overview and Update of RSC-ChemSpider Capabilities”

Tuesday, August 24, from 3:30-6pm,  Boston Convention and Exhibition Center, Room 102B

The Fall American Chemical Society meeting in Boston is just around the corner and we are finishing up the integration of some recent developments. As is usual we have way more things that we wanted to deliver than we have been able to implement. But we’ve always got more ideas so this is no surprise. The biggest change is the new website redesign that a number of readers of this blog voted on and influenced. In the next few days I will announce new features and integrations that we will be delivering, all being well!

The Pillbox website says that it “…enables rapid identification of unknown solid-dosage medications (tablets/capsules) based on physical characteristics and high-resolution images.” I first heard about Pillbox when David Hale, “host” of Pillbox, and I were on the same speaking agenda at a meeting in Washington. David is a very engaging speaker and I really appreciated the visual nature of what they are working to deliver. Trying to identify the pharmaceutical ingredients in a pill from the color, pill imprint etc. is tough…then comes Pillbox.

Using the API provided by Pillbox we have integrated ChemSpider directly with Pillbox. So, in the future when you search on Viagra on ChemSpider and find Sildenafil Citrate you will see a new Pillbox section (we are still determining where on the page) containing the links to Pillbox.

A search on Viagra on Pillbox will show this:

pillbox1

but using the API we can show the information embedded into the ChemSpider page as shown below.

pillbox2

Notice that all three Viagra dosages are shown and each can be opened to preview one at a time. 

This is simply one more example of integrating with the online resources that are becoming increasingly available, with ChemSpider acting as a structure-searchable hub connecting them together.

iChemLabs, the developer of the popular ChemDoodle chemical drawing program, and Royal Society of Chemistry’s ChemSpider, a leading provider of chemical services and data on the internet, are excited to announce a software agreement that will provide significant benefits to customers.

iChemLabs, a developer of chemical software for students and professionals, announces a strategic partnership with RSC ChemSpider. iChemLabs has integrated ChemDoodle with the ChemSpider database containing almost 25 million compounds to search for pre-drawn chemical structures through the innovative MolGrabber widget. ChemSpider is integrating ChemDoodle Web Components into their service to provide users with a next generation HTML5 experience.

Search ChemSpider with ChemDoodle

Kevin Theisen, President of iChemLabs, states “ChemSpider is an excellent example of the creation of a popular and useful service by utilizing cutting-edge technology. By incorporating the HTML5 ChemDoodle Web Components, ChemSpider will take a further step towards creating the most advanced and futuristic chemical database on the web. Now that ChemDoodle Web Components are fully supported on iPhone OS and Android, our partners will be able to push rich media services to their customers across all browsers and mobile devices.”

Antony Williams, VP of Strategic Development for ChemSpider, adds “For many chemists ChemSpider has become their primary website to search for chemicals and related information, whether it be through a standard browser or via a mobile device. Our intention is to offer the best user experience possible and integrating to the HTML5-compliant ChemDoodle Web Components will afford users enhanced capabilities.”

ChemDoodle is available for download immediately. New users can request a free 30 day trial at http://www.chemdoodle.com. The free and open source ChemDoodle Web Components can be accessed at http://web.chemdoodle.com. ChemSpider is hosted at http://www.chemspider.com.

About iChemLabs, LLC.:
iChemLabs, LLC. is a scientific software company specializing in all forms of computational chemistry including NMR simulation, chemical visualization, and chemical informatics. iChemLabs provides expertise in desktop, mobile and web based technologies for both consulting and custom development. www.ichemlabs.com

About the Royal Society of Chemistry:
The RSC is the largest organisation in Europe for advancing the chemical sciences. Supported by a worldwide network of members and an international publishing business, our activities span education, conferences, science policy and the promotion of chemistry to the public. www.rsc.org

About ChemSpider:
ChemSpider offers a structure centric community for chemists to resource data. Offering access to almost 25 million unique chemical entities from over 400 data sources and by providing a platform for crowd sourced deposition, annotation and curation, it is the richest source of free integrated chemistry information available online. ChemSpider delivers data and services to enable the semantic web for chemistry. www.chemspider.com

Name: Kevin Theisen
Phone: 888-505-2436
Email: sales@ichemlabs.com
Url: http://www.ichemlabs.com
Address: 200 Centennial Ave., Suite 200
City/Town: Piscataway
State/Province: NJ
Zip Code: 08854
Country: USA

Name: Antony Williams
Phone: 919-201-1516
Email: info@chemspider.com
Url: http://www.chemspider.com
Address: 904 Tamaras Circle
City/Town: Wake Forest
State/Province: NC
Zip Code: 27587

The INFOTHERM® database, produced by FIZ CHEMIE Berlin, has now been linked into records in ChemSpider.

The database provides experimental thermodynamic and physical properties of 33,000 mixtures and 9,000 pure substances from a total of 12,000 compounds. In ChemSpider if you search for a compound and look under the Phys. Properties tab in the Associated Data Source Info box, you will find a link to a record in INFOTHERM.

ChemSpider record for Artemisinin showing INFOTHERM link


Clicking on the ID link will take you to records in INFOTHERM for that compound where you can look at the various thermodynamic properties available and make a selection of which property is of interest. By clicking on the button for Full access you can view the experimental data and find a link to the bibliographic reference.

 

Artemisinin's solubility in supercritical carbon dioxide
