We are presently running ChemSpider SyntheticPages Beta off of our servers in Washington DC. Last week the servers were taken offline for a few hours after the 30″ snowfall felled some power lines in the city. Our apologies. There is another storm due to hit Washington this week and an expected 20″ of snowfall. It is possible that our beta servers will go offline again. If that happens we apologize in advance. We are presently configuring our beta servers to run out of the RSC offices in Cambridge in the United Kingdom. When that happens we will be less susceptible to such power issues.


I was sitting down today to review what presentations are coming up in the next few weeks and how much writing and travel was ahead of me. Ugh. Painful. During the next few weeks of conference season there will be a lot of talks and, as usual, a lot of late nights before the presentations to write new talks or modify existing talks. I will be at the ACS meeting in San Francisco this spring and will be giving four presentations, a poster and leading a training session on ChemSpider. The presentations are outlined below. Looking forward to seeing you there and it would be great to hear from any of you who would like to get together and connect about community chemistry over a coffee.

Presentation: Utilizing ChemSpider as a platform for education and exposure of student data to the community.

Educators and students now have access to rich internet resources of information. RSC’s ChemSpider is a community resource of structure-based chemistry delivering data including chemical compound collections, reaction synthesis procedures, physicochemical property and various forms of spectral data. ChemSpider offers the opportunity for the community to participate in populating, annotating and curating the data on ChemSpider. We believe that ChemSpider offers an opportunity for educators and students to participate in the ongoing development of a rich resource for the chemistry community. This presentation will suggest some potential uses of the ChemSpider website in terms of integrating into lesson plans. We will also outline how students can expose their structure and reaction-based research work via the ChemSpider platform for the benefit of the community and their online scientific reputation.

Presentation: ChemSpider – How An Online Resource of Chemical Compounds, Reaction Syntheses, and Property Data Can Support Green Chemistry

ChemSpider is an online database containing in excess of 20 million chemical compounds and associated experimental and predicted physicochemical data, reaction synthesis details and analytical data. A significant amount of the data contained within the database has been harvested and collated from a number of inventory systems and integrated to provide a centralized resource for the community. The ChemSpider database has the added benefit of being available for community deposition, annotation and curation. As a result it offers the potential for researchers to share their latest research with the public and participate in the creation of a rich resource of chemistry related information for the Green Chemistry community. This presentation will provide an overview of present capabilities and discuss the future vision for the platform.

Presentation: ChemSpider, how a free community resource of data can support teaching NMR spectroscopy

ChemSpider is an online database of chemical compounds, reaction syntheses and analytical data. Provided by the Royal Society of Chemistry, our intention is to provide a free internet resource of chemistry related data for the community. ChemSpider is unique in its role of allowing user depositions of chemical structures, synthesis procedures and analytical data and, in so doing, provides an environment for crowdsourced gathering of information. To date over 2000 1D and 2D NMR spectra have been deposited online by the community and are available for reuse. The data have been used as the basis of a spectral game whereby students can learn NMR by interacting with the data. This presentation will provide an overview of the tools and capabilities presently available on ChemSpider to support teaching NMR in the undergraduate curriculum and will outline how the community can participate in enriching this resource for the benefit of all.

Presentation: Enhancing discoverability across Royal Society of Chemistry content by integrating to ChemSpider, an online database of chemical structures

The ability to query across a chemistry publisher’s content using chemical structure searching can dramatically enhance discoverability. RSC has been applying a number of procedures to integrate RSC’s ChemSpider community resource with our published content and databases. These include: 1) entity extraction procedures, 2) chemical name conversion procedures using software algorithms and curated dictionaries, 3) semantic markup and 4) crowdsourced curation processes. This presentation will provide an overview of the processes we have utilized in order to provide structure-based integration to RSC content. We will discuss our ongoing efforts to extend the approaches to the mining of data from the rich supplementary information sections of many RSC publications. Our intention is to provide access to synthesis procedures and analytical data and further enrich the ChemSpider database for the benefit of the chemistry community.

Poster: Utilizing ChemSpider as a platform for education and exposure of student data to the community


Recently I announced the release of ChemSpider SyntheticPages. We are honored to have an editorial board of chemists to assist in directing the project and they are introduced below:

  • Kevin Booker-Milburn

    Kevin Booker-Milburn is a Professor of Synthetic Chemistry in the School of Chemistry at the University of Bristol, UK. He has 20 years research experience in broad aspects of synthetic chemistry and in recent years has focused on the development of new synthetic methods for use in the total synthesis of natural products such as terpenes and alkaloids; specifically developing and applying novel photochemical and transition metal techniques. He is Director of the Bristol Chemical Synthesis Doctoral Training Centre, an EPSRC and Industry funded initiative which has a bold vision to train a new generation of researchers for the chemical industry and academe.

  • Jean-Claude Bradley

    Jean-Claude Bradley is an Associate Professor of Chemistry at Drexel University. He leads the UsefulChem project, an initiative started in the summer of 2005 to make the scientific process as transparent as possible by publishing all research work in real time to a collection of public blogs, wikis and other web pages. Jean-Claude coined the term Open Notebook Science (ONS) to distinguish this approach from other more restricted forms of Open Science. Jean-Claude has a Ph.D. in organic chemistry and has published articles and obtained patents in the areas of synthetic and mechanistic chemistry, gene therapy, nanotechnology and scientific knowledge management.

  • Stephen Caddick

    Stephen Caddick is a Professor of Organic Chemistry and Chemical Biology and Head of Department of Chemistry at UCL. He was previously at the University of Sussex (1993 – 2003). His research interests include Organic Synthesis and Synthetic Methodology, Chemical Biology and Structural Biology and Catalysis.

  • Peter Scott

    Peter Scott is a Professor of Chemistry at the University of Warwick, UK, and was formerly at the University of Sussex. His research is focussed on metallo-organic chemistry and mechanism, and specifically in chiral systems for enantioselective catalysis, polymer synthesis, materials science and healthcare. He has interests in how universities and industry can work together, and is Director of Warwick Chemistry’s EPSRC funded PhD with Industrial Collaboration, and also of Warwick Knowledge Transfer Secondments.

  • Martin A. Walker

    Martin A. Walker is an assistant professor of organic chemistry at the State University of New York at Potsdam. He previously worked in the fine chemicals industry for 12 years. His interests center on organic synthesis methodology, particularly green chemistry, as well as chemical information. He is active on Wikipedia, where he contributes to chemistry content and coordinates the Wikipedia 1.0 project, preparing offline releases of Wikipedia.

Stephen Caddick, Peter Scott, Kevin Booker-Milburn and Max Hammond were the original founders of SyntheticPages.org, an online database for chemical transformations. The data from SyntheticPages has been used as the seed data for ChemSpider SyntheticPages.


I’m happy to announce ChemSpider SyntheticPages. We are releasing as a beta for the present and in READ-ONLY mode. Shortly we will release a version that allows you to deposit your own procedures (together with documentation and a couple of training videos). We are out to create a major resource for chemistry in terms of synthetic procedures. We welcome you to use, contribute and participate. Please note that this is a beta and even though we are hoping for only minor issues all feedback is welcomed. Enjoy!

ChemSpider and SyntheticPages announce collaboration supporting synthetic chemistry

CAMBRIDGE, United Kingdom, February 2nd, 2010 – The Royal Society of Chemistry (RSC) today announced the release of ChemSpider SyntheticPages beta – a community resource of reaction synthesis procedures.

The launch of a beta site is the result of a collaboration between ChemSpider (the foremost and free online structure centric community for chemists) and the original SyntheticPages (www.syntheticpages.org). The partnership, which sees ChemSpider host the content from SyntheticPages, furthers their jointly held missions: to provide rich, high quality chemistry resources for the synthetic chemistry community.

A search of ChemSpider SyntheticPages beta allows identification and detailing of the experimental procedures for the synthesis of specific chemical compounds. The database has been seeded with SyntheticPages.org data and will be expanded by inclusion of data from journal articles published by RSC. Researchers will also be able to deposit their own synthetic procedures to the site.

Using online semantic markup technologies and integrating to the ChemSpider database will allow interactive display of chemical structures, spectral data and a multitude of related data. Scientists can comment upon a growing resource of interactive synthetic processes, while leveraging the rich resources contained within the ChemSpider databases.

“We are very pleased to partner with SyntheticPages to provide a reaction database for the community. Students, teachers and researchers will have access to a free, highly curated database of synthetic procedures populated by curated depositions from the community and abstracted from RSC publications” commented Dr Antony Williams, Vice President of Strategic Development, ChemSpider.

He added “Our editorial board is made up of active synthetic organic chemists including the original founders of SyntheticPages. Their leadership and guidance will mesh our expertise in web-based technologies supporting semantic chemistry with deep knowledge and understanding of synthetic chemistry.”

Professors Kevin Booker-Milburn (University of Bristol), Stephen Caddick (University College London, UCL), Peter Scott (University of Warwick) and Dr Max Hammond are the original founders of SyntheticPages.

A spokesperson for the group commented “When we started, we believed that there was a place for an interactive database which would allow synthetic chemists who carry out reactions to find procedures that work. Our aim is to develop a web-based resource that will be on every synthetic chemist’s desktop, and will be regularly used by experimentalists over the world. Our early work has set the stage to achieve this goal and we look forward to the collaboration with ChemSpider to develop a resource that will help sustain high quality synthetic chemistry worldwide”.

“We are particularly grateful to CEM Microwave Technology UK without whom we could not have developed SyntheticPages. We are also grateful to our thousands of members and users who have helped sustain the SyntheticPages endeavour and we look forward to even greater success with ChemSpider.”

ChemSpider SyntheticPages beta is released in beta form for feedback from the community at www.chemspider.com/syntheticpages.

For more information contact:

Dr Antony Williams
VP Strategic Development, ChemSpider
Royal Society of Chemistry
Thomas Graham House, Science Park, Milton Road
Cambridge CB4 0WF, UK
Tel: +1 919 201 1516
Email: williamsa@rsc.org

Notes for Editors

About the Royal Society of Chemistry

The Royal Society of Chemistry (RSC) is the largest organisation in Europe for advancing the chemical sciences. Supported by a worldwide network of 45,000 members and an international publishing business, our activities span education, conferences, science policy and the promotion of chemistry to the public.

www.rsc.org

About ChemSpider

ChemSpider offers a structure centric community for chemists to resource data. Offering access to almost 21.5 million unique chemical entities from over 200 data sources and by providing a platform for crowd sourced deposition, annotation and curation, it is the richest source of integrated chemistry information available online. ChemSpider delivers data and services to enable the semantic web for chemistry.

www.chemspider.com

About SyntheticPages

SyntheticPages is a freely available interactive database of synthetic chemistry for the dissemination of practical and reliable organic, organometallic and inorganic chemical synthesis, reactions and procedures deposited by synthetic chemists. Synthetic methods on the site are updated continuously by chemists working in academic and industrial research laboratories. SyntheticPages encourages submissions from graduate students, postdocs, industrialists and academics.


I’m Aileen Day and I’m one of the Royal Society of Chemistry’s Informatics team who are working with ChemSpider. We can loosely be defined as chemists who have picked up enough computer programming to make our lives and those of people around us a bit more exciting and less tedious. Our job is to develop new tools to help viewers and authors of articles in our journals, and our publishing editors. Probably the most high profile example of these tools is the development of Project Prospect.
So the long-term plan for us and ChemSpider is to fully integrate Prospect (and RSC publications) with ChemSpider so that a user can seamlessly bounce back and forth between finding compounds of interest using the ChemSpider search and selection tools and finding more information about them in our journals amongst other sources. Also, to improve the functionality of and content of everything we can along the way (ChemSpider, Prospect etc.).
As a first step of this I’m currently developing a way to automatically deposit the primary (most important) compounds in our prospected articles into ChemSpider, with publication information about the RSC article, including a link back. I’ll keep you posted as we make progress…


We mailed out the first issue of the ChemSpider Newsletter in January which was packed with info on what’s happening with ChemSpider and tips on how you can get the best out of ChemSpider. To make sure you receive your personal copy of future issues by email please make sure to register.

ChemSpider Newsletter


Over the past three years I’ve carried a double-edged sword on the ChemSpider Blog: the honor and the burden.

As anyone who runs a blog would likely tell you hosting a blog can take a lot of time and effort, especially if you are passionate about communicating. Fortunately, since ChemSpider was acquired by the Royal Society of Chemistry we now have a lot more people involved with the platform including support staff and our colleagues in the Cambridge, UK-based Informatics team. Since we are working hard to further integrate various processes, systems and projects it makes sense that more of the team discussing our activities around ChemSpider can post here. In particular there are a number of activities going on regarding the technical aspects of ChemSpider development that will start to show up on this blog and we encourage your participation, comments and feedback. ChemSpider is, after all, about community participation so do engage us!

Over the next few days a number of my colleagues will introduce themselves to the readers of this blog. I welcome them all to the “honor and the burden”…it’s a pleasure to share this space with them.


In the first of many integration projects presently underway inside the RSC to bring together the benefits of ChemSpider with existing systems we’re happy to announce that the Prospected compound pages are now using structure images from ChemSpider as shown below. We spent a lot of time creating aesthetically pleasing structure images for ChemSpider and especially for display on webpages and blogs so we’re happy to see them show up in other venues too.

We unveiled the ability to embed chemical structure images as well as embedding spectra last year. Now there are multiple blogs using the embed functionality, structures are starting to show up on Wikipedia and our web services are being used for structure image retrieval. We encourage you to make use of the resources we are delivering and welcome any feedback.

prospect


I will be speaking in a Science Commons symposium on the Microsoft Campus in Redmond in February. If you are interested in hearing John Wilbanks, Cameron Neylon, Heather Joseph, Jean-Claude Bradley, Stephen Friend and myself talking about accelerating scientific discoveries come and join us.

sciencecommons


A couple of days ago I came across a video on YouTube about “Water Marbles”. I’ve inserted it below…I recommend watching it…it’s excellent!

It’s excellent because by the time I had finished watching this I was both excited and confused. Confused because how could I not have heard of this experiment? Even if it were to work, why were those spheres so big and uniform? Excited because I’d been looking for some good kitchen chemistry to do with my kids and this would be a great example. I couldn’t really get my head around how the observations were working but on a rushed grocery expedition prior to going into ScienceOnline2010 #scio10 this past weekend I threw everything necessary into the grocery basket to repeat the experiment.

At ScienceOnline2010 I was involved in a number of discussions, as usual, regarding data quality, curation and assertions….this being based on my experience with curating the ChemSpider database. Today I sat in on a discussion entitled “Getting the Science Right: The importance of fact checking mainstream science publications — an underappreciated and essential art — and the role scientists can and should (but often don’t) play in it – Rebecca Skloot, Sheril Kirshenbaum, and David Dobbs.” It was an interesting exchange with comments such as “newspapers and magazines don’t check facts” and the urban myth that a one minute kiss burns 26 calories while the fact is that a Hershey’s Kiss contains 26 calories.

Post ScienceOnline2010 I got home this afternoon to find my kids desperately wanting to do kitchen chemistry so, with pessimism, I started to work through the experiment with them. They mixed and stirred and cooled and heated. They got to see a lot of fizzing and to see crystals grow, which they thought was great. It of course failed dismally, as it has for many other people, including this guy, but they had a great time. In parallel I was doing some fact-checking to see whether or not to prepare them for disappointment.

There have been a lot of exchanges online about this topic of water marbles with chemists exchanging concepts about the science behind it if it did work. See here for example. The video has gone viral across many sites. Very impressive for a hoax really…and it did get me interested in doing kitchen chemistry. The truth is a lot easier though…and still good chemistry! Watch Steve Spangler in action below…

The polymer beads can be bought here.

There’s more Kitchen Chemistry to come but I think I’ll stick to some of Theodore Gray’s guidance …maybe time for some Mad Science at home


I’m off to ScienceOnline2010 in a few minutes. It’s the last day of the conference and the experience has been a highly positive one. I’ve finally met people face to face that I have been connected with for over 2 years….and congruency is always good…they are as interesting, passionate and generally nice people face to face as they are online. I also managed to catch up with a number of old friends. I got to meet some new people focused on changing the flow of communication for ScienceOnline and working hard to do so. #scio10 is different….there’s an energy in the air that I haven’t experienced at any other scientific gathering other than SciFoo. This is an audience that is introducing me to social networking tools that I’ve never heard of…that doesn’t happen often. It has to be that over half the attendees are twittering. iPhones are everywhere. Flips are out capturing video in the sessions and are uploaded online shortly thereafter. The conversations are open, opinionated, full of energy and motivating. This is MY type of conference and I’m fortunate to live less than half an hour away.

The dinner event was fun and giggly, and five minute “Ignite” talks were given (I gave two …one on Curating Chemistry online and one with JC Bradley regarding the spectral game). The first of those is linked here and shown below.

Today I will be giving a live demo of ChemSpider to anyone interested and around at the end of the conference. It’s nasty weather so people might be leaving early.

I found myself a virtual running partner for my 1000 miles in a year challenge, assuming my calf muscle tear heals. We’re going to try and figure out how to raise money for asthma. If anyone wants to join us to form a virtual team, let me know…

Bora and Anton have done a tremendous job organizing the conference. Clearly there is a great team supporting them and the Sigma Xi facility is excellent. Terrific conference all around….glad I spent the weekend this way…


It’s one week to ScienceOnline 2010. Last year I missed it because of the threat of weather and this year I’ll likely be hobbling in on crutches. I’m listed to give a few presentations/demos and a joint session regarding Citizen Science and Students with Sandra Porter and Tara Richerson. I’m going to have the chance to catch up with people I know such as Cameron Neylon and JC Bradley who’ll be covering Open Notebook Science. I notice that Bill Hooker is in town and look forward to connecting with him too. I’ll get to meet Hope Leman who runs the blog “Significant Science” and released a blog post this weekend regarding an interview with me. It was a real pleasure to work on that with Hope.

ScienceOnline is THE place to be to discuss online science based on what I’ve heard from previous attendees. It always fills up early, is incredibly well organized in terms of workshops, guest attendees and social events based on what I’ve seen. I am looking forward to experiencing the event, sharing space with some of the leaders in the domain of online science and seeing some old friends. One week to go…lots of preparation work to do.

I’m presently building a list of examples of Citizen Science in Chemistry. If you have any examples you believe are worth highlighting please feel free to send them through. Thanks


Wired Magazine is my favorite monthly read. I get a lot of magazines delivered to the house for our family to browse through and these include Popular Mechanics, Popular Science, Science Illustrated and then additionally Chemistry World, C&E News, Drug Discovery News and a lot of the other trade magazines. Nevertheless, after the books I am reading (and I am presently reading Dr Mary’s Monkey as a follow on regarding the SV-40 cancer causing monkey virus in polio vaccines) Wired magazine is always the next thing I pick up. It’s an easy read, with some great short snippets for when I’m sitting on a stationary bike flipping pages and some long interesting articles, always well written. I recently read an old Wired magazine that had been on my stack for a few weeks and wish I’d read it earlier. We’ve been discussing the importance of user interface on ChemSpider and its impact and influence on the users of the website. This connected to the article on Craigslist that was covered in Wired Magazine.

Now, if you don’t know what Craigslist is then how about eBay? I’ll assume you know, and use, eBay. I use eBay…I like it. I’ve used Craigslist and like it, but for a different reason than I like eBay. Here is an interesting statement about Craigslist from the article: “With more than 47 million unique users every month in the US alone—nearly a fifth of the nation’s adult population—it is the most important community site going and yet the most underdeveloped.” The article goes on to tell the story about how confusing the site is, how poor the aesthetics are and how non-Web 2.0 it is in terms of integration, access etc. I recommend it as a fun read, if nothing else to get a handle on Craig Newmark, the interesting (and VERY rich) man behind the initial concept. As a historical article regarding how early technology can morph over time into something more flashy but not necessarily more successful it’s a great read. Wired was convinced that people would want to give some input on how Craigslist should be improved and set up their Extreme MakeOver: Craigslist Edition for user comments. I doubt that Newmark and colleagues will pay much attention and, based on stats available to date, they don’t need to.

Another interesting read is a separate article regarding eBay vs Craigslist and the fact that Google and Microsoft actually tried to get into the same sector and both failed. What’s the magic, the secret sauce, the USP (unique selling points) for Craigslist? I’ve read a number of suggestions but am not sure of the conclusions. I think it’s a combination of: 1) old and less complicated technology for novice users (searching means scrolling in a lot of cases) 2) traction …it’s been around a long time and 3) price for people to post ads. The bottom line though, of relevance to our discussions, is that “it ain’t the user interface!”.

Let’s be honest, technology is fun, especially when you work in our domain of building an internet for chemistry. Over the past few years I have upgraded from computer to computer, operating system to operating system (with Vista the worst transition but now loving Windows 7), from browser to browser (I have three installed: IE8, Firefox and Google Chrome with FF my preferred). I would say that while I am not at the bleeding edge of technologies I have access to more advanced systems than the majority of users in schools, homes and the rest of the world, especially when taking into account that I have good, solid high speed access, both wireless-N and cabled in our house. If you truly want to see how a site works in the “hands of the masses” it is necessary to look at it on another computer where the latest and greatest browser isn’t installed and that is still running on 512 MB of RAM. In my new “personal adventure” of running 1000 miles in a year I am using the NikePlus website to track my performance but it uses so much Flash, so much animation and “looks” so modern and beautiful that I am struggling to use it even on my most recent laptop. It needs a “dumb down” button (maybe it’s there but I’m dumb enough to not see it).

We know we need to change some of the ChemSpider website for ease of navigation, for ease of use and to cater for all of the browser dependencies that we see with just things such as copy and paste of long strings, word wrapped strings etc. They can all be fixed. We know that there is an abundance of functionality on the site that only a fraction of the user base will care about. Our focus since starting the ChemSpider project was to establish a high-quality dataset (much progress but a long way to go), provide useful functionality to our diverse user base (lots in place, more to add, some to remove), provide a “successful” experience that meant that users could get answers to questions/queries they asked and that the experience wouldn’t be so challenging or mundane as to provide no value. Feedback to date suggests we’re doing okay but we’d like your feedback. Ultimately I’ll likely assemble this into a SurveyMonkey questionnaire but for brevity and early feedback I am interested in your comments on some of the following questions:

1) What is your favorite piece of functionality on ChemSpider?

2) What is your LEAST favorite piece of functionality on ChemSpider?

3) If there was one new function you would like to see added/improved what would it be?

4) Assuming a scoring system of 1 to 10, 10 being the best, how well does the ChemSpider interface support your usage of the system?

5) Which public dataset would you most like to see integrated to ChemSpider?

Any other comments are of course welcomed. We will be working on usability over the next few months and it’s hard to please everybody but we’ll do what we can with the resources we have. A Survey Monkey questionnaire will show up in the future with more questions. Watch this space and check out the Craigslist article…I think you’ll enjoy it.


In the recent rollout of functionality we added to the home page statistics regarding the number of various types of spectra that have been added to ChemSpider as well as updates of new data associated with data sources. We will likely optimize these displays further in the future but this is an initial display for the time being. It’s rather impressive how many different types of 2D NMR data are being uploaded to the database.

statistics


Part 4 in the exposure of new ChemSpider functionality from the recent update. We have been using the ACD/Labs Structure Drawing Applet on ChemSpider for the past three years. It’s been a great piece of technology and was one of the first applets, possibly the first structure drawing applet ever released. However, it’s old technology and we have been encouraged by our users to use a more modern applet. We are very fortunate to have been granted the right to use the Symyx JDraw applet and have had the pleasure of working with Keith Taylor and James Jack. For the time being we have left two applets online for the users to try out and provide feedback on. You can choose the ACD/Labs applet or JDraw by selecting via the interface as shown below. Feedback welcomed.

symyx jdraw


social widget

Following on from other posts in this series from this week I’m going to continue to list new functionality over the holiday season. I’ll continue with the “Social Widget”. What IS the Social Widget? Well…it’s this thing to the left….it is an AddThis Button that is available for every compound page on ChemSpider now. If there is a particular chemical of interest on ChemSpider that you want to include into your social networking then you can do so by choosing the social networking site of interest and “adding” the link in there. For some it posts the link and for others it posts a thumbnail of the structure there that is linked back directly into ChemSpider.

So, if I posted to Friendfeed it will send the link directly into Friendfeed. I just did it..worked perfectly. For Facebook it actually carries the thumbnail as shown below on my Facebook page. SO, deposit some of your molecules onto ChemSpider and let the world know! Add some data, tell a story, post a reaction…and use AddThis to tell your network!

photochromism


Following on from my previous post regarding new functionality on ChemSpider, the last one regarding improved integration to SureChem patents, I am happy to announce that we have improved the PubMed integration. Previously we would stream way too many articles from PubMed for some compounds. For example, retrieving articles about cholesterol would result in far too long a page of PubMed articles being displayed. We have now limited the number of articles retrieved and simply put a link at the top of the page for you to retrieve the rest of the links.

pubmed1

In this case when you click the link we initiate a full search using the Entrez Life Sciences API. The results for cholesterol are shown here. A small but user-friendly improvement…More to follow.
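For readers curious how a "show a few, link to the rest" pattern might look, here is a minimal sketch in Python. It is not the ChemSpider implementation: it assumes the public NCBI E-utilities `esearch` endpoint and the public PubMed search page, and the function names and the limit of 20 articles are hypothetical choices made for illustration. The code only builds the URLs and does not call the service.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint and PubMed search page.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
PUBMED_SEARCH_URL = "https://pubmed.ncbi.nlm.nih.gov/"

# Hypothetical cap on the number of articles shown on a compound page.
MAX_ARTICLES_ON_PAGE = 20


def build_limited_search_url(term, limit=MAX_ARTICLES_ON_PAGE):
    """URL for a capped search: 'retmax' limits how many IDs come back."""
    params = {"db": "pubmed", "term": term, "retmax": limit, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def build_full_search_link(term):
    """Link a reader can follow to see the complete result list."""
    return PUBMED_SEARCH_URL + "?" + urlencode({"term": term})


if __name__ == "__main__":
    print(build_limited_search_url("cholesterol"))
    print(build_full_search_link("cholesterol"))
```

The design choice is simply to keep the compound page short by capping the first request, and to hand the full query off to the search service itself when the reader asks for more.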


I’d never published in an RSC journal until recently when Sean Ekins and I published in Lab on a Chip. The process was good…fast and efficient. Now, I am an employee of the RSC but these are objective comments and I’m looking forward to publishing more with the RSC. I do still publish with my co-authors with other publishers because of the nature of the work we do – cheminformatics fits well into ACS’ JCIM and the Journal of Cheminformatics and NMR papers still fit well into Wiley’s Magnetic Resonance in Chemistry. Our J Cheminf paper is still the most accessed article in that journal.

I was impressed to get a letter from the Editorial Director of the RSC in my inbox last week. It was directed to me as an author in an RSC journal. What was impressive was the fact that he took the time to issue this to all authors as well as the fact that the letter discussed the impact factors for RSC now matching those of the ACS. Very impressive and a nice touch from Jim Milne!

jim milne


While some say “Silence is Golden” some of us find it deafening! One of my common statements regarding Press Releases and political commentaries is that there is as much said in the “unsaid”. Why this lead-in to this blog post? Well….the truth is we haven’t been very productive in the past few weeks with the delivery of new functionality onto ChemSpider and people have been asking me why we haven’t been so prolific with our updates. Well….in this case Silence is Golden based on the new functionality and data rolling out soon!

Historically we were introducing new functionality every few days and rolling it out with a “continuous beta” approach to delivery. We were also working on only three computers and were challenged with issues of uptime and handling. At the RSC we have access to development, test and live environments, and we have a stable compute environment supporting the system that provides power support where previously we would have been at risk of outages. We have a support team who have “got our backs” and we are no longer dealing with all of the issues of keeping the environment healthy for the ChemSpider platform. With our new hosted environment and the drive to move away from our previous constant and ongoing updates to a more controlled process for rollout, specifically including internal testing prior to going Live, we have been working on procedures to ensure the best delivery. In parallel we have been working on a series of internal projects that are very exciting and you should see the results soon!

With our new processes in place, and our new systems now established, we have been working on new functionality development and are happy to announce that we will now be moving towards regular updates, every few weeks. We’re starting this week with the roll out of a set of new capabilities for you to try out. I’ll highlight these in a series of blog posts over the coming days. Let’s start with this one…

We are happy to announce an improved integration to the patent web service provided to us via our collaboration with SureChem. We announced our initial integration to this service at the ACS meeting last fall in Washington and received a lot of positive feedback regarding the implementation. That rollout only provided integration to a subset of the entire collection, the USPTO. SureChem host data from a number of patent agencies and the collection includes USPTO Granted, USPTO Applications, European Granted, European Applications, WO/PCT and Japanese Abstracts. Thanks to their web service we now have the ability to retrieve information regarding those sources also. The image below shows the patents retrieved for Xanax. Check it out…give us your feedback and extend holiday cheer to SureChem also for their contribution to the community.

patents


I hate the drive to Washington DC from Wake Forest North Carolina where I live. At the right time of day it’s kind of fun actually…some good music, my iPhone to talk to people on the drive and a chance to think, think, think without the interruptions from skype, email and even my own lovable children. BUT, at the wrong time of day a 4 hour drive becomes a 6 hour drive and I spend a lot of time parked up on the highways trying to move. It’s quite simply, hell. So, this week I decided to take an Amtrak train up from Raleigh to Washington to stay with my colleague Valery Tkachenko the night before my presentation at the FDA.

It didn’t start well. The train left the station 30 minutes late. The guy sitting next to me on the train clearly hadn’t bathed for a few days but the train was full so there was initially nowhere else to sit. He slept the entire way anyway so we didn’t exactly engage much until he laid his head on my arm and proceeded to salivate in my general direction, at which point I, the one that was awake, suffered an intense myoclonic jerk and launched him to the other side of his seat to dribble into the central aisle. There was no wireless on the train. Food was expensive…a coke for $2. Stops were frequent and it took over 8.5 hours for me to get to Valery’s house. I did get a lot of work done on the ride but I’ll conclude it with NEVER AGAIN…I’d rather drive. Compared with the quality of train rides in Europe, of which I have had many of late, Amtrak leaves a lot to be desired. Enough whining…

I was in Washington to give a talk at the FDA in a symposium with a number of other online database teams. These included PubChem, ChemIDPlus, HSDB, DailyMed and one I hadn’t heard about yet…Pillbox. I really enjoyed the talk about Pillbox…check out the site. Great design, simple and intuitive flow and a great vision. Loved it!

The presentation I gave was called “ChemSpider and How The Wisdom of the Crowds Can Improve the Quality of Chemistry on the Internet.” The talk is on SlideShare with about 40 of my other talks here and is embedded below also.


Yesterday evening from 6pm until 8pm I participated in a remote presentation to about 30 students sitting in a room at Drexel University as a result of an invitation from JC Bradley. Two of us were speaking…Rajarshi Guha and myself. The entire session was conducted using shared Webex access and skype. The technology aspects worked very well. While the experience of sharing a room with my partner speaker was lacking there was definitely no lack of collaboration to make this work!

Rajarshi’s presentation was on “Molecular representation, similarity and search” and is posted online at SciVee here.

My presentation was entitled “Citizen Scientists and Their Contributions to Internet Based Chemistry” and covered the status of chemistry on the internet, what databases are out there to look at, the status of quality on said databases, an overview of real time searching on ChemSpider, how to curate and deposit to the platform and some insight into what will be coming soon. The talk is shown below and directly on SciVee here.


My friend and colleague Sean Ekins and I wrote a perspective for the RSC’s Lab on a Chip journal and it was released as an Advance Article, as Free Access, this evening.

The perspective is entitled “Precompetitive preclinical ADME/Tox data: set it free on the web to facilitate computational model building and assist drug development”. The title is self-explanatory in terms of what we are trying to communicate. The paper is online now and available here.

ADME


I have been at the German Conference for Cheminformatics for the past three days. The conference is in Goslar. I twittered the conference using #goslarcheminf and it appears that there was little interest in twittering here…seems like it’s an “American” thing to do. I gave a presentation entitled “ChemSpider – Building a Foundation for the Semantic Web by Hosting a Crowd Sourced Databasing Platform for Chemistry” and have put it on SlideShare here. The abstract for the talk is below as well as the embedded SlideShare widget for the talk. This talk was a lot less rushed than usual…not just 20 minutes and I personally enjoyed giving this talk to the audience. Commonly I feel that the talks I give are very rushed and I only get to scratch the surface of what we are up to with ChemSpider. It’s amazing how an additional 15 minutes allowed me to expand on the issues and the work. The presentation drew a lot of questions and attention after the session and I’m hoping that many of the discussions regarding collaboration and depositions of new data come to fruition.

Abstract

There is an increasing availability of free and open access resources for chemists to use on the internet. Coupled with the increasing availability of Open Source software tools we are in the middle of a revolution in data availability and tools to manipulate these data. ChemSpider is a free access website for chemists built with the intention of providing a structure centric community for chemists. It was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge.

There are tens if not hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc. and no single way to search across them. Despite the fact that there were a large number of databases containing chemical compounds and data available online, their inherent quality, accuracy and completeness were lacking in many regards. The intention with ChemSpider was to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data, experimental properties and linking to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.

ChemSpider has enabled real time curation of the data, association of analytical data with chemical structures, real-time deposition of single or batch chemical structures (including with activity data) and transaction-based predictions of physicochemical data. The social community aspects of the system demonstrate the potential of this approach. Curation of the data continues daily and thousands of edits and depositions by members of the community have dramatically improved the quality of the data relative to other public resources for chemistry.

This presentation will provide an overview of the history of ChemSpider, the present capabilities of the platform and how it can become one of the primary foundations of the semantic web for chemistry. It will also discuss some of the present projects underway since the acquisition of ChemSpider by the Royal Society of Chemistry.


For those of you who read this blog you will be aware that it can take a lot of time just to get a single chemical curated against its correct associations of chemical names and synonyms. I’ve shown this for vancomycin, Taxol (1,2,3), Ginkgolide B and it is presently underway with Digitonin, though not yet complete. Working on one structure is hard enough. Building a database of a few thousand curated structures is difficult work yet the EBI did it, and did it well when they built ChEBI. ChEBI is also not perfect as we discovered working on vancomycin and I still find occasional small issues.

The EBI recently released the ChEMBL database. This is a much bigger resource as described at the home page for the resource here. The site states “ChEMBL is a database of ca. 500,000 bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature and the data made available due to funding by the Wellcome Trust.” It is MUCH harder to curate larger databases and 1/2 a million records is a challenge.

I downloaded the data from the FTP site and took a browse of the data. There are definitely structures in the data file that we don’t have in ChemSpider but I found an issue with charge balance for many hundreds of records where the counterions were charged (for example, chloride or bromide) but the primary component was neutral. An example is here where the compound is named as a hydrochloride but the compound has the chloride anion. I think this likely arises from treatment with some type of standardizer so it should be a matter of changing the standardizer settings and regenerating. We deal with over 23 million compounds and have been through such issues ourselves when it comes to generation of structure images.
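To make the charge balance issue concrete, here is a rough sketch in Python of the kind of check that flags such records. It is not how ChEMBL or ChemSpider process structures: it assumes the record is available as a SMILES string, it only reads formal charges written on bracket atoms, and the function name and example SMILES (methylamine hydrochloride written three ways) are my own illustrations. A real pipeline would use a proper cheminformatics toolkit.

```python
import re

# A bracket atom in SMILES, e.g. [Cl-], [NH3+], [Fe+2].
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
# A charge inside a bracket atom: "+", "-", "++", "--", "+2", "-2", ...
CHARGE = re.compile(r"(\++|-+)(\d*)")


def net_formal_charge(smiles):
    """Sum the formal charges written on bracket atoms in a SMILES string."""
    total = 0
    for atom in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(atom)
        if match is None:
            continue  # bracket atom without a charge, e.g. [13CH4]
        signs, digits = match.groups()
        sign = 1 if signs[0] == "+" else -1
        # "+2" means charge 2; "++" (older notation) also means charge 2.
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


if __name__ == "__main__":
    examples = {
        "neutral amine with chloride anion (unbalanced)": "CN.[Cl-]",
        "ammonium cation with chloride anion (balanced)": "C[NH3+].[Cl-]",
        "neutral amine with neutral HCl (balanced)": "CN.Cl",
    }
    for label, smiles in examples.items():
        print(label, smiles, net_formal_charge(smiles))
```

Any record whose net charge is not zero, like the first example, is a candidate for the counterion problem described above.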

For an example of a rich record in ChEMBL take a look at this record showing the target, assay, activity type, value and reference all listed. ChEMBL is sure to be an invaluable reference for the Life Sciences.


I have never met Warren DeLano. But, I have respected him from afar for a long time. Warren is the developer of PyMol, an Open Source molecular visualization system that has made enormous contributions to the community and can produce stunning visualizations of Proteins. His impact on the field of protein visualization has been recognized many times by the community and his tools are used in labs all over the world. He has garnered respect across our community.

A few months ago I had the opportunity to spend an hour on the phone with him after he had made such positive comments when the RSC acquired ChemSpider. We talked about Open Science, Open Source and models of business. We talked about the adventure of trying to change the world one step at a time by making our humble contributions to the world of science. By the end of our conversation I knew that when I met Warren we would be able to talk for many more hours as we shared many common views and, primarily, a want to make a difference.

Today I learned of the sad news that Warren had passed away. Despite the fact that I hadn’t yet managed to sit with Warren face to face I was immediately saddened. My truth is that there is a specific type of shock I feel when someone younger than myself passes away. Warren and I talked about the impact of our chosen career paths on our relationships with our wives and the hours spent in front of a screen instead of spending them with those we share our lives with. We both reflected on the fact that we have given too much to the keyboard over the years driven by our need to make a difference. Warren’s hard work and superior programming skills are paralleled by the fact that he was clearly a charitable contributor to science by giving his code away to the world and was, even based on only one phone call, a kind man.

My thoughts go out to his wife and family for his loss.
