Copyright©2009 Antony Williams
There have been other comments about Wolfram Alpha and its support for Chemistry (1,2 and others) but I have remained rather quiet until now about my experiences with Alpha for a couple of reasons. First of all, I’d rather let the service settle down a bit before poking at it too hard. My experience of going live with ChemSpider was definitely that it takes a while to stabilize the system and address some of the earliest feedback. Also, knowing that I would be at Scifoo and aware that Theodore Gray would be there, I had hoped to see Alpha in action. I wasn’t disappointed. Yesterday Theodore drove the system in front of an audience including a number of interested scientists, members of Google, and Peter Murray-Rust and myself from Chemistry. Theo had no fear…essential for live demos. He was asked questions and he took the plunge, did the search and, with the rest of us, celebrated a successful search, a weird result and one that was just plain wrong. It was ALL good. I am impressed. I am impressed by what they are out to achieve with Wolfram Alpha. I am convinced that what they are doing with Alpha will contribute to science and mathematics in general and that Chemists will be using this system when they have more awareness of it.
For a general intro to Alpha see the presentation here.
So, some examples of interesting searches:
1) A guy in the room had asked the question “What is the largest land mammal?” a few weeks earlier and had not received an answer. Now Theo posed that question and got the answer here. Nice! I took that to mean that they were keeping logs of failed queries and tweaking…confirmed by Theo. VERY nice.
2) Peter Murray-Rust had previously blogged about bad results from his searches (searching on dibromoethane for example). When he repeated his searches in the session hosted by Theo he acknowledged that he was pleased that they had fixed the issues he had previously blogged about. This is how modern systems should be…moving quickly.
3) Searching on names…for example, what is the number of people with my name…my spelling is Antony NOT Anthony. See here for the results.
4) What is the return per employee for Google versus IBM? It’s in this query: http://www35.wolframalpha.com/input/?i=GOOG+IBM
5) What are the chemical structures of Taxol? Methamphetamine? Cholesterol? Buckminsterfullerene? You get answers for all. The organic molecules all give images of chemical structures. The connections in all cases are correct but I see no evidence of stereochemistry anywhere across the chemical structures in the database…it doesn’t mean it’s not there but I couldn’t find it.
So, for chemistry, am I impressed? Yes, I am. I’m not worried right now that Alpha is not dealing with stereochemistry…I am sure they will layer that on later. It is clear based on most of the results that I have seen that there is some GOOD curation of the data going on. According to Theo there are chemists on staff and they are curating the data coming in. Hallelujah! If you look in the Source Information for Taxol you see a LONG list of sources of chemical information, and the primary source is the Wolfram Alpha Curated Data.
There is much that can be done to help Wolfram Alpha to have better Chemistry. They have a HARD job ahead of them if they are going to sample the Public Databases to grab quality chemistry. It’s in there for sure but it’s hard to find. What could come out of ChemSpider and Wolfram Alpha working together?
1) If we could get the list of “compounds” in Wolfram Alpha then we can provide chemical compound connection tables with all necessary stereochemistry etc.
2) When we pass back the compound list we can also pass back ChemSpider IDs and get them listed as identifiers alongside the PubChem CID. In theory it would be good to get these linked back to ChemSpider so that a user can come and find associated articles, analytical data, the Wikipedia article, predicted and experimental properties and so on. This is where ChemSpider’s integration would be of value.
3) There is an opportunity to expand the chemistry in Wolfram Alpha by passing a subset of ChemSpider compounds to be added to Alpha. Certainly I don’t think that Alpha should host all 21.5 million of our compounds for the reasons I have enumerated many times on this blog. See my last post about the 54 versions of the Taxol skeleton…there should be only one Taxol. But, there may be a way to subset “important chemistry” and get it into Alpha. OR, maybe they do want it all?
There are clearly opportunities to help expand the chemistry and I hope we have the chance. I think Alpha is incredibly ambitious. But why not be ambitious? ChemSpider was ambitious too and look what we have done with three servers in a basement…that’s a whole lot fewer resources than Wolfram are throwing at Alpha. I want them to be successful…a computational engine for the public. Why not? So many of us are asking questions using search engines right now and can’t get anywhere near an answer…