A Faster, Superior Literature Search on ChemSpider
Posted by: Antony Williams in ChemSpider Chemistry, ChemSpider Services
My colleague Will originally developed the ChemRefer service. When ChemSpider started up, Will brought the ChemRefer technology with him and joined us to help expand the capabilities of our services. We integrated ChemRefer and released the text-searching capabilities. Will indexed more and more journals and grew the index by hundreds of thousands of articles. Unfortunately, the downside was that the speed of the search decreased dramatically. We also kept hearing comparisons with the Google service, whose advantage was in its citations. So Will has taken a few months off from indexing and has focused on dramatically improving the speed of searching as well as implementing a system for recognizing citations. The system has been made available online for beta-testing just in time for the ACS meeting here in Salt Lake City, BUT it is not yet integrated into ChemSpider.
I have performed some basic tests, focused initially on searching chemical names. The literature search on ChemSpider has many more journals indexed, but to make the comparison fair I searched ONLY the RSC and Journal of Biological Chemistry (JBC) articles, since those are all we have indexed so far on the new system. The results are listed below, giving the number of hits and the search time for the old versus the new literature search. The new search has also indexed the latest RSC and JBC articles, so in theory it should return more hits.
Searching on Taxol: 626 hits found in 22 seconds (OLD) vs 717 hits in 1 second (NEW)
Searching on phenolphthalein: 47 hits found in 5 seconds (OLD) vs 1514 hits in 1 second (NEW)
Searching on benzene: 846 hits found in 75 seconds (OLD) vs 15260 hits in 4 seconds (NEW)
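To put rough numbers on the comparison, the three results above can be turned into speed-up and hit-count ratios. The short Python sketch below is only an illustration: the `results` table and the `compare` helper are names made up for this example, and the hits and timings are simply the figures quoted above. Since the timings are whole seconds, the ratios are coarse.

```python
# Hits and timings (in seconds) exactly as quoted in the three searches above.
# "results" and "compare" are illustrative names, not part of any ChemSpider API.
results = {
    "Taxol": {"old": (626, 22), "new": (717, 1)},
    "phenolphthalein": {"old": (47, 5), "new": (1514, 1)},
    "benzene": {"old": (846, 75), "new": (15260, 4)},
}


def compare(old, new):
    """Return (speed-up factor, hit-count ratio) for one search term."""
    old_hits, old_secs = old
    new_hits, new_secs = new
    return old_secs / new_secs, new_hits / old_hits


for name, r in results.items():
    speedup, hit_ratio = compare(r["old"], r["new"])
    print(f"{name}: {speedup:.1f}x faster, {hit_ratio:.1f}x the hits")
```

On these three queries the new search comes out somewhere between 5 and 22 times faster, with the gain in hit counts varying widely from one search term to the next.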
Clearly the searches are MUCH faster with the new system, but it is also returning many more results. These are very early results and we will explain more about the system, the results and our future development shortly…
Try out the new system here for now and send us feedback at info@chemspider.com. Thanks
With the RSC depositions came many beautiful structures - highly symmetric, complex and just plain “pretty” to a chemist. But a high level of complexity also arrived with the collection and while many InChIs could be converted to their associated connection tables the act of converting the InChIs could add additional stereochemistry and
We’ve been working with Jean-Claude Bradley and his Open Notebook Solubility Challenge group to assist where we can. This has included enhancing some of our services (though there is more work to be done…), populating data into ChemSpider and, now, linking us up to the Data Tables built by Andy Lang (of The Spectral Game fame…we’re quite a team).

















Inserted above is a screen grab of part of a record showing the ChemSpider database as the source of the structure. CAS have rigorous expectations regarding how they select which chemical entities should be inserted into their database. While I don’t know this list of definitions, this structure clearly meets it. The structure above is on ChemSpider 

