<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ChemSpider Blog &#187; Quality and Content</title>
	<atom:link href="http://www.chemspider.com/blog/category/improving-the-quality-and-content-on-chemspider/feed" rel="self" type="application/rss+xml" />
	<link>http://www.chemspider.com/blog</link>
	<description>Building Community for Chemists</description>
	<lastBuildDate>Fri, 10 Feb 2012 15:52:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>Adding the SORD Database (Selected Organic Reactions Database) to ChemSpider</title>
		<link>http://www.chemspider.com/blog/3537.html</link>
		<comments>http://www.chemspider.com/blog/3537.html#comments</comments>
		<pubDate>Fri, 03 Feb 2012 22:02:39 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[ChemSpider Chemistry]]></category>
		<category><![CDATA[ChemSpider Syntheses]]></category>
		<category><![CDATA[Quality and Content]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=3537</guid>
		<description><![CDATA[We will soon be depositing data from the SORD databases (Selected Organic Reactions Database) onto ChemSpider. This will be done as two separate but related datasets under the SORD data source: Reactants and Products. If you don&#8217;t know what SORD is then who better to explain than Dick Wife, the &#8220;host&#8221; of the SORD database. Dick wrote [...]]]></description>
			<content:encoded><![CDATA[<p>We will soon be depositing data from the SORD databases (Selected Organic Reactions Database) onto ChemSpider. This will be done as two separate but related datasets under the SORD data source: Reactants and Products. If you don&#8217;t know what SORD is then who better to explain than Dick Wife, the &#8220;host&#8221; of the SORD database. Dick wrote the article below to explain what SORD is&#8230;ENJOY!</p>
<p><strong>The Selected Organic Reactions (SOR) Database: capturing “Lost Chemistry”</strong></p>
<p>Dick Wife, SORD B.V. The Netherlands (<a href="http://www.sord.nl">www.sord.nl</a>; <a href="mailto:dick.wife@sord.nl">dick.wife@sord.nl</a>)</p>
<p>A new database is capturing the 80% of Lost Chemistry from theses and dissertations which doesn&#8217;t make it into publications. Chemists who contribute their data get access to the entire database for free.</p>
<p>SORD, an independent Dutch company, is carefully selecting the synthetic chemistry focused on Life Science research and making this chemistry available in their Selected Organic Reactions (SOR) Database. For the theses/dissertations which they select, SORD excerpts all of the reactions in the Experimental section. This means there will still be a small overlap of data with full publications. There will also be a larger overlap with publications such as Notes, Letters or Communications but these do not contain the experimental details. The SOR Database brings all this chemistry to the desktop, every last detail written by the author.</p>
<p><a href="http://www.chemconnector.com/wp-content/uploads/2012/02/image1.png"><img class="size-full wp-image-712 aligncenter" title="image1" src="http://www.chemconnector.com/wp-content/uploads/2012/02/image1.png" alt="" width="319" height="216" /></a>Some time back, SORD looked at around 300k interesting drug-like compounds in the literature, noting which countries they had come from and the native language. The English-speaking countries accounted for only 37% of the total. German/Swiss dissertations are often written in English but this is new. The theses and dissertations in the other languages represent more than half of the total. SORD routinely translates German and French experimental texts into English. They are about to start on Chinese and Japanese translations and, if anyone can give them access to Russian theses, they will translate these as well!</p>
<p>A thesis or dissertation is the result of several years of hard work by a research student under the constant supervision of the research leader, whose reputation is at stake if the work described is wrong or inaccurate. It is also examined by a committee who decide whether or not to award the degree. They closely scrutinize the Results &amp; Discussion as well as the Experimental sections. The chemistry is reliable.</p>
<p>Advanced Chemistry Development, Inc (ACD/Labs) is partnering SORD in developing this Database. The SOR Database is available for in-house use with ChemFolder Enterprise or on the Internet with ACD/Web Librarian™. This is a screen-shot of a typical SOR Database record in Web Librarian.</p>
<p><a href="http://www.chemconnector.com/wp-content/uploads/2012/02/image2.png"><img class="alignleft  wp-image-713" title="image2" src="http://www.chemconnector.com/wp-content/uploads/2012/02/image2.png" alt="" width="631" height="414" /></a></p>
<p style="text-align: left;">The Reaction Scheme shows every atom (there are no abbreviations). The Experimental text is edited to ASCII format and the key parameters (Reagent(s), Solvent(s), yield(s), MP(s) and Optical Rotation(s)) are displayed in separate Fields, as are the full bibliographic data, making data-mining possible. There is also a link which enables the user to bring up the PDF of each reaction containing all of the spectral and other physical data which SORD does <span style="text-decoration: underline;">not</span> excerpt. The PDF-EX link is a powerful and unique feature of the SOR Database.</p>
<p>Now some explanation about SORD’s excerption rules. What they call the Reaction Scheme (A + B → C, etc.) contains only the reacting and product compound structures. A Reagent is an essential reaction component of which no part ends up in the product – if it does, it becomes a Reactant! When several reactions are performed before the product is isolated (and characterized) the Reagents and Solvents are listed in Steps. Failed reactions are not excerpted but reactions with poor yields are.</p>
<p>The SOR Database currently contains 170k reactions; the target is one million by the end of 2013. Even this number is a lot smaller than what you find today in the major commercial reaction databases. Back in the nineties, SORD researchers looked at one such large commercial database which then contained 9 million compounds. Sifting through the content for drug-like compounds yielded just 450k, or 5% of the records<a href="#_ftn1">[1]</a>. Size is one database metric; quality is much more important! In the SOR Database, you will only find characterized products – and no polymers, or compounds with no molecular structure.</p>
<p>Users of the SOR Database also have access to the separate databases which contain the Reagents (ca. 3,000) and Solvents (ca. 450) which have been encountered so far. Often a Reagent is a catalyst (organic/organometallic), but Reagents can also be simple entities like bases, acids, ammonium salts, etc. or complex chiral ligands. Authors give Reagents many different names and so each Reagent (and Solvent) in the SOR Database has been assigned a unique name. This enables rapid searches using the assigned names, again a novel feature of the database. Such searches can bring you to really nice chemistry.</p>
<p>As an example, the second generation Grubbs olefin metathesis catalyst has been given the name Grubbs 2 catalyst. In the current SOR Database, there are more than 500 reactions where it has been used. Some of these are straightforward; some are not and generate novel ring systems like this one from the Martin group at North Carolina at Chapel Hill:</p>
<p><a href="http://www.chemconnector.com/wp-content/uploads/2012/02/image3.png"><img class="wp-image-714 aligncenter" title="image3" src="http://www.chemconnector.com/wp-content/uploads/2012/02/image3.png" alt="" width="539" height="156" /></a>Searching the Reaction Scheme, or using Reagent/Solvent names with hit refinement, brings you to new chemistry which until now was only found on a dusty shelf in a library. The “Lost Chemistry” is now getting smaller as SORD carefully selects and excerpts the reactions which deserve a new life. The SOR Database is essential for novelty searches and it is a powerful supplement to the other commercial reaction databases.</p>
<p>Finally, some more good news for academic research chemists: your data will be readily accessible to the whole chemical world, who will cite your work in their publications. The chemistry which you never published may be just what others are looking for. SORD routinely excerpts the complete collection of theses and dissertations from research supervisors; they will be more than happy to see your work appear in the next SOR Database!</p>
<hr size="1" />
<div>
<p><a href="#_ftnref1">[1]</a> de Laet, A.; Hehenkamp, J. J.; Wife, R. L. Finding Drug Candidates in Lost/Emerging Chemistry. <em>J. Heterocycl. Chem.</em> 2000, 37, 669–674.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/3537.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Changing Face of ChemSpider</title>
		<link>http://www.chemspider.com/blog/the-changing-face-of-chemspider.html</link>
		<comments>http://www.chemspider.com/blog/the-changing-face-of-chemspider.html#comments</comments>
		<pubDate>Fri, 01 Jul 2011 16:19:11 +0000</pubDate>
		<dc:creator>David</dc:creator>
				<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Vision]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=2845</guid>
		<description><![CDATA[I&#8217;m sure that by now everyone has noticed that the ChemSpider homepage design changed just over a month ago. A few features moved around, the Molecules of Interest section was retired and perhaps most significantly the Search box was given a dose of CSID: 5791, becoming bigger and more prominent. The reason for this wasn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m sure that by now everyone has noticed that the ChemSpider homepage design changed just over a month ago. A few features moved around, the Molecules of Interest section was retired and perhaps most significantly the Search box was given a dose of <a title="Testosterone link" href="http://www.chemspider.com/Chemical-Structure.5791.html">CSID: 5791</a>, becoming bigger and more prominent.</p>
<p>The reason for this wasn&#8217;t just to make the site more attractive (though I think it does look &#8216;prettier&#8217;). Our motivation for the change is to deliver a site that is easier for users to interact with and understand and, by doing so, to make it quicker and simpler for you to get your tasks done using ChemSpider. The refresh of the homepage illustrates this: since most users come to ChemSpider to search for information, it should be easy to get straight into a search, hence the greater emphasis on this feature.</p>
<p>In the next few days we will release another upgrade to the interface which is centered on making it easier to understand the data presented in the compound Record View pages. I&#8217;ll post a blog entry dealing with some of the key features in the next few days.</p>
<p>The development of ChemSpider is an ongoing process, and we are aware that even after this upgrade there will be aspects of the compound Record View pages that will need more work (and also other parts of the site that still need development). It&#8217;s not going to be easy: ChemSpider brings together a rich and varied set of data from a large number of sources &#8211; this poses many challenges. We also realise that there are many different tasks that each of you &#8211; as users &#8211; want to perform, and it is always going to be difficult to reconcile all of the different opinions/needs.</p>
<p>However, we are trying to make the site better for <em>you</em>. And therefore, we&#8217;d really like to know <em>your</em> opinions on the changes (please test new features for a few days first). We welcome your feedback on the redesign either in the form of blog comments or email feedback (chemspider<span class="at">-at-</span>rsc.org).</p>
<p>Over the next week &#8211; keep your eyes peeled for the upgrade and my accompanying blog post which will endeavor to give you a good introduction to the new features.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/the-changing-face-of-chemspider.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Access to Infotherm Now Free for ChemSpider Users</title>
		<link>http://www.chemspider.com/blog/access-to-infotherm-now-free-for-chemspider-users.html</link>
		<comments>http://www.chemspider.com/blog/access-to-infotherm-now-free-for-chemspider-users.html#comments</comments>
		<pubDate>Mon, 30 Aug 2010 14:56:46 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[ChemSpider Chemistry]]></category>
		<category><![CDATA[Community Building]]></category>
		<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[Chemical Data Tables]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[Infotherm]]></category>
		<category><![CDATA[Thermophysical data]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1991</guid>
		<description><![CDATA[Earlier this month I reported on the integration of Infotherm to ChemSpider but at that time it would have been necessary for non-RSC members to pay for the data on Infotherm despite the fact that a search would have provided the links and you could have clicked through to the Infotherm data pages. Some good news [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this month I reported on the integration of <a href="http://www.chemspider.com/blog/infotherm-property-data-now-linked-in-chemspider.html">Infotherm to ChemSpider</a> but at that time it would have been necessary for non-RSC members to pay for the data on Infotherm despite the fact that a search would have provided the links and you could have clicked through to the Infotherm data pages. Some good news from Fiz-Chemie though&#8230;they are waiving the fee for data on pure compounds accessed from ChemSpider and as a result giving access to over 200,000 tables of data. This is a great contribution to the community of ChemSpider users. Thanks Fiz-Chemie!</p>
<p><img class="aligncenter size-full wp-image-1992" title="infotherm" src="http://www.chemspider.com/blog/wp-content/uploads/2010/08/infotherm.png" alt="infotherm" width="536" height="424" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/access-to-infotherm-now-free-for-chemspider-users.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Presentation at the BAGIM Meeting in Boston</title>
		<link>http://www.chemspider.com/blog/presentation-at-the-bagim-meeting-in-boston.html</link>
		<comments>http://www.chemspider.com/blog/presentation-at-the-bagim-meeting-in-boston.html#comments</comments>
		<pubDate>Thu, 26 Aug 2010 12:48:05 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[ChemSpider Chemistry]]></category>
		<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[RSC Publishing]]></category>
		<category><![CDATA[Vision]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[Drugbank]]></category>
		<category><![CDATA[KEGG]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[PubChem]]></category>
		<category><![CDATA[Public Chemistry]]></category>
		<category><![CDATA[Public Chemistry Data]]></category>
		<category><![CDATA[Public Domain data]]></category>
		<category><![CDATA[Semantic Web for Chemistry]]></category>
		<category><![CDATA[Wikipedia]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1988</guid>
		<description><![CDATA[Last night I gave a presentation at the BAGIM meeting in Boston. The abstract is below together with the embedded presentation from Slideshare ChemSpider &#8211; Is This The Future of Linked Chemistry on the Internet? ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a [...]]]></description>
			<content:encoded><![CDATA[<p>Last night I gave a presentation at the <a href="http://www.bagim.org/next_meeting.html">BAGIM meeting in Boston</a>. The abstract is below together with the embedded presentation from Slideshare</p>
<p><strong>ChemSpider &#8211; Is This The Future of Linked Chemistry on the Internet?</strong><br />
ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are now hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc., and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of almost 25 million chemical substances, grows daily, and is integrated with over 400 sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for a linked web for chemistry and to provide access to a set of online tools and services to support access to these data.</p>
<div style="width:425px" id="__ss_5057586"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/AntonyWilliams/chemspider-is-this-the-future-of-linked-chemistry-on-the-internet" title="ChemSpider – Is This The Future of Linked Chemistry on the Internet?">ChemSpider – Is This The Future of Linked Chemistry on the Internet?</a></strong><object id="__sse5057586" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=bagimpresentation081910forslideshare-100825222052-phpapp01&#038;stripped_title=chemspider-is-this-the-future-of-linked-chemistry-on-the-internet" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed name="__sse5057586" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=bagimpresentation081910forslideshare-100825222052-phpapp01&#038;stripped_title=chemspider-is-this-the-future-of-linked-chemistry-on-the-internet" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="padding:5px 0 12px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/AntonyWilliams">Antony Williams, ChemConnector</a>.</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/presentation-at-the-bagim-meeting-in-boston.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Typical Month of Depositions on ChemSpider</title>
		<link>http://www.chemspider.com/blog/a-typical-month-of-depositions-on-chemspider.html</link>
		<comments>http://www.chemspider.com/blog/a-typical-month-of-depositions-on-chemspider.html#comments</comments>
		<pubDate>Sun, 22 Aug 2010 15:49:23 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[ChemSpider Chemistry]]></category>
		<category><![CDATA[Community Building]]></category>
		<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1937</guid>
		<description><![CDATA[We deposit a lot of data onto ChemSpider in a month and the database is growing daily. As an example of the ongoing depositions take a look at what has been deposited in a one-month timeframe from July-August. This is simply what has been published by me&#8230;not all depositions. It&#8217;s a pretty good indicator [...]]]></description>
			<content:encoded><![CDATA[<p>We deposit a lot of data onto ChemSpider in a month and the database is growing daily. As an example of the ongoing depositions take a look at what has been deposited in a one-month timeframe from July-August. This is simply what has been published by me&#8230;not all depositions. It&#8217;s a pretty good indicator of ongoing efforts to enhance the quantity of content on the site.</p>
<p><img class="aligncenter size-full wp-image-1938" title="published_in_a_month" src="http://www.chemspider.com/blog/wp-content/uploads/2010/08/published_in_a_month.png" alt="published_in_a_month" width="648" height="619" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/a-typical-month-of-depositions-on-chemspider.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scraping high quality sets of molecules from PubChem</title>
		<link>http://www.chemspider.com/blog/scraping-high-quality-sets-of-moleculesfrom-pubchem.html</link>
		<comments>http://www.chemspider.com/blog/scraping-high-quality-sets-of-moleculesfrom-pubchem.html#comments</comments>
		<pubDate>Wed, 18 Aug 2010 09:37:03 +0000</pubDate>
		<dc:creator>Aileen Day</dc:creator>
				<category><![CDATA[Quality and Content]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1898</guid>
		<description><![CDATA[PubChem is a very large source of compound structures and data, but the quality and reliability of these can be variable. However, within it, some sets of compounds and substances could be trusted more than most because they’ve been deposited by reliable data sources &#8211; for example those deposited by the Nature Publishing Group that [...]]]></description>
			<content:encoded><![CDATA[<p>PubChem is a very large source of compound structures and data, but the quality and reliability of these can be variable. However, within it, some sets of compounds and substances could be trusted more than most because they’ve been deposited by reliable data sources &#8211; for example those deposited by the Nature Publishing Group that correspond to compounds in Nature Chemistry, Nature Communications and Nature Chemical Biology articles.</p>
<p>We have developed an automated method to search PubChem for substances deposited by the Nature Publishing Group, to extract their structures and properties in SDF format and then import them into ChemSpider. The result is a <a href="http://www.chemspider.com/Search.aspx?dsn=Nature+Publishing+Group">newly imported set of 5525 molecules</a> in ChemSpider. These compounds have been deposited in PubChem since 2005 and originate from over 400 articles. All imported compounds link back to the original article &#8211; see below.</p>
<p><img class="size-full wp-image-1902" style="margin-left: 10px; margin-right: 200px;" title="Example Pubchem Nature compound" src="http://www.chemspider.com/blog/wp-content/uploads/2010/08/NatureExample1.jpg" alt="Example compound from PubChem" width="539" height="562" align="center" /></p>
<p>The process is automated and can be scheduled to scrape PubChem for newly deposited compounds, and stream these into ChemSpider so this subset will be updated regularly.</p>
<p>This initial prototype could pave the way for other high quality, consistently formatted subsets of PubChem to be identified and deposited into ChemSpider in a similar way. To suggest other possible subsets of PubChem which could be used by ChemSpider, join the discussion on the <a href="http://forum.chemspider.com/Default.aspx?g=posts&amp;m=348">ChemSpider forum</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/scraping-high-quality-sets-of-moleculesfrom-pubchem.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Integrated RSC Publishing New ChemSpider Functionality at ACS Spring 2010 Part 5</title>
		<link>http://www.chemspider.com/blog/integrated-rsc-publishing-new-chemspider-functionality-at-acs-spring-2010-part-5.html</link>
		<comments>http://www.chemspider.com/blog/integrated-rsc-publishing-new-chemspider-functionality-at-acs-spring-2010-part-5.html#comments</comments>
		<pubDate>Wed, 17 Mar 2010 03:10:30 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[RSC Publishing]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[Google Books]]></category>
		<category><![CDATA[Google Scholar]]></category>
		<category><![CDATA[Microsoft Academic Search]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1665</guid>
		<description><![CDATA[The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st 2010 Following on from the last post regarding integrating to RSC Databases via the RSC Publishing Beta web services layer this post expands on the nature of the integration that we have been able to introduce. The [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The functionality discussed below will be released at the ACS   Spring Meeting during the week of March 21st 2010</strong></p>
<p>Following on from the <a href="http://www.chemspider.com/blog/integrated-rsc-databases-new-chemspider-functionality-at-acs-spring-2010-part-4.html">last post regarding integrating to RSC Databases</a> via the RSC Publishing Beta web services layer, this post expands on the nature of the integration that we have been able to introduce. The RSC Publishing Beta gives us access to over 500,000 journal articles, book chapters and database records through one simple search interface. Using a similar approach to that outlined for the RSC database searches, that of using validated synonyms as the basis of the search for chemicals, we are able to search across the entire ePlatform of articles and retrieve hits as shown below. The hits are under the RSC journals tab.</p>
<p>Since the RSC publishing platform segregates the journals from the books the same search will return results from RSC books also. Our tests show that this is incredibly fast and highly accurate. This is our first venture into tapping into the chemical compounds sitting inside the RSC archive. More work is coming&#8230;</p>
<p>If you look at the tabs below you will also see that we have integrated to Google Books, Google Scholar and the Microsoft Academic Search. We are truly integrating to available internet resources to bring together the benefits of all of the primary search engines available.</p>
<p><img class="aligncenter size-full wp-image-1667" title="eplatform" src="http://www.chemspider.com/blog/wp-content/uploads/2010/03/eplatform.png" alt="eplatform" width="644" height="445" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/integrated-rsc-publishing-new-chemspider-functionality-at-acs-spring-2010-part-5.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Integrated RSC Databases New ChemSpider Functionality at ACS Spring 2010 Part 4</title>
		<link>http://www.chemspider.com/blog/integrated-rsc-databases-new-chemspider-functionality-at-acs-spring-2010-part-4.html</link>
		<comments>http://www.chemspider.com/blog/integrated-rsc-databases-new-chemspider-functionality-at-acs-spring-2010-part-4.html#comments</comments>
		<pubDate>Tue, 16 Mar 2010 19:07:45 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[ChemSpider Syntheses]]></category>
		<category><![CDATA[ChemSpider SyntheticPages]]></category>
		<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[RSC Publishing]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[Lab Hazards Bulletin]]></category>
		<category><![CDATA[Mass Spectrometry Bulletin]]></category>
		<category><![CDATA[Methods in Organic Synthesis]]></category>
		<category><![CDATA[Natural Products Update]]></category>
		<category><![CDATA[Royal Society of Chemistry]]></category>
		<category><![CDATA[RSC]]></category>
		<category><![CDATA[RSC databases]]></category>
		<category><![CDATA[RSC Publishing Beta]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1660</guid>
		<description><![CDATA[The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st 2010 The Royal Society of Chemistry has a whole series of databases. None of them have been structure searchable&#8230;until now. As with our PubMed integration and our Google Patents integration rolling out shortly, just because a database [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The functionality discussed below will be released at the ACS  Spring Meeting during the week of March 21st 2010</strong></p>
<p>The Royal Society of Chemistry has a whole series of databases. None of them have been structure searchable&#8230;until now. As with our <a href="http://www.chemspider.com/blog/new-functionality-in-the-world-of-chemspider-part-2-improved-pubmed-integration.html">PubMed integration</a> and our <a href="http://www.chemspider.com/blog/google-patents-new-chemspider-functionality-at-acs-spring-2010-part-3.html">Google Patents integration</a> rolling out shortly, just because a database hasn&#8217;t had the chemical structures extracted and indexed doesn&#8217;t mean that those resources cannot be made &#8220;structure searchable&#8221;. It&#8217;s not a subtle distinction, however, as discussed in the <a href="http://www.chemspider.com/blog/google-patents-new-chemspider-functionality-at-acs-spring-2010-part-3.html">Google Patents blog post</a>. These types of integrations depend on three things: the correct association between chemical names and structures; access to an API allowing facile and flexible searching; and, something that is purely serendipitous in nature, the <strong>absence</strong> of overlaps between chemical names and common language.</p>
<p>We have used the recently announced <a href="http://pubs.rsc.org/blog/prospect/post/2010/03/09/RSC-Publishing-beta-platform-is-now-live.aspx">RSC Publishing beta platform</a> and the API made available to us to enable the searching. As my colleague Graham McCann announced recently &#8220;(the) platform gives  access to over 500,000 journal articles, book chapters and database  records through one simple search interface. The new platform delivers  faster browsing, intelligent searching and more intuitive navigation and  is open for beta testing now.&#8221;</p>
<p>Our approach has been to search the title and the abstract for each of the databases for all of the validated identifiers. It works. It is FAST and it provides &#8220;structure-related&#8221; access to all six RSC databases. An example screen shot is below where a search on chlorobenzene retrieves data from each of the following databases: <a href="http://www.rsc.org/Publishing/CurrentAwareness/msb/index.asp">Mass Spectrometry Bulletin</a>, <a href="http://www.rsc.org/publishing/currentawareness/lhb/index.asp">Laboratory Hazards Bulletin</a>, <a href="http://www.rsc.org/publishing/currentawareness/mos/index.asp">Methods in Organic Synthesis</a>, <a href="http://www.rsc.org/publishing/currentawareness/CCR/">Catalysts and Catalysed Reactions</a>, <a href="http://www.rsc.org/publishing/currentawareness/npu/">Natural Product Updates</a> and <a href="http://www.rsc.org/publishing/currentawareness/aa/index.asp">Analytical Abstracts</a>. The screen shot below shows the Analytical Abstracts records linked by the term chlorobenzene in the title or abstract itself. 284 hits&#8230;in a fraction of a second. The abstract is linked out to the original article via DOI, where possible.</p>
<p><img class="aligncenter size-full wp-image-1662" title="databases" src="http://www.chemspider.com/blog/wp-content/uploads/2010/03/databases.png" alt="databases" width="680" height="334" /></p>
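<p>For readers who want to picture the name-based matching described above, here is a minimal sketch in Python. It is not the ChemSpider or RSC platform code: the function names, the record layout (dictionaries with <code>title</code> and <code>abstract</code> keys) and the whole-word matching rule are all assumptions made for illustration.</p>

```python
import re


def build_pattern(identifiers):
    """Compile one case-insensitive, whole-word pattern from a list of names."""
    # Longest names first, so a longer synonym is tried before a shorter one.
    names = sorted(set(identifiers), key=len, reverse=True)
    if not names:
        return None
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def search_records(identifiers, records):
    """Return the records whose title or abstract mentions any identifier."""
    pattern = build_pattern(identifiers)
    if pattern is None:
        return []
    hits = []
    for record in records:
        # Hypothetical record layout: only the title and abstract are searched.
        text = record["title"] + " " + record["abstract"]
        if pattern.search(text):
            hits.append(record)
    return hits


# Made-up records, for illustration only.
sample_records = [
    {"id": 1, "title": "Determination of chlorobenzene in water", "abstract": ""},
    {"id": 2, "title": "Sorption of dichlorobenzene", "abstract": "Soil study."},
    {"id": 3, "title": "Solvent effects", "abstract": "Phenyl chloride was used."},
]
matched = search_records(["chlorobenzene", "phenyl chloride"], sample_records)
matched_ids = [record["id"] for record in matched]
```

<p>The whole-word rule keeps a search on chlorobenzene from matching dichlorobenzene, but it cannot help when a chemical name is also an everyday word, which is the serendipity issue mentioned earlier in this post.</p>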
<p>My personal favorites in the set of databases are the <a href="http://www.rsc.org/publishing/currentawareness/npu/">Natural Product Updates</a> (NPU) and the <a href="http://www.rsc.org/publishing/currentawareness/mos/index.asp">Methods in Organic Synthesis</a> (MOS) databases. The NPU database contains tens of thousands of natural product chemical structures, together with chemical names, references and some physical properties. Rich resources for ChemSpider. MOS includes reaction schemes, titles and bibliographic details. Rich resources to connect to ChemSpider SyntheticPages in the future.</p>
<p>We have only just started to tap into the riches contained within the RSC archive. It&#8217;s like stumbling across a roomful of rubies to pick up diamonds. There is content all around us waiting for us to connect. We will connect this up to ChemSpider and make it available. Access to the databases will be shown at the ACS Meeting in San Francisco.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/integrated-rsc-databases-new-chemspider-functionality-at-acs-spring-2010-part-4.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>New ChemSpider Functionality at ACS Spring 2010 Part 2 NMR Prediction</title>
		<link>http://www.chemspider.com/blog/new-chemspider-functionality-at-acs-spring-2010-part-2-nmr-prediction.html</link>
		<comments>http://www.chemspider.com/blog/new-chemspider-functionality-at-acs-spring-2010-part-2-nmr-prediction.html#comments</comments>
		<pubDate>Tue, 16 Mar 2010 16:41:03 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[Community Building]]></category>
		<category><![CDATA[How ChemSpider Runs]]></category>
		<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[NMR databases]]></category>
		<category><![CDATA[NMR Prediction]]></category>
		<category><![CDATA[NMRShiftDB]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1641</guid>
		<description><![CDATA[The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st 2010 We had previously released NMR prediction on ChemSpider as announced here. Based on community feedback we later removed that connection and had never reconnected, despite reported improvements. I am an NMR spectroscopist by training &#8230;if you [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The functionality discussed below will be released at the ACS Spring Meeting during the week of March 21st 2010</strong></p>
<p>We had previously released NMR prediction on ChemSpider as announced <a href="http://www.chemspider.com/blog/nmr-prediction-now-available-via-chemspider.html">here</a>. Based on community <a href="http://www.chemspider.com/blog/feedback-on-nmr-prediction-on-chemspider.html">feedback</a> we later <a href="http://www.chemspider.com/blog/removal-of-nmr-prediction-from-chemspider.html">removed that connection</a> and have not reconnected since, despite reported <a href="http://www.chemspider.com/blog/nmrdborg-nmr-predictor-already-improved.html">improvements</a>. I am an NMR spectroscopist by training&#8230;if you check out my <a href="http://www.mendeley.com/profiles/antony-williams/">Mendeley profile</a> you&#8217;ll see that the majority of my papers are NMR-based. Because I am an NMR jock and, despite working in cheminformatics, still keep my hands in NMR research (NMR prediction and computer assisted structure elucidation), I really wanted to make sure that we deliver NMR prediction via ChemSpider. I was involved with the development of the <a href="http://www.acdlabs.com/products/adh/nmr/nmr_pred/">ACD/Labs NMR prediction</a> tools for H1, C13, N15, F19 and P31 nuclei. There are a number of other NMR prediction modules on the market including those of Bio-Rad (in the <a href="http://www.knowitall.com/academic/welcome.asp">Know-It-All package</a>), <a href="http://www.modgraph.co.uk/product_nmr.htm">Modgraph</a> and certainly the work of <a href="http://homepage.univie.ac.at/Wolfgang.Robien/index.html">Wolfgang Robien</a>, one of the founding fathers of NMR prediction. These are primarily commercial packages.</p>
<p>In the background we have been working on the introduction of NMR prediction to ChemSpider in time for the ACS. We were looking for a platform that we could integrate that involved community deposition of data to ensure there was a growing database to enhance the prediction algorithms. We also wanted to know that the underlying data quality was good. We wanted to integrate with an Open system that had support from both an active community of participants and at least one developer who could provide support if we needed it. All of these criteria point to only one resource, <a href="http://www.ebi.ac.uk/nmrshiftdb/">NMRShiftDB</a>. There have been some heated discussions, <a href="http://www.chemspider.com/blog/further-comments-on-the-quality-of-nmrshiftdb-and-nmr-prediction-algorithm-validation.html">including on this blog, regarding data quality</a>, especially in NMRShiftDB. However, I <a href="http://pubs.acs.org/doi/abs/10.1021/ci700363r">co-authored a paper</a> with Chris Steinbeck and colleagues from ACD/Labs validating the dataset as well as ACD/Labs&#8217; NMR prediction approaches.</p>
<p>NMRShiftDB is a high-quality data set and certainly contains enough data to provide a training set for NMR prediction algorithms. The NMR predictions provided by NMRShiftDB are used by many people and overall feedback seems to be very positive. Based on our previous knowledge of the data in NMRShiftDB, and the availability of a well-defined programming interface to connect ChemSpider, we have worked with Stefan Kuhn at the EBI to produce a first-level integration.</p>
<p>As a result, at the ACS meeting in San Francisco next week we will roll out NMR prediction integration. In keeping with the new layout model we have adopted for ChemSpider using <a href="http://www.chemspider.com/blog/new-chemspider-functionality-at-acs-spring-2010-part-1-tabbed-infoboxes.html">tabbed approaches</a> for display of data, we have bundled together all predictions. The first tab, ACD/Labs, provides access to ACD/Labs PhysChem properties, the EPI Summary tab provides access to the EPISuite and the NMRShiftDB tab provides access to the predicted NMR spectra. The left spectrum shows the Proton NMR spectrum and the right spectrum shows the C13 NMR spectrum.</p>
<p><img class="aligncenter size-full wp-image-1642" title="NMRshiftDB" src="http://www.chemspider.com/blog/wp-content/uploads/2010/03/NMRshiftDB.png" alt="NMRshiftDB" width="471" height="251" /></p>
<p>When the system is <strong>fully</strong> integrated the process will work as follows. Since NMRShiftDB already contains many thousands of assigned spectra we will retrieve the experimentally assigned spectra directly and display them. When we cannot retrieve the experimental spectra we will predict the NMR spectra and display those instead.</p>
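<p>The retrieve-first, predict-second flow described above could be sketched along these lines. This is only an illustration in Python, not the actual integration: <code>fetch_experimental</code> and <code>predict</code> are hypothetical callables standing in for whatever the NMRShiftDB interface provides, and the identifiers and shift values are invented.</p>

```python
def get_spectrum(structure_id, fetch_experimental, predict):
    """Prefer an experimentally assigned spectrum; fall back to a prediction."""
    # fetch_experimental is assumed to return None when no assigned spectrum exists.
    shifts = fetch_experimental(structure_id)
    if shifts is not None:
        return {"source": "experimental", "shifts": shifts}
    # No assigned spectrum was found, so ask the predictor instead.
    return {"source": "predicted", "shifts": predict(structure_id)}


# Stand-ins for the real services; all values here are made up.
assigned_spectra = {"structure-A": [128.4, 134.2]}


def fake_fetch(structure_id):
    return assigned_spectra.get(structure_id)


def fake_predict(structure_id):
    return [130.0]


experimental_result = get_spectrum("structure-A", fake_fetch, fake_predict)
predicted_result = get_spectrum("structure-B", fake_fetch, fake_predict)
```

<p>Labelling each result with its source matters for display, because users should be able to tell an assigned experimental spectrum from a predicted one.</p>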
<p>In the future we <em>might</em> pre-predict and store the NMR spectra for all structures on the NMR database. I am a little leery of doing this at present as we need to gather some basic feedback from the ChemSpider users regarding the performance of the NMR prediction algorithms and our existing implementation. Predicting NMR spectra across a database of this size requires a lot of consideration of domain applicability, i.e., what subset of structures should be excluded from having NMR predictions performed? For example, organometallic complexes, free radicals etc. CAS likely had to take this type of issue into account when they <a href="http://www.cas.org/newsevents/releases/protonnmr122908.html">applied NMR predictions to their CAS registry</a>.</p>
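<p>To make the domain applicability question concrete, here is one way such a pre-filter might look in Python. Everything in it is an assumption for illustration: the record layout (<code>elements</code>, <code>radical_electrons</code>), the deliberately incomplete list of metals, and the rule itself. A real filter would need input from NMR specialists.</p>

```python
# Illustrative and incomplete: a real list would cover every metal of interest.
METALS = frozenset({"Li", "Na", "K", "Mg", "Ca", "Fe", "Co", "Ni",
                    "Cu", "Zn", "Pd", "Pt", "Ru", "Rh"})


def in_prediction_domain(structure):
    """Return True if a structure looks suitable for batch NMR prediction."""
    # Free radicals are excluded outright.
    if structure["radical_electrons"] != 0:
        return False
    # Anything containing a listed metal is treated as a possible
    # organometallic complex and skipped.
    if METALS.intersection(structure["elements"]):
        return False
    return True


# Hypothetical records, for illustration only.
candidates = [
    {"name": "chlorobenzene", "elements": {"C", "H", "Cl"}, "radical_electrons": 0},
    {"name": "ferrocene", "elements": {"C", "H", "Fe"}, "radical_electrons": 0},
    {"name": "methyl radical", "elements": {"C", "H"}, "radical_electrons": 1},
]
to_predict = [s["name"] for s in candidates if in_prediction_domain(s)]
```

<p>A filter this crude would also throw out simple salts such as sodium acetate, which is exactly why the choice of excluded subset needs careful thought.</p>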
<p>If there are other NMR prediction algorithms or databases that you would be interested in integrating into ChemSpider please contact me. If you are a cheminformatics vendor selling NMR predictions/databases we would be VERY interested in receiving JUST the structures from your NMR databases. We will deposit them and link directly to your product page as an indicator that you have NMR data available.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/new-chemspider-functionality-at-acs-spring-2010-part-2-nmr-prediction.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How Much New Chemistry Is There In New RSC Articles</title>
		<link>http://www.chemspider.com/blog/how-much-new-chemistry-is-there-in-new-rsc-articles.html</link>
		<comments>http://www.chemspider.com/blog/how-much-new-chemistry-is-there-in-new-rsc-articles.html#comments</comments>
		<pubDate>Thu, 11 Mar 2010 19:36:55 +0000</pubDate>
		<dc:creator>Antony Williams</dc:creator>
				<category><![CDATA[Quality and Content]]></category>
		<category><![CDATA[RSC Publishing]]></category>
		<category><![CDATA[ChemSpider]]></category>
		<category><![CDATA[Project Prospect]]></category>
		<category><![CDATA[RSC]]></category>

		<guid isPermaLink="false">http://www.chemspider.com/blog/?p=1635</guid>
		<description><![CDATA[From the early days of the acquisition of ChemSpider by the RSC we have been focused on accessing the rich content that the RSC has contained in its databases and in its rich archive. We have been working hard for a number of months now to integrate systems, projects and processes into ChemSpider so that [...]]]></description>
			<content:encoded><![CDATA[<p>From the early days of the acquisition of ChemSpider by the RSC we have been focused on accessing the rich content that the RSC has contained in its databases and in its rich archive. We have been working hard for a number of months now to integrate systems, projects and processes into ChemSpider so that RSC chemistry is more discoverable. We believe that what we will be unveiling in the next few days is big. We&#8217;ll roll it out one piece at a time. The <a href="http://www.chemspider.com/blog/first-deposited-compounds-from-rsc-prospect-in-chemspider.html">last blog post</a> discussed the deposition of new compounds from RSC prospected articles into ChemSpider. The email below results from the deposition of compounds from one article. One set of 10 structures from one article that are directly deposited into ChemSpider when the article goes live. These are compounds that are deposited and live immediately, not abstracted later. Imagine when we are doing this for all RSC articles, databases and books&#8230;</p>
<p>ALL of the compounds below are NEW to the ChemSpider database&#8230;every one of them. While not all RSC articles are only about novel compounds, clearly there are new compounds moving into the database from the RSC publications.</p>
<p>Dear RSC Prospect,</p>
<p>This email is to notify that your deposition (#3427) has been published. Below please find a list of links to the structures that belong to your deposition:</p>
<p><a href="../../Chemical-Structure.23558982.html">http://www.chemspider.com/Chemical-Structure.23558982.html</a></p>
<p><a href="../../Chemical-Structure.23558983.html">http://www.chemspider.com/Chemical-Structure.23558983.html</a></p>
<p><a href="../../Chemical-Structure.23558984.html">http://www.chemspider.com/Chemical-Structure.23558984.html</a></p>
<p><a href="../../Chemical-Structure.23558985.html">http://www.chemspider.com/Chemical-Structure.23558985.html</a></p>
<p><a href="../../Chemical-Structure.23558986.html">http://www.chemspider.com/Chemical-Structure.23558986.html</a></p>
<p><a href="../../Chemical-Structure.23558987.html">http://www.chemspider.com/Chemical-Structure.23558987.html</a></p>
<p><a href="../../Chemical-Structure.23558988.html">http://www.chemspider.com/Chemical-Structure.23558988.html</a></p>
<p><a href="../../Chemical-Structure.23558989.html">http://www.chemspider.com/Chemical-Structure.23558989.html</a></p>
<p><a href="../../Chemical-Structure.23558990.html">http://www.chemspider.com/Chemical-Structure.23558990.html</a></p>
<p><a href="../../Chemical-Structure.23558991.html">http://www.chemspider.com/Chemical-Structure.23558991.html</a></p>
<p>Cheers,</p>
<p>ChemSpider</p>
<p>The structures link back directly to the RSC article via DOI as shown below.</p>
<p><img class="aligncenter size-full wp-image-1636" title="Prospectedarticle" src="http://www.chemspider.com/blog/wp-content/uploads/2010/03/Prospectedarticle.png" alt="Prospectedarticle" width="536" height="146" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.chemspider.com/blog/how-much-new-chemistry-is-there-in-new-rsc-articles.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

