American Chemical Society Meeting, April 2008, New Orleans
ChemSpider – Building a Structure Centric Community for Chemists
Scientists are commonly overwhelmed by the volume of information accessible to them. The
distribution of resources now includes the entire World Wide Web, access to primary databases such as CAS and, commonly,
a plethora of internally developed systems. While the web has improved access to chemistry-related information, there has
not been an online central resource allowing integrated chemical structure-searching of chemistry databases, chemistry articles, patents
and web pages such as blogs and wikis. ChemSpider has built a structure-centric community for chemists by providing free access to
an online database and collaboration tool. The online database offers an environment for curating the data on ChemSpider
and for depositing chemical structures, analytical data and associated information, and it provides a significant knowledge
base and resource for chemists working in different domains. An overview of present and future capabilities will be given.
Link to Presentation
Society of Biomolecular Screening, April 2008, New Orleans
How a Structure-Centric Community for Chemists Can Benefit Drug Discovery - Virtual Screening Experiments Utilizing a Publicly Accessible
Ligand Database, QSAR Modeling Tools and a Virtual Docking Software Package
ChemSpider is an online database of over 20 million chemical structures assembled from almost a hundred data sources including chemical
and screening library vendors, publicly accessible databases and resources, commercial databases and Open Access literature articles.
Such a public resource provides a rich source of ligands for the purpose of virtual screening experiments. These can take many forms.
This work will present results from two specific types of studies: 1) quantitative structure-activity relationship (QSAR)-based analyses
and 2) in silico docking into protein receptor sites. We will review results from the application of both approaches to a number of
specific examples using the software outlined below.
The QSAR analyses utilize the ChemModLab environment, a free, web-based toolbox for fitting and assessing quantitative structure-activity
relationships. Its elements include a cheminformatics front end to supply molecular descriptors, a set of statistical methods for
fitting models, and methods for validating the resulting models. Five molecular descriptor sets are combined with 16 mathematical modeling methods
to give a total of 80 QSAR models. The input consists of a file of compounds and a text file of biological activity data.
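The count of 80 models follows from pairing every descriptor set with every modeling method (5 x 16). A minimal sketch of that enumeration is given here; the abstract names neither the descriptor sets nor the methods, so the labels below are hypothetical placeholders, and this is not ChemModLab's own code or interface.

```python
from itertools import product

# Hypothetical placeholder labels: the abstract states only the counts
# (5 descriptor sets, 16 modeling methods), not their actual names.
DESCRIPTOR_SETS = [f"descriptor_set_{i}" for i in range(1, 6)]
MODELING_METHODS = [f"method_{j}" for j in range(1, 17)]


def enumerate_models(descriptor_sets, methods):
    """Return every (descriptor set, method) pairing, one per QSAR model."""
    return list(product(descriptor_sets, methods))


models = enumerate_models(DESCRIPTOR_SETS, MODELING_METHODS)
print(len(models))  # 5 * 16 = 80
```

Each pairing would correspond to one fitted model to be validated and compared against the others; how the fitting and validation are actually carried out is not described in the abstract.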
The in silico docking experiments are conducted with a combined QSAR/docking approach using the SimBioSys eHITS and Lasso software
programs. The docking procedure allows for the screening of a complete molecular database to obtain the correct binding poses and
estimated binding affinities. The ligand-based screening tool utilizes a novel conformation-independent 3D QSAR descriptor, ideally
suited for scaffold hopping.
Link to Presentation
Current Opinion in Drug Discovery & Development 2008 11(3) - ARTICLE IN PRESS
Public chemical compound databases
The internet has rapidly become the first port of call for all information searches. The increasing array of chemistry-related resources
that are now available provides chemists with a direct path to the information that was previously accessed via library services and
was limited by commercial and costly resources. The diversity of the information that can be accessed online is expanding at a dramatic
rate, and the support for publicly available resources offers significant opportunities in terms of the benefits to science and society.
While the data online do not generally meet the quality standards of manually curated sources, there are efforts underway to gather
scientists together and ‘crowdsource’ an improvement in the quality of the available data. This review discusses the types of public
compound databases that are available online and provides a series of examples. Focus is also given to the benefits and disruptions
associated with the increased availability of such data and the integration of technologies to data mine this information.
Drug Discovery Today, 2008 - ARTICLE IN PRESS
Internet-based tools for communication and collaboration in chemistry
Web-based technologies, coupled with a drive for improved communication between scientists, have resulted in the proliferation of
scientific opinion, data and knowledge at an ever-increasing rate. The availability of tools to host wikis and blogs has provided
the necessary building blocks for scientists with only a rudimentary understanding of computer software science to communicate to
the masses. This newfound freedom can speed up research and the sharing of results, foster extensive collaborations, and allow
science to be conducted in public and in near-real time. The technologies supporting chemistry, while immature, are developing fast to handle chemical
structures and reactions, analytical data, and integration with related data sources via supporting software technologies. Communication
in chemistry is already witnessing a new revolution.
Drug Discovery Today, 2008 - ARTICLE IN PRESS
A perspective of publicly accessible/open-access chemistry databases
The Internet has spawned access to unprecedented levels of information. For chemists, the increasing number of resources they can use
to access chemistry-related information provides a valuable path to the discovery of information, one that was previously limited
to commercial and therefore constrained resources. The diversity of information continues to expand at a dramatic rate and, coupled
with an increasing awareness of quality, curation and improved tools for focused searches, means that chemists can now find valuable
information within a few seconds using a few keystrokes. This shift to publicly available resources offers great promise for the benefit
of science and society, yet brings with it increasing concern from commercial entities. This article will discuss the benefits and
disruptions associated with an increase in publicly available scientific resources.