ACS Meeting Philadelphia, August 20th 2008

Can a Free Access Structure-Centric Community for Chemists Benefit Drug Discovery?

ChemSpider is an online database of over 20 million chemical structures assembled from well over a hundred data sources including chemical and screening library vendors, publicly accessible databases and resources, commercial databases and Open Access literature articles. Such a public resource provides a rich source of ligands for the purpose of virtual screening experiments. These can take many forms. This work will present results from two specific types of studies: 1) Quantitative Structure Activity Relationship (QSAR) based analyses and 2) In-silico docking into protein receptor sites. We will review results from the application of both approaches to a number of specific examples. QSAR analyses utilizing the ChemModLab environment for assessing quantitative structure-activity relationships will be presented, along with screening using a molecular surface descriptor model.

Link to Presentation

ACS Meeting Philadelphia, August 19th 2008

Using Text-Mining and Crowdsourced Curation to Build a Structure Centric Community for Chemists

ChemSpider is a free access online structure-based community for chemists to research data and information. The database of over 20 million chemical structures and associated data has been derived from depositions by well over a hundred contributing data sources including chemical vendors, commercial database providers, web-based scraping of data and individual scientists looking to share their information with the community. Text-mining and conversion of chemical names and identifiers to chemical structures has made an enormous contribution to the availability of diverse data on ChemSpider and includes contributions from patents, open access articles and various online resources. This presentation will provide an overview of the present state of development of this important public resource and review the processes and procedures for the harvesting, deposition and curation of large datasets derived via text-mining and conversion.

Link to Presentation

Whitney Symposium, General Electric Research Center, Albany New York, June 17th 2008

Crowd Sourcing to Build a Structure Crowd-Centric Community for Chemists

Link to Presentation

American Chemical Society Meeting, April 2008, New Orleans

ChemSpider – Building a Structure Centric Community for Chemists

Scientists commonly find themselves overwhelmed by the amount of information accessible to them. The distribution of resources now includes the entire space of the worldwide web, access to primary databases such as CAS and, commonly, a plethora of internally developed systems. While the web has provided improved access to chemistry-related information, there has not been an online central resource allowing integrated chemical structure-searching of chemistry databases, chemistry articles, patents and web pages such as blogs and wikis. ChemSpider has built a structure centric community for chemists by providing free access to an online database and collaboration tool. The online database offers an environment for curating the data on ChemSpider as well as for depositing chemical structures, analytical data and associated information, and provides a significant knowledge base and resource for chemists working in different domains. An overview of present and future capabilities will be given.

Link to Presentation

Society of Biomolecular Screening, April 2008, New Orleans

How a Structure-Centric Community for Chemists Can Benefit Drug Discovery – Virtual Screening Experiments Utilizing a Publicly Accessible Ligand Database, QSAR Modeling Tools and a Virtual Docking Software Package

ChemSpider is an online database of over 20 million chemical structures assembled from almost a hundred data sources including chemical and screening library vendors, publicly accessible databases and resources, commercial databases and Open Access literature articles. Such a public resource provides a rich source of ligands for the purpose of virtual screening experiments. These can take many forms. This work will present results from two specific types of studies: 1) Quantitative Structure Activity Relationship (QSAR) based analyses and 2) In-silico docking into protein receptor sites. We will review results from the application of both approaches to a number of specific examples using the software outlined below.

The QSAR analyses utilize the ChemModLab environment, which is a free, web-based toolbox for fitting and assessing quantitative structure-activity relationships. Its elements include a cheminformatics front end to supply molecular descriptors, a set of statistical methods for fitting models, and methods for validating the resulting model. Five molecular descriptor sets are each combined with 16 mathematical modeling methods to give a total of 80 QSAR models. The input is a file of compounds and a text file for biological activity.

The in-silico docking experiments are conducted with a combined QSAR/docking approach using the SimBioSys eHITS and Lasso software programs. The docking procedure allows for the screening of a complete molecular database to obtain the correct binding poses and estimated binding affinities. The ligand-based screening tool utilizes a novel conformation-independent 3D QSAR descriptor, ideally suited for scaffold hopping.

Link to Presentation

Current Opinion in Drug Discovery & Development 2008 11(3) – ARTICLE IN PRESS

Public chemical compound databases

The internet has rapidly become the first port of call for all information searches. The increasing array of chemistry-related resources that are now available provides chemists with a direct path to the information that was previously accessed via library services and was limited by commercial and costly resources. The diversity of the information that can be accessed online is expanding at a dramatic rate, and the support for publicly available resources offers significant opportunities in terms of the benefits to science and society. While the data online do not generally meet the quality standards of manually curated sources, there are efforts underway to gather scientists together and ‘crowdsource’ an improvement in the quality of the available data. This review discusses the types of public compound databases that are available online and provides a series of examples. Focus is also given to the benefits and disruptions associated with the increased availability of such data and the integration of technologies to data mine this information.

Drug Discovery Today, 2008

Internet-based tools for communication and collaboration in chemistry

Drug Discovery Today, Volume 13, Numbers 11/12, June 2008 502-506, doi:10.1016/j.drudis.2008.03.015

Web-based technologies, coupled with a drive for improved communication between scientists, have resulted in the proliferation of scientific opinion, data and knowledge at an ever-increasing rate. The availability of tools to host wikis and blogs has provided the necessary building blocks for scientists with only a rudimentary understanding of computer software science to communicate to the masses. This newfound freedom has the ability to speed up research and sharing of results, develop extensive collaborations, conduct science in public, and in near-real time. The technologies supporting chemistry, while immature, are fast developing to support chemical structures and reactions, analytical data support and integration to related data sources via supporting software technologies. Communication in chemistry is already witnessing a new revolution.

Drug Discovery Today, 2008

A perspective of publicly accessible/open-access chemistry databases

Drug Discovery Today, Volume 13, Numbers 11/12, June 2008, 495-501, doi:10.1016/j.drudis.2008.03.017

The Internet has spawned access to unprecedented levels of information. For chemists the increasing number of resources they can use to access chemistry-related information provides them a valuable path to discovery of information, one which was previously limited to commercial and therefore constrained resources. The diversity of information continues to expand at a dramatic rate and, coupled with an increasing awareness for quality, curation and improved tools for focused searches, chemists are now able to find valuable information within a few seconds using a few keystrokes. This shift to publicly available resources offers great promise to the benefits of science and society yet brings with it increasing concern from commercial entities. This article will discuss the benefits and disruptions associated with an increase in publicly available scientific resources.
