ChemSpider has been working on polishing both single-structure and SDF file deposition. We are now using these tried and tested approaches to deposit large blocks of data, commonly many thousands of records. For depositions of hundreds of thousands of records, we break the deposition into smaller chunks of 5,000 to 10,000 records each.
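
As a rough illustration of the chunking step, here is a minimal Python sketch. It is not ChemSpider's actual deposition code: the function names (`iter_sdf_records`, `chunk_records`, `split_sdf_file`) and the output file naming are hypothetical. The only thing it relies on is the SDF convention that each record ends with a line containing `$$$$`.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def iter_sdf_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one SDF record at a time, including its '$$$$' terminator line."""
    buffer: List[str] = []
    for line in lines:
        # Normalise so every stored line ends with a newline.
        buffer.append(line if line.endswith("\n") else line + "\n")
        if line.strip() == "$$$$":
            yield "".join(buffer)
            buffer = []
    # Yield a trailing record that lacks a terminator, unless it is blank.
    if any(part.strip() for part in buffer):
        yield "".join(buffer)


def chunk_records(records: Iterable[str], chunk_size: int = 5000) -> Iterator[List[str]]:
    """Group records into lists of at most chunk_size records each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[str] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def split_sdf_file(path: str, chunk_size: int = 5000) -> List[Path]:
    """Write numbered chunk files next to the input file (hypothetical naming)."""
    source = Path(path)
    written: List[Path] = []
    with source.open("r", encoding="utf-8") as handle:
        chunks = chunk_records(iter_sdf_records(handle), chunk_size)
        for index, chunk in enumerate(chunks, start=1):
            target = source.with_name(f"{source.stem}_part{index:04d}.sdf")
            target.write_text("".join(chunk), encoding="utf-8")
            written.append(target)
    return written
```

Calling `split_sdf_file("compounds.sdf", chunk_size=5000)` on a file of 12,000 records would write three files, the last holding the remaining 2,000 records. The file is read as a stream, so the whole SDF never has to fit in memory at once.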

An opportunity to deposit a couple of large SDF files arose when the following paper was published in JCIM.

Global Bayesian Models for the Prioritization of Antitubercular Agents
by Philip Prathipati, Ngai Ling Ma* and Thomas H. Keller
J. Chem. Inf. Model., 2008, 48 (12), pp 2362–2370
DOI: 10.1021/ci800143n

This paper provides a few thousand SMILES strings in CSV files, which we were able to deposit into ChemSpider and associate with the article. Visit an example here and you will see the article connected via its DOI in the supplementary information.


It is easy for us to deposit such datasets, so if you have publications with datasets that you would like to see on ChemSpider, send us the SDF file and the DOI and we will deposit them.
