At present we are loading a lot of new data from a series of data sources into the ChemSpider database. It is going to take a few days, and during the process we will be doing a first-level deduplication. Do not be surprised if you find a couple of structures with the same InChIKey in the database. We know they are there. Once the load is complete we will de-duplicate afresh across the whole database.
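For readers curious what deduplication by InChIKey looks like in principle, here is a minimal sketch. It is not our production code: the `(record_id, inchikey)` tuple layout, the function names and the record IDs are all hypothetical, chosen only for illustration. It groups records by InChIKey to report duplicates, then keeps the first record seen for each key.

```python
from collections import defaultdict


def find_duplicates(records):
    """Return {inchikey: [record_ids]} for keys that occur more than once.

    `records` is an iterable of (record_id, inchikey) tuples; this layout
    is an assumption made for the example.
    """
    groups = defaultdict(list)
    for record_id, inchikey in records:
        groups[inchikey].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def deduplicate(records):
    """Keep only the first record seen for each InChIKey, preserving order."""
    seen = set()
    unique = []
    for record_id, inchikey in records:
        if inchikey not in seen:
            seen.add(inchikey)
            unique.append((record_id, inchikey))
    return unique


# Hypothetical record IDs; the InChIKeys are those of water and ethanol.
sample = [
    (1, "XLYOFNOQVPJJNP-UHFFFAOYSA-N"),  # water
    (2, "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"),  # ethanol
    (3, "XLYOFNOQVPJJNP-UHFFFAOYSA-N"),  # water again, from another source
]

duplicates = find_duplicates(sample)
unique_records = deduplicate(sample)
```

A real pass would also need to merge the synonyms and source links attached to the discarded records rather than simply dropping them.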

On the home page you will see this display. Keep clicking refresh or pressing F5 and you will see the number of compounds growing as you watch.

19 million and growing

About 10,000 structures every few minutes are now going into the database.

Our short-term plans following deposition of the structures are as follows.

1) Clean up synonyms using some new roboticized approaches

2) Perform 2D structure CLEANing using a newly sourced cleaning algorithm

3) Integrate new tools from new collaborators. Some wonderful capabilities are around the corner

4) Finish the structure deposition system

5) Layer on new levels of curation as we move towards Wiki-enabling the web

6) Roll out the new ChemRefer on ChemSpider index. The next rollout will cover over 100,000 Open Access articles.

There’s no time to be bored…
