It was only two days ago that I was talking about being green with envy about the throughput, processes and delivery of the PubChem system. And today I was informed that the deposit is done. The PubChem Data Sources page lists us as contributing 16.8 million compounds to the database. Wow. At last check 3 million had found their way to Entrez and were therefore searchable. The rest will be there very soon.

We apologize that ChemSpider has been a little slow over the past day, but we have had other groups downloading the dataset. Unfortunately, our pipes aren’t fat enough to handle all of those multi-threaded downloads, plus all the calculations going on for our new depositions, plus hosting ChemSpider searches, Google InChI indexing at >30,000 per day and all the other things a modern server does to support a working team.

Fortunately for us, the ChemSpider source is now on PubChem’s site and can be downloaded from there. They are used to handling high traffic and, over the years, have created a stable, scalable and high-performing system. We couldn’t be more honored to have our data there. And if people want our data, they can exercise PubChem’s fat pipes rather than wrestling with our download speeds.
