We’ve been enhancing our deposition system so that adding tens of thousands of new compounds to ChemSpider doesn’t have too big an impact on its performance. Depositing each structure requires calculating its associated properties and deduplicating it against the database, and this process needed to be optimized. As a result of the improved processing we are now clearing our backlog of new structures. We know this is well overdue, but we didn’t want to overly stress the servers for our users. New data are now on the database from the following companies. There are more to come…

3 Responses to “Another 1/2 million compounds added to ChemSpider”

  1. Egon Willighagen says:

    Antony, I guess this is not 0.5M new compounds, or? Would be interesting how many new unique compounds have been added…

  2. Antony Williams says:

    Egon…they are definitely NOT all unique compounds. But I would say that the majority are…probably over 200,000, but this is an estimate. I can tell this by watching the streams of new CSIDs that come through:

    http://www.chemspider.com/Chemical-Structure.21351436.html
    http://www.chemspider.com/Chemical-Structure.21351437.html
    http://www.chemspider.com/Chemical-Structure.21351438.html
    http://www.chemspider.com/Chemical-Structure.21351439.html
    http://www.chemspider.com/Chemical-Structure.21351440.html
    http://www.chemspider.com/Chemical-Structure.21351441.html
    http://www.chemspider.com/Chemical-Structure.21351442.html
    http://www.chemspider.com/Chemical-Structure.21351443.html
    http://www.chemspider.com/Chemical-Structure.21351444.html
    http://www.chemspider.com/Chemical-Structure.21351445.html
    http://www.chemspider.com/Chemical-Structure.21351446.html
    http://www.chemspider.com/Chemical-Structure.21351447.html
    http://www.chemspider.com/Chemical-Structure.21351448.html
    http://www.chemspider.com/Chemical-Structure.21351449.html
    http://www.chemspider.com/Chemical-Structure.21351450.html
    http://www.chemspider.com/Chemical-Structure.21351451.html
    http://www.chemspider.com/Chemical-Structure.21351452.html
    http://www.chemspider.com/Chemical-Structure.21351453.html
    http://www.chemspider.com/Chemical-Structure.21351454.html
    http://www.chemspider.com/Chemical-Structure.21351455.html
    (the list shows only the first 20 structures)

  3. Rich Apodaca says:

    Tony, looks like Egon’s comment has disappeared. My question also related to uniqueness, quantitatively speaking. I’m wondering what story would be told by just a simple chart showing percentage of “new” compounds as a function of time for ChemSpider. Maybe even a chart showing number of duplicate compound submissions as a function of time.

    There are many ways to slice it, but it’s all pretty interesting given how new the concept of a large, public-facing chemical database is.
