As ChemSpider has grown into an important part of the online community, giving chemists access to information and data that assist them in their work, there are many subjective criteria by which it can be measured. Early on we set some objectives for how we would measure our own success in the first couple of years. These included:

1) More than 500,000 results in a Google search (we have been at this number for over a month, I believe)

2) Acknowledgment by our “peers”, another subjective criterion, by comments made in the blogosphere, recognized by invitations to speak, participate in panel discussions etc. No shortage here.

3) Reach 5000 unique users per day in our first year (already achieved)

4) Be reviewed in a mainstream publication (the Nature article written about ChemSpider does that)

5) Have over 150 data sources feed ChemSpider. We are close…145 data sources at present and more in the pipe to feed in shortly

6) Be indexed by Chemical Abstracts Service.

CAS has been indexing a number of web resources for a considerable time. Until today I didn’t know that we were one of these sources. It actually makes a lot of sense that we should be indexed. We have unique chemistry on our site since we host Open Notebook Science from groups such as that of Jean-Claude Bradley at Drexel University. But we also have spectra and assignments for research compounds being deposited into the database, and we are establishing relationships with Open Access publishers to index their chemical compounds connected directly to their articles. So, being indexed makes sense.

There has been a murmuring in the community that what ChemSpider is doing will collide with CAS. I have reiterated many times that I believe CAS offers the crown jewels in terms of quality and curated data. With what likely amounts to thousands of person-years of investment in building the registry, we are unlikely to surpass CAS’ breadth of knowledge. Rather, we are focused on providing a service to the community so that the community can participate in developing and growing the database. I believe CAS and ChemSpider are synergistic and have much to offer by being connected in this way.

Inserted above is a screen grab of part of a record showing the ChemSpider database as the source of the structure. CAS have rigorous expectations regarding how they select what chemical entities should be inserted into their database. While I don’t know this list of definitions, this structure clearly meets it. The structure above is on ChemSpider here. We’re very happy that we are now being indexed in the CAS registry and will continue to enhance our “unique structure collection”, working with chemical vendors, publishers and scientists to grow our database.

 


7 Responses to “Chemical Abstracts Service Indexes ChemSpider Content”

  1. j says:

    but note that while SciFinder has the structure, there is no identifiable source. A link to ChemSpider would be useful. I’d been wondering where all the structures turning up in searches with “registry, no references” were from, perhaps this is it.

  2. Antony Williams says:

    J comments “while SciFinder has the structure, there is no identifiable source. A link to ChemSpider would be useful. I’d been wondering where all the structures turning up in searches with “registry, no references” were from, perhaps this is it.”

    The source is listed in SciFinder. In the example above it declares the Database as ChemSpider (ChemZoo, Inc.). J, might you be suggesting that a link be made from SciFinder to ChemSpider via ChemSpider ID? If so, that is easy to do, and we would support it. We have recently made a number of depositions and use deduplicating processes at deposition, but we are about to run the engine across the database again to ensure that our processes have not missed any duplicates. Then deduplication will be complete and such connections could be made.

    In terms of “registry, no references”, I think this points to having structures in the registry but with no articles associated. Maybe this is your point. Yes, I can agree that will be an issue. At the ACS meeting in Philadelphia a comment was made about indexing multiple web sources at this time, and many will not have references for sure, especially if they involve indexing the chemical vendor depositions on PubChem, ChemSpider, eMolecules etc.

  3. Joerg Kurt Wegner says:

    The first thing that came to mind after reading this was a German quote. It says something like ‘If the mountain does not come to the prophet, the prophet has to go to the mountain’ [Unknown]

    Or in this context, ‘If you cannot index CAS, let CAS index you’ [Joerg Kurt Wegner]

    I guess this is the only way to go, since they have this restrictive licensing on CAS indices. Though, I am still puzzled about it, because I would like to know how many journal articles contain CAS numbers, and if this ‘public’ CAS source is not violating their licensing?

    Anyway, I also found another good quote on WikiQuote
    ‘Wonder is not a Pollyanna stance, not a denial of reality; wonder is an acknowledgment of the power of the mind to transform’ [Christina Baldwin]
    http://en.wikiquote.org/wiki/Wonder

  4. will says:

    The lack of a link can be explained as linking requires some sort of system to remove broken links over time, as websites often change url structures.

    The lack of a ChemSpider ID is more interesting as it shows they view it as a possible future competitor to the CAS No.

    Though ChemSpider has undeniably been indexed by CAS here, it is the most minimal form of indexing possible i.e. the metadata given only amounts to “this is in ChemSpider… somewhere”.

    This is not helpful to CAS curators or users of their DBs.

    If they revisit the record later, what if they discover multiple ChemSpider IDs that could fit this CAS record? How will they know which record the curator was referring to initially? How will the user know?

  5. j says:

    STN might list ChemSpider, but SciFinder does not:

    Registry Number: 1027664-06-7
    Formula: C16 H16 Cl2 N2 O3
    CA Index Name: Isoxazole, 5-[3-[2,6-dichloro-4-(2,5-dihydro-2-oxazolyl)phenoxy]propyl]-3-methyl-
    References: None
    Database: REGISTRY

  6. Paul Schulwitz says:

    I find it interesting that there is no way to search for “ChemSpider” records in the CAS Registry file. I was able to extract the 1287203 records labeled as “Other Sources” in the source (/SR) field but could not think of a way to find out exactly how many of those are ChemSpider records. I found a few ChemSpider compounds just by randomly displaying records in the set. No ChemSpider ID # or any further elaboration on the data source is given in the Registry records. I ended up copying and pasting the CA Index Name into ChemSpider to find references. In the past I came across at least one SureChem patent and was hoping to find another, with no luck. I’m just trying to figure out whether the patents found in SureChem have even been indexed by CA and, if so, why CA chose not to index the compounds found in SureChem.

    It’s good to know that ChemSpider can quickly and correctly derive structures from CA Index names. I’ll have to remember to link the CA Registry Number in ChemSpider whenever I come across one in Registry. I just did this for ChemSpider ID: 9613042

  7. Alain Borel says:

    In case anybody’s interested, here’s a link to my recent announcement on the CHEMINF mailing list (shameless advertising):

    http://preview.tinyurl.com/56fct8

    It’s about a browser plugin that modifies SciFinder Web’s pages dynamically to add more links. ChemSpider is one target, so I guess it might be useful in this context.
